Speaker 1: Prem Natarajan
OCR: A Journey through Advances in the Science, Engineering, and Productization of AI/ML
From the very early years of AI, the problem of optical character recognition (OCR) has captured the imagination of researchers; Selfridge and Neisser presented an approach for OCR of hand-printed characters in 1960. During the last three decades, OCR technology for machine-printed and handwritten text has evolved in significant ways – from script-specific techniques to script-independent methodologies, and from segmentation-based techniques to hidden Markov models to deep learning. In my talk, I will present my perspective on that evolution and its interplay with concomitant advances in speech recognition, natural language processing, and computer vision. The presentation will include a discussion of some practical, even if off-the-beaten-path, applications of OCR technology, including work done in partnership with the Census Bureau in applying a deep-learning-based OCR framework to census forms. I will also share my views on some of the most interesting open problems in the field of OCR and document processing. The presentation will conclude with a few comments about one of my current areas of research interest – fairness in AI and machine learning.
Prem Natarajan is a Vice President at Amazon, where he leads research and engineering efforts in dialog systems, natural language understanding, and multimodal technologies in the Alexa AI organization. Prior to joining Amazon in June 2018, he was with the University of Southern California in Los Angeles as the Keston Executive Director of the Information Sciences Institute, Vice Dean of Engineering in the Viterbi School, and a Research Professor with Distinction in the Computer Science Department. For nearly two decades before that, he was at BBN Technologies, where his most recent roles were Principal Scientist and Executive VP for Speech, Language, and Multimedia Processing. Prem received his Bachelor of Electrical Engineering degree from Pune University in India, and his M.S. and Ph.D. degrees in electrical engineering from Tufts University in Massachusetts.
Speaker 2: Beáta Megyesi
Cracking Ciphers with “AI-in-the-loop”: Transcription and Decryption in a Cross-Disciplinary Field
Accurate transcription of hand-written texts in images is indispensable in many research areas in digital humanities. Manual transcription is error-prone, time-consuming, and expensive to produce. Historical texts with their specific textual qualities require expert knowledge and trained eyes. In recent years, image processing applied to hand-written historical documents to produce transcriptions has shown great opportunities, but also challenges for users. How can users without knowledge of AI in general, and of handwritten text recognition (HTR) in particular, transcribe hand-written documents efficiently with “AI-in-the-loop”?
In my talk, I will focus on encrypted manuscripts from Early Modern times with various symbol systems, hand-writing styles, and languages. The point of departure is the DECRYPT project, which aims to create resources and tools for historical cryptology by bringing together the expertise of various disciplines to collect images of ciphers and keys, transcribe them, and decrypt and contextualize them. I will give an overview of the project and of the methods we use to solve various problems from transcription to decryption, including historical corpora and natural language processing methods.
Beáta Megyesi is a professor of computational linguistics at Uppsala University, Sweden, the former head of the Department of Linguistics and Philology there, and the current president of the Northern European Association for Language Technology (NEALT). She specializes in digital philology and natural language processing, with a special interest in the automatic analysis of non-standard, noisy language data, from text produced by language learners to historical texts and encrypted documents. She has participated in ten externally funded, cross-disciplinary research projects and currently serves as the PI of the DECRYPT project, financed by the Swedish Research Council (grant 2018-06074). Bea received her Ph.D. in speech communication from the Royal Institute of Technology (KTH) in Stockholm, Sweden.
Copyright © ICDAR 2021 Organizing Committee