Speaker 1: Prem Natarajan
Thursday 9 September, 13:00-14:00
Room “Rome”, second floor
OCR: A Journey through Advances in the Science, Engineering, and Productization of AI/ML
From the very early years of AI, the problem of optical character recognition (OCR) has captured the imagination of researchers; Selfridge and Neisser presented an approach to OCR of hand-printed characters in 1960. During the last three decades, OCR technology for machine-printed and handwritten text has evolved in significant ways: from script-specific techniques to script-independent methodologies, and from segmentation-based techniques to hidden Markov models to deep learning. In my talk, I will present my perspective on that evolution and its interplay with concomitant advances in speech recognition, natural language processing, and computer vision. The presentation will include a discussion of some practical, even if off the beaten path, applications of OCR technology, including work done in partnership with the Census Bureau in applying a deep-learning-based OCR framework to census forms. I will also share my views on some of the most interesting open problems in the field of OCR and document processing. The presentation will conclude with a few comments about one of my current research interests: fairness in AI and machine learning.
Prem Natarajan is a Vice President at Amazon, where he leads research and engineering efforts in dialog systems, natural language understanding, and multimodal/multimedia technologies in the Alexa AI organization. Prior to joining Amazon in June 2018, he was with the University of Southern California (USC), where he was Senior Vice Dean of Engineering in the Viterbi School of Engineering, Executive Director of the Information Sciences Institute (a 300-person R&D organization), and Research Professor of computer science with distinction. Prior to that, as Executive VP and Principal Scientist at Raytheon BBN Technologies, he led the speech, language, and multimedia business unit, which included research and development operations and commercial products for real-time multimedia monitoring, document analysis, and information extraction. During his tenure at USC and at BBN, Natarajan directed R&D efforts in speech recognition, natural language processing, computer vision, and other applications of machine learning. While at USC, he led nationally influential DARPA- and IARPA-sponsored research efforts in biometrics/face recognition, OCR, NLP, media forensics, and forecasting. Most recently at Amazon, he helped to launch the Fairness in AI (FAI) program, a collaborative effort between NSF and Amazon for funding fairness-focused research efforts in US universities.
Speaker 2: Beáta Megyesi
Friday 10 September, 09:00-10:00
Room “Rome”, second floor
Cracking Ciphers with “AI-in-the-loop”: Transcription and Decryption in a Cross-Disciplinary Field
Accurate transcription of hand-written texts in images is indispensable in many research areas in digital humanities. Manual transcription is error-prone, time-consuming, and expensive to produce. Historical texts with their specific textual qualities require expert knowledge and trained eyes. In recent years, image processing applied to hand-written historical text documents to provide transcription output has shown great opportunities, but also challenges for users. How can users without knowledge of AI in general, and of HTR in particular, transcribe hand-written documents efficiently with “AI-in-the-loop”?
In my talk, I will focus on encrypted manuscripts from Early Modern times with various symbol systems, hand-writing styles, and languages. The point of departure is the DECRYPT project, which aims to create resources and tools for historical cryptology by bringing together the expertise of various disciplines to collect images of ciphers and keys, transcribe them, and decrypt and contextualize them. I will give an overview of the project and of the methods we use to solve various problems, from transcription to decryption, including historical corpora and natural language processing methods.
Beáta Megyesi is a professor of computational linguistics at Uppsala University, Sweden, the former head of the Department of Linguistics and Philology there, and the current president of the North European Association for Language Technology (NEALT). She specializes in digital philology and natural language processing, with a special interest in the automatic analysis of non-standard, noisy language data, from text produced by language learners to historical texts and encrypted documents. She has participated in ten externally funded, cross-disciplinary research projects and currently serves as the PI of the DECRYPT project, financed by the Swedish Research Council (grant 2018-06074). Bea received her Ph.D. in speech communication from the Royal Institute of Technology (KTH) in Stockholm, Sweden.
Copyright © ICDAR 2021 Organizing Committee