Representatives from CNI member organizations gather twice annually to explore new technologies, content, and applications; to further collaboration; to analyze technology policy issues; and to catalyze the development and deployment of new projects. Each member organization may send two representatives. Visit https://www.cni.org/mm/spring-2019 for more information.
Twitter: #cni19s
Monday, April 8 • 2:30pm - 3:15pm
1.3 A Research Agenda for Historical and Multilingual OCR

This talk will outline the primary findings and recommendations of a report written for The Andrew W. Mellon Foundation that seeks to describe the current state of optical character recognition (OCR) for large-scale humanities collections and suggest the most fruitful avenues for future research in this domain. The report surveys the current state of OCR for historical documents and recommends concrete steps that researchers, implementers, and funders can take to improve the quality and use of OCR collections over the next five to ten years. We find, for instance, that advances in artificial intelligence for image recognition, natural language processing, and machine learning will drive significant progress in this area. More importantly, however, we describe how sharing goals, techniques, and data among researchers in computer science, in book and manuscript studies, and in library and information sciences will open up exciting new problems and allow a broad community, including cohorts who rarely collaborate, to allocate resources and measure progress in improving OCR for historical typography and multilingual documents. This presentation will briefly outline the report's findings about the current state of the art for humanistic OCR, but will devote most of its time to detailing the report's nine primary recommendations for future, collaborative OCR research.

Speakers
Ryan Cordell

Associate Professor of English, Northeastern University


Monday April 8, 2019 2:30pm - 3:15pm
Director's