Cliff Wulfman edited this page Feb 1, 2022 · 2 revisions

A few notes about this project.

It is exploratory. Developing a production-grade system is not the goal of this project. The success of the project will be measured by the questions it raises and answers, and the directions it suggests. This wiki will be used to track those questions, answers, and potential directions.

It is experimental. The project strives to be needs-driven, not technology-driven. Nevertheless, we have started with the hypothesis that named-entity extraction from dirty OCR will yield information that enables archivists to discover people, places, and organizations that are not well represented in existing finding aids, so exploring named-entity extraction technologies is our first order of business.
