Abstract |
A process for digitizing newsprint information from a newspaper includes scanning the information into a digital image format and then processing the image to produce searchable text. The processing includes removing data stamps and other marks that are written over the newsprint, enhancing the image using a library of image processing functions, and performing voting-OCR to select an optimal OCR output. The OCR output yields highly accurate text that can be word-searched using adaptive pattern recognition processing, fuzzy logic, morphology, and other techniques to provide a word-searchable database of newsprint information from newspapers. The process is software-controlled so that the workflow, both electronic and non-electronic, between the various processes or stations is tracked and sequenced, and appropriate data is collected and stored.
|
Applicant |
PROGRESSIVE TECHNOLOGY FEDERAL SYSTEMS, INC.;YOKLEY, JOHN, R.;NISSEN, DON;SCHWARTZ, ERIK;KORNELE, BRYAN;LEE, ED;KAPEL, KEVIN |
Inventor |
YOKLEY, JOHN, R.;NISSEN, DON;SCHWARTZ, ERIK;KORNELE, BRYAN;LEE, ED;KAPEL, KEVIN |
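
The voting-OCR step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the vote is taken, so everything here is an assumption made for illustration: the function name `vote_ocr` is hypothetical, the inputs are plain strings from several unspecified OCR engines run on the same image, and the vote is a simple per-word majority with a whole-output similarity fallback.

```python
from collections import Counter
from difflib import SequenceMatcher


def vote_ocr(candidates):
    """Pick a consensus text from several OCR outputs of the same image.

    Hypothetical helper: the voting rule is an assumption, not the
    method described in the patent.
    """
    if not candidates:
        raise ValueError("need at least one OCR output")
    token_lists = [c.split() for c in candidates]

    # Easy case: every engine produced the same number of words,
    # so words can be compared position by position.
    if len({len(tokens) for tokens in token_lists}) == 1:
        voted = []
        for words in zip(*token_lists):
            counts = Counter(words)
            best = max(counts.values())
            # Ties go to the earliest engine in the input order.
            voted.append(next(w for w in words if counts[w] == best))
        return " ".join(voted)

    # Word counts differ: fall back to the whole output that is
    # most similar to all of the other outputs.
    def agreement(i):
        return sum(
            SequenceMatcher(None, candidates[i], other).ratio()
            for j, other in enumerate(candidates)
            if j != i
        )

    return candidates[max(range(len(candidates)), key=agreement)]


outputs = [
    "The quick brown fox",
    "The qu1ck brown fox",
    "The quick brovvn fox",
]
print(vote_ocr(outputs))
```

Each engine in this example misreads a different word, so the per-word majority recovers a reading that no single engine with an error produced on its own. A production system would more likely align outputs at the character level and weight each engine by a confidence score.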