摘要 |
Systems, apparatus and methods for extracting lower modifiers from a word image, before performing optical character recognition (OCR), based on a plurality of tests comprising a first test, a second test and a third test are presented. The method obtains the word image and performing a plurality of tests (e.g., a first test, a second test and a third test). The first test determines whether a vertical line spanning the height of the word image exists. The second test determines whether a jump of a number of components in the lower portion of the word image exists. The third test determines sparseness in a lower portion of the word image. The plurality of tests may run sequentially and/or in parallel. Results from the plurality of tests are used to decide whether a lower modifier exists by comparing and accumulating test results from the plurality of tests.
|