Abstract |
PURPOSE: To enable high-speed, high-precision character segmentation by extracting only the sonant marks and semi-sonant marks from a character string composed of the square form of the Japanese syllabary (KANA) and temporarily removing the extracted marks so that they cannot be erroneously integrated with adjacent characters. CONSTITUTION: A linked pattern is extracted by labelling 1, and the extracted linked character patterns are temporarily integrated by a circumscribed rectangle temporarily integrating means 2. For these character patterns, the average size of the circumscribed rectangles is calculated by an average character size calculating means 3. A small rectangle extracting means 4 extracts small rectangles as candidates for the sonant and semi-sonant marks. The characters sounding 'u', 'shi' and 'tsu' are extracted by an extracting means 9 for 'u' and an extracting means 10 for 'shi' and 'tsu'. Among the sonant mark candidates extracted by the sonant mark candidate extracting means 5, those not integrated with 'u', 'shi' or 'tsu', together with the candidates identified as semi-sonant mark candidates by semi-sonant mark extracting means 6 and 8, are temporarily removed, with a prescribed mark added, by a temporary removing means for sonant and semi-sonant marks. |
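
The average-size and small-rectangle steps (means 3 and 4) can be illustrated with a minimal sketch. The abstract does not say how the average size is defined or what threshold separates a small rectangle from a full character, so the `Rect` type, the per-axis mean, and the `ratio=0.5` cutoff below are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rect:
    """Circumscribed rectangle of one linked pattern (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int


def average_size(rects: List[Rect]) -> Tuple[float, float]:
    """Return the mean width and mean height of the rectangles.

    Assumption: the 'average size' is taken per axis; the abstract
    does not define it.
    """
    if not rects:
        raise ValueError("at least one rectangle is required")
    n = len(rects)
    return sum(r.w for r in rects) / n, sum(r.h for r in rects) / n


def extract_small_rects(rects: List[Rect], ratio: float = 0.5) -> List[Rect]:
    """Return rectangles smaller than `ratio` times the average on both axes.

    Assumption: `ratio` is an illustrative cutoff, not a value from the source.
    The result is only a list of candidates for sonant / semi-sonant marks.
    """
    avg_w, avg_h = average_size(rects)
    return [r for r in rects if r.w < ratio * avg_w and r.h < ratio * avg_h]


# Three character-sized rectangles and one small mark-sized rectangle.
rects = [Rect(0, 0, 20, 20), Rect(22, 0, 4, 4), Rect(30, 0, 20, 22), Rect(55, 0, 18, 20)]
candidates = extract_small_rects(rects)
```

In this sketch the candidates would still need the later checks described in the abstract, such as excluding small strokes that belong to 'u', 'shi' or 'tsu', before any mark is temporarily removed.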