Abstract |
<P>PROBLEM TO BE SOLVED: To provide a method for determining image characteristics, such as the language or language family of the text in an electronic document. <P>SOLUTION: According to one embodiment of the present invention, the method includes: counting, for a plurality of glyph components contained in a predetermined region of image data, how many glyph components have each number of feature points, and accumulating these counts as a frequency distribution whose bins are the numbers of feature points; acquiring an object distribution feature from the frequency distribution; and comparing the object distribution feature with a sample distribution feature to identify the language or language family of the text. <P>COPYRIGHT: (C)2012,JPO&INPIT |
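The steps in the abstract can be illustrated with a minimal sketch. It assumes the number of feature points per glyph component has already been extracted from the image region, and it uses a normalized histogram as the distribution feature and an L1 distance for the comparison. Those two choices, along with the function names, the bin count, and the clamping of large counts into the last bin, are illustrative assumptions and are not stated in the abstract.

```python
def feature_histogram(feature_counts, num_bins=8):
    """Normalized frequency distribution: bin k holds the share of glyph
    components that have k feature points."""
    hist = [0.0] * num_bins
    for n in feature_counts:
        # Assumption: counts at or above num_bins are clamped into the last bin.
        hist[min(n, num_bins - 1)] += 1.0
    total = sum(hist)
    # An empty region yields an all-zero histogram rather than a division error.
    return [h / total for h in hist] if total else hist


def l1_distance(p, q):
    """Sum of absolute bin-wise differences between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))


def identify_language(feature_counts, samples, num_bins=8):
    """Return the label of the sample distribution closest to the target.

    `samples` maps a language (or language family) label to a distribution
    built with the same num_bins. This nearest-match rule is a hypothetical
    stand-in for the comparison step described in the abstract.
    """
    target = feature_histogram(feature_counts, num_bins)
    return min(samples, key=lambda label: l1_distance(target, samples[label]))
```

In use, one sample distribution would be built per candidate language from reference text, and `identify_language` would be called with the feature-point counts measured in the region of interest. The labels and counts passed in are whatever the caller supplies; none are taken from the patent.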