Abstract |
<p><P>PROBLEM TO BE SOLVED: To provide a system and method for clustering a document image. <P>SOLUTION: The properties of a mark extracted from a document are compared with the properties of the already existing clusters. When the properties of the mark match none of the existing clusters, the mark is added to the set of existing clusters as a new cluster. One property is the x size and y size, which indicate the width and height of an existing cluster. Another property is the ink size, which indicates the ratio of black pixels to total pixels in the cluster. A further property is a reduced mark or image, which is a reduced-pixel-size version of the mark and/or cluster. These properties may be used to identify mismatches and thereby reduce the number of bit-by-bit comparisons that must be carried out. <P>COPYRIGHT: (C)2004,JPO</p> |
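The screening idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything here is an assumption made for illustration: the names (`Cluster`, `matches`, `cluster_marks`), the tolerance values, the use of exact size equality, the OR-pooled 2x reduction, and the Hamming distance as the bit-by-bit comparison. Marks are represented as lists of rows of 0/1 pixels, with 1 meaning black.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Bitmap = List[List[int]]  # rows of 0/1 pixels, 1 = black (assumed representation)


@dataclass
class Cluster:
    template: Bitmap                      # first mark seen for this cluster
    members: List[Bitmap] = field(default_factory=list)


def size(bm: Bitmap) -> Tuple[int, int]:
    """Return (x size, y size), i.e. width and height in pixels."""
    return (len(bm[0]) if bm else 0, len(bm))


def ink(bm: Bitmap) -> float:
    """Return the ratio of black pixels to total pixels."""
    total = sum(len(row) for row in bm)
    return sum(sum(row) for row in bm) / total if total else 0.0


def reduce_bitmap(bm: Bitmap, factor: int = 2) -> Bitmap:
    """Reduced version of a bitmap: OR-pool each factor x factor block (assumed scheme)."""
    h, w = len(bm), (len(bm[0]) if bm else 0)
    return [
        [
            int(any(bm[yy][xx]
                    for yy in range(y, min(y + factor, h))
                    for xx in range(x, min(x + factor, w))))
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels between two bitmaps of equal size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def matches(mark: Bitmap, cluster: Cluster,
            ink_tol: float = 0.1, reduced_tol: int = 1, full_tol: int = 2) -> bool:
    """Apply cheap property checks first; run the full comparison only if all pass."""
    t = cluster.template
    if size(mark) != size(t):                      # x size / y size screen
        return False
    if abs(ink(mark) - ink(t)) > ink_tol:          # ink size screen
        return False
    if hamming(reduce_bitmap(mark), reduce_bitmap(t)) > reduced_tol:  # reduced-image screen
        return False
    return hamming(mark, t) <= full_tol            # full bit-by-bit comparison


def cluster_marks(marks: List[Bitmap]) -> List[Cluster]:
    """Assign each mark to the first matching cluster, else start a new cluster."""
    clusters: List[Cluster] = []
    for m in marks:
        for c in clusters:
            if matches(m, c):
                c.members.append(m)
                break
        else:
            clusters.append(Cluster(template=m, members=[m]))
    return clusters
```

In this sketch the checks are ordered from cheapest to most expensive, so most non-matching clusters are rejected before any full pixel comparison is attempted.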