Independent claim |
1. An equivalence determination system comprising:
a processor; an object extracting unit, executed on the processor, that extracts, from respective electronic documents in a set of electronic documents, at least one object which forms the electronic document and includes at least one of a text, a figure, and an equation; a specifying unit that specifies a predetermined number of objects in the respective electronic documents based on a density calculated by referring to the extracted objects; and a judging unit that judges that plural electronic documents are similar based on the specified objects, wherein said specifying unit calculates the density of a given object, from among the objects, based on an area of the given object and an amount of character strings and decoration information contained in the given object, and wherein said specifying unit calculates a weighted density of the given object by selecting objects located at a first distance or less from the given object, and adding (i) a sum of values inversely proportional to the respective distances from the given object to the selected objects to (ii) the density of the given object. |
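
The density and weighted-density calculations recited for the specifying unit can be illustrated with a minimal Python sketch. It is illustrative only and is not the claimed implementation. The claim does not fix the formulas, so several points here are assumptions: the names `DocObject`, `density`, `weighted_density`, and `specify_objects` are hypothetical; density is taken as (character count + decoration count) / area; distance is the Euclidean distance between object positions; the inverse-proportion constant `k` defaults to 1; and the objects with the highest weighted density are the ones specified.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class DocObject:
    """Hypothetical extracted object (text, figure, or equation)."""
    x: float             # assumed position of the object on the page
    y: float
    area: float          # area occupied by the object
    n_chars: int         # amount of character string in the object
    n_decorations: int   # amount of decoration information (bold, underline, ...)


def density(obj: DocObject) -> float:
    # Assumption: content amount per unit area.
    return (obj.n_chars + obj.n_decorations) / obj.area


def weighted_density(obj: DocObject, objects: list[DocObject],
                     first_distance: float, k: float = 1.0) -> float:
    """Density plus the sum of k/d over objects within first_distance."""
    neighbor_sum = 0.0
    for other in objects:
        if other is obj:
            continue
        d = hypot(other.x - obj.x, other.y - obj.y)
        # Skip coincident objects (d == 0) to avoid division by zero.
        if 0 < d <= first_distance:
            neighbor_sum += k / d
    return density(obj) + neighbor_sum


def specify_objects(objects: list[DocObject], count: int,
                    first_distance: float) -> list[DocObject]:
    # Assumption: keep the `count` objects with the highest weighted density.
    ranked = sorted(objects,
                    key=lambda o: weighted_density(o, objects, first_distance),
                    reverse=True)
    return ranked[:count]
```

For example, with `a = DocObject(0, 0, 100, 40, 10)` and `b = DocObject(3, 4, 50, 10, 0)`, the two objects are 5 units apart, so with a first distance of 10 each adds 1/5 to the other's density. The judging unit could then compare the objects returned by `specify_objects` across documents; that comparison is outside the scope of this sketch.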