Abstract |
<P>PROBLEM TO BE SOLVED: To enable extraction of original documents, including documents that make proper quotations, from among documents on the Internet. <P>SOLUTION: A document processor includes: substring generation means that generates, for each of a plurality of documents, substrings each forming part of a character string included in that document; unique substring determination means that determines, from among the substrings generated by the substring generation means, unique substrings, that is, substrings not included in any document other than the one from which they were generated; and unnecessary document detection means that detects, as an unnecessary document, any document for which the ratio of the number of unique substrings determined by the unique substring determination means to the total number of substrings of that document is within a specified range. <P>COPYRIGHT: (C)2012,JPO&INPIT |
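The three means described above can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the use of fixed-length character n-grams as substrings, the counting of distinct substrings per document, and the threshold values are all assumptions made for illustration. The abstract does not say which range marks a document as unnecessary; the sketch assumes that a low unique-substring ratio (mostly copied text) is the target.

```python
from collections import Counter


def substrings(text, n=4):
    """Substring generation (assumed here to be character n-grams)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def unique_ratios(docs, n=4):
    """Per document: unique substrings / total distinct substrings."""
    subs = {doc_id: substrings(text, n) for doc_id, text in docs.items()}
    # Number of documents in which each substring occurs.
    doc_freq = Counter(s for grams in subs.values() for s in grams)
    ratios = {}
    for doc_id, grams in subs.items():
        if not grams:
            # Document shorter than n characters: no substrings at all.
            ratios[doc_id] = 0.0
            continue
        # A substring is unique if no other document contains it.
        unique = sum(1 for s in grams if doc_freq[s] == 1)
        ratios[doc_id] = unique / len(grams)
    return ratios


def detect_unnecessary(docs, low=0.0, high=0.2, n=4):
    """Return ids of documents whose ratio lies in [low, high] (assumed range)."""
    ratios = unique_ratios(docs, n)
    return [doc_id for doc_id, r in ratios.items() if low <= r <= high]


if __name__ == "__main__":
    docs = {"a": "abcdefgh", "b": "abcdefgh", "c": "zyxwvuts"}
    print(unique_ratios(docs))
    print(detect_unnecessary(docs))
```

In this toy input, documents "a" and "b" are identical, so neither contains a substring absent from the other, while "c" shares nothing with them. Choosing the range differently (for example, excluding a ratio of exactly zero, or setting an intermediate band) changes which documents are flagged, which is why the bounds are left as parameters.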