Abstract |
PROBLEM TO BE SOLVED: To obtain a document classification and management method that appropriately reflects document contents, and a document retrieval method that remains efficient even for new documents and for documents that follow different classification systems. SOLUTION: Document parameters enclosed in tags of specified formats are extracted from N gathered documents. For each document, a document parameter vector is then determined by calculating appearance-frequency-based weights for the components of an M-dimensional vector made up of the M document parameters extracted from the N documents. The similarities between the document parameter vectors are computed, and the gathered documents are classified and managed under the document classes of a classification system determined by those similarities. |
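The steps in the SOLUTION can be illustrated with a minimal Python sketch. The abstract does not specify the tag format, the weighting formula, the similarity measure, or the grouping rule, so everything concrete here is an assumption made for illustration: XML-like `<tag>...</tag>` spans, a TF-IDF-style weight, cosine similarity, and a greedy threshold grouping. The function names (`extract_parameters`, `build_vectors`, `cosine`, `classify`) and the `threshold` value are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: parameters live inside XML-like tags such as <title>...</title>.
TAG_RE = re.compile(r"<(\w+)>(.*?)</\1>", re.S)


def extract_parameters(doc):
    """Return 'tag:word' parameters for every word inside a tagged span."""
    params = []
    for tag, text in TAG_RE.findall(doc):
        for word in re.findall(r"\w+", text.lower()):
            params.append(f"{tag}:{word}")
    return params


def build_vectors(docs):
    """Build one M-dimensional weighted vector per document.

    Assumption: the weight is term frequency times (log(N / df) + 1),
    a TF-IDF-style stand-in for the unspecified frequency-based weight.
    """
    counts = [Counter(extract_parameters(d)) for d in docs]
    vocab = sorted(set().union(*counts))  # the M document parameters
    n = len(docs)
    df = {p: sum(1 for c in counts if p in c) for p in vocab}
    vectors = [
        [c[p] * (math.log(n / df[p]) + 1.0) for p in vocab] for c in counts
    ]
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(vectors, threshold=0.3):
    """Greedy grouping: a document joins the first class whose first
    member is at least `threshold` similar, otherwise it starts a class."""
    classes = []
    for i, v in enumerate(vectors):
        for members in classes:
            if cosine(vectors[members[0]], v) >= threshold:
                members.append(i)
                break
        else:
            classes.append([i])
    return classes


if __name__ == "__main__":
    docs = [
        "<title>neural network training</title><body>gradient descent</body>",
        "<title>neural network pruning</title><body>gradient sparsity</body>",
        "<title>tomato soup recipe</title><body>simmer and season</body>",
    ]
    vocab, vectors = build_vectors(docs)
    print(classify(vectors))  # lists of document indices, one list per class
```

In this sketch the classes emerge from the similarities themselves rather than from a fixed taxonomy, which is one way to read "classification systems determined by the similarities"; a new document could be placed by computing its vector over the same parameters and comparing it with each class.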