Abstract |
A document classification system and a document classification method are provided that classify target documents on the basis of the interrelation among keywords, by grouping multiple keywords according to an interrelation rate derived from how frequently the keywords appear together in the target documents. The document classification system (200) includes a keyword extractor (210), an interrelation rate calculator (220), a cluster generator (230), and a classification learning unit (240). The keyword extractor extracts multiple keywords from the target documents and sequentially designates each extracted keyword as an index keyword. The interrelation rate calculator calculates the interrelation rate between each designated index keyword and the remaining extracted keywords. The cluster generator groups the index keywords whose interrelation rate falls within an allowable range into a cluster. The classification learning unit classifies the target documents by using the generated clusters. In addition, a cluster search engine (110), located inside or outside the document classification system, identifies the classification keywords corresponding to a query entered by a user and searches for the target documents related to those classification keywords.
|
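The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the interrelation rate formula, the allowable range, or the grouping procedure, so everything specific here is an assumption made for illustration: the rate is taken to be the Jaccard co-occurrence ratio of two keywords across documents, the allowable range is modeled as a single lower threshold, clusters are formed by merging any pair of keywords that meets the threshold, and a document is assigned to the cluster it shares the most keywords with. The function names (`interrelation_rates`, `build_clusters`, `classify`, `search`) are hypothetical and do not come from the source.

```python
from collections import defaultdict
from itertools import combinations


def interrelation_rates(doc_keywords):
    """Jaccard co-occurrence rate per keyword pair (assumed measure)."""
    docs_with = defaultdict(set)
    for doc_id, keywords in enumerate(doc_keywords):
        for kw in keywords:
            docs_with[kw].add(doc_id)
    rates = {}
    # Pairs are keyed in alphabetical order, e.g. ("engine", "wheel").
    for a, b in combinations(sorted(docs_with), 2):
        together = len(docs_with[a] & docs_with[b])
        either = len(docs_with[a] | docs_with[b])
        rates[(a, b)] = together / either
    return rates


def build_clusters(rates, threshold):
    """Merge keywords whose rate meets the threshold (assumed rule)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), rate in rates.items():
        root_a, root_b = find(a), find(b)
        if rate >= threshold:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for kw in parent:
        groups[find(kw)].add(kw)
    # Sort for a deterministic cluster order.
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def classify(doc_keywords, clusters):
    """Label each document with the index of its best-overlapping cluster."""
    labels = []
    for keywords in doc_keywords:
        overlaps = [len(c & set(keywords)) for c in clusters]
        best = max(range(len(clusters)), key=overlaps.__getitem__)
        labels.append(best if overlaps[best] > 0 else None)
    return labels


def search(query_terms, clusters, labels):
    """Return ids of documents whose cluster contains any query term."""
    hits = {i for i, c in enumerate(clusters) if c & set(query_terms)}
    return [doc_id for doc_id, label in enumerate(labels) if label in hits]


# Toy input: keywords already extracted from four target documents.
docs = [
    {"engine", "wheel", "brake"},
    {"engine", "wheel"},
    {"protein", "enzyme"},
    {"protein", "enzyme", "cell"},
]
rates = interrelation_rates(docs)
clusters = build_clusters(rates, threshold=0.5)
labels = classify(docs, clusters)
results = search(["enzyme"], clusters, labels)
```

In this toy run the vehicle-related keywords and the biology-related keywords end up in separate clusters, and a query for "enzyme" returns the two biology documents. A real system would need a keyword extraction step and a rate definition taken from the full specification rather than the stand-ins used here.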