Abstract |
PROBLEM TO BE SOLVED: To obtain an information retrieval unit that calculates a degree of similarity reflecting the relations between keywords, thereby improving precision in classification or retrieval. SOLUTION: The unit comprises a document database 10 storing multiple kinds of document data; a vector generating means 20 for generating a keyword feature vector for each kind of document data; a classifying means 30 for calculating the degree of similarity between the feature vectors and classifying the document data; and an output means 40 for outputting the classification result for the document data. The vector generating means 20 analyzes each kind of document data, extracts the keywords and the relations between the keywords, and generates the feature vector based on the appearance frequencies of both.
|
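The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how keywords or relations are extracted, which similarity measure is used, or how classification is decided, so everything here is an assumption made for illustration: keywords are non-stop-words, a "relation" is a pair of keywords co-occurring in one sentence, similarity is cosine similarity, and classification picks the label of the most similar labeled example. The names `STOP_WORDS`, `extract_keywords`, `feature_vector`, `cosine_similarity`, and `classify` are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; the abstract does not say how keywords are chosen.
STOP_WORDS = {"a", "an", "the", "of", "and", "to", "in", "is", "for", "on", "by", "with"}


def extract_keywords(sentence):
    """Return the lowercase words of a sentence that are not stop words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def feature_vector(document):
    """Count keyword occurrences and keyword-pair (relation) occurrences.

    Assumption: a relation is an unordered pair of distinct keywords
    that appear in the same sentence.
    """
    vector = Counter()
    for sentence in re.split(r"[.!?]+", document):
        keywords = extract_keywords(sentence)
        for kw in keywords:
            vector[("kw", kw)] += 1
        for a, b in combinations(sorted(set(keywords)), 2):
            vector[("rel", a, b)] += 1
    return vector


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors (assumed measure)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(document, labeled_examples):
    """Return the label of the labeled example most similar to the document.

    labeled_examples is a list of (label, text) pairs.
    """
    target = feature_vector(document)
    best = max(
        labeled_examples,
        key=lambda item: cosine_similarity(target, feature_vector(item[1])),
    )
    return best[0]


if __name__ == "__main__":
    examples = [
        ("search", "The query retrieves documents from the index."),
        ("cooking", "Boil the pasta and add the sauce."),
    ]
    print(classify("A query retrieves indexed documents.", examples))
```

Because the relation counts sit in the same vector as the keyword counts, two documents that share keywords but pair them differently across sentences score lower than two documents that share both, which is the effect the abstract attributes to reflecting keyword relations.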