Abstract |
PURPOSE: To provide a system capable of automatically extracting almost all key words by utilizing document data in plural fields and the appearance frequency of each word in each field. CONSTITUTION: Each item of document data (a) is divided into words by a document data word dividing part 1, and the divided words are stored in a word division table (b). The appearance frequency of each word in each item of document data is detected by a word appearance frequency detecting part 2 and registered in a word appearance frequency table (c). Based upon the table (c), the appearance frequency of each word in each field is totalized by a field-sorted word appearance frequency totalizing part 3 and registered in a field-sorted appearance word totalizing table. Based upon the appearance frequencies in the totalizing table, a key word in each field is extracted from the document data by a key word extracting part 4, an unnecessary word which is not a key word in each field is extracted by an unnecessary word extracting part 5, and the extracted key words in each field and the unnecessary words are registered in a key word/unnecessary word dictionary (d). At the time of allocating a key word to document data, a key word allocating part 7 refers to the key words and unnecessary words stored in the dictionary (d), extracts the words to be key words from the words in the document data to which key words are to be allocated, and allocates the extracted key words to that document data.
|
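
The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the rule that separates key words from unnecessary words, so the rule used here is purely an assumption: a word is treated as a key word of a field when it occurs at least `min_count` times overall and at least `min_share` of its occurrences fall in that one field, and every other word is treated as unnecessary. The function names, the thresholds, and the regular-expression tokenizer are all hypothetical stand-ins for the numbered parts, not the patented implementation.

```python
import re
from collections import Counter, defaultdict


def split_words(text):
    """Stand-in for word dividing part 1 (assumption: lowercase alphabetic tokens)."""
    return re.findall(r"[a-z]+", text.lower())


def count_by_field(documents):
    """Stand-in for parts 2 and 3.

    documents is a list of (field, text) pairs. Each text is counted on its own
    (per-document frequency), then the counts are totalized per field.
    """
    field_totals = defaultdict(Counter)
    for field, text in documents:
        per_document = Counter(split_words(text))
        field_totals[field].update(per_document)
    return field_totals


def build_dictionary(field_totals, min_share=0.8, min_count=2):
    """Stand-in for parts 4 and 5, using the ASSUMED concentration rule.

    Returns (keywords, unnecessary): keywords maps field -> set of words,
    unnecessary is the set of words that are key words in no field.
    """
    overall = Counter()
    for counts in field_totals.values():
        overall.update(counts)

    keywords = defaultdict(set)
    unnecessary = set()
    for word, total in overall.items():
        # Field in which this word appears most often.
        best_field = max(field_totals, key=lambda f: field_totals[f][word])
        share = field_totals[best_field][word] / total
        if total >= min_count and share >= min_share:
            keywords[best_field].add(word)
        else:
            unnecessary.add(word)
    return dict(keywords), unnecessary


def allocate_keywords(text, keywords, unnecessary):
    """Stand-in for allocating part 7: keep words registered as key words."""
    registered = set().union(*keywords.values()) if keywords else set()
    allocated = []
    for word in split_words(text):
        if word in unnecessary:
            continue  # registered as unnecessary, never allocated
        if word in registered and word not in allocated:
            allocated.append(word)
    return allocated
```

With a small two-field corpus, words concentrated in one field (for example "vaccine" in a medical field) would be registered as key words under this assumed rule, while words spread evenly across fields (for example "the") would be registered as unnecessary. A real system would need a language-appropriate word divider and a tuned criterion in place of these placeholders.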