Abstract |
<P>PROBLEM TO BE SOLVED: To extract hierarchized keywords by selecting words characteristic of a plurality of documents as upper-hierarchy keywords, and to appropriately narrow down the keywords of each hierarchy. <P>SOLUTION: Keywords are extracted from each document, and for each keyword a cumulative count is taken, which is the number of documents containing that keyword. In addition, when the same combination of keywords appears in a plurality of documents, those documents are treated as a single virtual document, and a unique count is taken, which is the number of documents containing the keyword under that treatment. Single words that rank high in both the cumulative count and the unique count are extracted as keywords of the highest hierarchy, and compound words that rank high in either the cumulative count or the unique count are extracted as keywords of the second-highest hierarchy. After extraction, highest-hierarchy keywords and second-highest-hierarchy keywords contained in the same document are associated with each other and hierarchized. <P>COPYRIGHT: (C)2012,JPO&INPIT |
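The counting and hierarchization steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (`top_ranked`, `hierarchize`), the `top_n` cutoff, the alphabetical tie-breaking, the rule that a keyword containing a space is a compound word, the choice to rank single words and compound words separately, and the sample documents are all assumptions made for illustration; the abstract specifies none of them.

```python
from collections import Counter


def top_ranked(counts, n):
    """Return the n keys with the highest counts (ties broken alphabetically; an assumption)."""
    ranked = sorted(counts, key=lambda k: (-counts[k], k))
    return set(ranked[:n])


def hierarchize(docs, top_n=2):
    """docs: list of keyword sets, one set per document (keyword extraction is assumed done)."""
    # Cumulative count: number of documents containing each keyword.
    cumulative = Counter(kw for doc in docs for kw in set(doc))

    # Unique count: documents sharing an identical keyword combination
    # are treated as one virtual document before counting.
    distinct_combos = {frozenset(doc) for doc in docs}
    unique = Counter(kw for combo in distinct_combos for kw in combo)

    # Assumption: a keyword containing a space is a compound word.
    singles = {k for k in cumulative if " " not in k}
    compounds = set(cumulative) - singles

    def restrict(counts, keys):
        return {k: counts[k] for k in keys}

    # Highest hierarchy: single words ranked high in BOTH counts.
    level1 = (top_ranked(restrict(cumulative, singles), top_n)
              & top_ranked(restrict(unique, singles), top_n))
    # Second-highest hierarchy: compound words ranked high in EITHER count.
    level2 = (top_ranked(restrict(cumulative, compounds), top_n)
              | top_ranked(restrict(unique, compounds), top_n))

    # Associate level-1 and level-2 keywords that co-occur in the same document.
    hierarchy = {parent: set() for parent in level1}
    for doc in docs:
        doc = set(doc)
        for parent in doc & level1:
            hierarchy[parent] |= doc & level2
    return cumulative, unique, hierarchy


# Hypothetical sample data: each set holds the keywords of one document.
docs = [
    {"engine", "fuel injection"},
    {"engine", "fuel injection"},  # same combination as the first document
    {"engine", "exhaust valve"},
    {"battery", "charge control"},
    {"motor", "charge control"},
]
cumulative, unique, hierarchy = hierarchize(docs, top_n=2)
```

In this sample, the two documents with the identical combination `{"engine", "fuel injection"}` collapse into one virtual document, so the unique count of "fuel injection" is lower than its cumulative count. This shows how the unique count discounts keywords whose frequency comes from near-duplicate documents.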