Abstract |
<P>PROBLEM TO BE SOLVED: To automatically set appropriate word breaks in document data that is the target of text mining. <P>SOLUTION: This feature word extraction device comprises: a document storage part that stores a plurality of pieces of document data; a generation part that divides each clause in a first piece of document data among the plurality of pieces of document data while varying the break positions and the number of breaks, and stores the character strings obtained by the division in a data storage part; a calculation part that calculates a feature degree for each character string stored in the data storage part, using the appearance frequency of the character string in the first piece of document data and the number of pieces of document data, among those stored in the document storage part, in which the character string appears; and a specification part that, for each clause in the first piece of document data, specifies the character string with the highest feature degree among the character strings for that clause and stores it in a feature word storage part. <P>COPYRIGHT: (C)2012,JPO&INPIT |
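The generation, calculation, and specification parts described above can be illustrated with a minimal Python sketch. Several points here are assumptions and are not stated in the abstract: the feature degree is taken to be a TF-IDF style product (frequency in the first document times the logarithm of the inverse document frequency), a document is modeled as a list of clause strings, ties are broken in favor of the longer string, and the names `candidate_strings`, `feature_degree`, and `feature_words` are hypothetical.

```python
import math
from itertools import combinations


def candidate_strings(clause):
    """Generation part: split the clause at every possible set of break
    positions (0 breaks up to len(clause) - 1 breaks) and collect the
    resulting segments. Exponential in clause length; sketch only."""
    n = len(clause)
    segments = set()
    for num_breaks in range(n):
        for breaks in combinations(range(1, n), num_breaks):
            bounds = (0, *breaks, n)
            for start, end in zip(bounds, bounds[1:]):
                segments.add(clause[start:end])
    return segments


def feature_degree(string, first_doc, documents):
    """Calculation part (assumed TF-IDF form, not given in the abstract):
    appearance frequency in the first document multiplied by
    log(total documents / documents containing the string)."""
    tf = sum(clause.count(string) for clause in first_doc)
    df = sum(1 for doc in documents
             if any(string in clause for clause in doc))
    if df == 0:
        return 0.0
    return tf * math.log(len(documents) / df)


def feature_words(first_doc, documents):
    """Specification part: for each clause, keep the candidate string
    with the highest feature degree (assumed tie-break: longer string)."""
    result = []
    for clause in first_doc:
        candidates = sorted(candidate_strings(clause))
        if not candidates:
            continue  # skip empty clauses
        best = max(candidates,
                   key=lambda s: (feature_degree(s, first_doc, documents),
                                  len(s)))
        result.append((clause, best))
    return result
```

As a usage illustration with toy data, calling `feature_words(docs[0], docs)` on `docs = [["abab", "abc"], ["c"], ["zz"]]` selects `"ab"` for both clauses under these assumptions, since `"ab"` recurs in the first document and appears in no other. The set of segments produced by all possible splits equals the set of contiguous substrings of the clause, so a practical implementation could enumerate substrings directly.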