Abstract |
<p>A computer (14) of a document analyzing device (10) sequentially creates a text corpus Ct from language materials that increase in time series (step S3), adds article information by decomposing the text data into morphemes (step S5), excludes unnecessary morphemes according to the article information (step S7), and calculates a time-increasing TFIDF for each morpheme (step S11). It then calculates the sum of TFs up to the current corpus (STF) and the sum of the time-increasing TFIDF (S time-increasing TFIDF) (step S13), and performs residual analysis of the S time-increasing TFIDF (the actually measured value) of the current corpus against the regression curve created during processing of the previous corpus (step S17). A morpheme having a large positive residual value is selected as a unique word, and a morpheme having a small negative residual value is selected as a common word.</p> |
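
The residual analysis of step S17 can be illustrated with a minimal sketch. The abstract does not give the form of the regression curve, the definition of the time-increasing TFIDF, or the selection thresholds, so everything below is an assumption made for illustration: the curve is taken to be a least-squares straight line relating STF to S time-increasing TFIDF, "small negative" is read as "strongly negative", and the names `fit_line`, `classify`, and `threshold` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept.

    Assumption: the "regression curve" is a straight line fitted to
    (STF, S time-increasing TFIDF) pairs from the previous corpus.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def classify(current, slope, intercept, threshold):
    """Label morphemes by residual = measured value - predicted value.

    current maps morpheme -> (STF, S time-increasing TFIDF).
    threshold is a hypothetical cut-off, not taken from the source.
    """
    labels = {}
    for morpheme, (stf, s_tfidf) in current.items():
        residual = s_tfidf - (slope * stf + intercept)
        if residual > threshold:
            labels[morpheme] = "unique"   # large positive residual
        elif residual < -threshold:
            labels[morpheme] = "common"   # strongly negative residual
    return labels


# Toy values standing in for the previous corpus (not real data).
slope, intercept = fit_line([1, 2, 3], [2, 4, 6])

# Toy values standing in for the current corpus.
current = {"a": (4, 12.0), "b": (4, 5.0), "c": (4, 8.0)}
labels = classify(current, slope, intercept, threshold=1.0)
print(labels)  # {'a': 'unique', 'b': 'common'}
```

Morpheme `c` lies exactly on the fitted line, so it receives no label. A real implementation would substitute whatever curve and thresholds the full specification defines.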