Title of invention: DOCUMENT ANALYZING DEVICE AND METHOD
Abstract: A computer (14) of a document analyzing device (10) sequentially creates a text corpus Ct from language material that accumulates over time (step S3), decomposes the text data into morphemes and attaches article information to each (step S5), and excludes unnecessary morphemes according to that article information (step S7). It then calculates a time-increasing TFIDF for each morpheme (step S11), calculates the sum of TFs up to the current corpus (STF) and the sum of the time-increasing TFIDF values (S time-increasing TFIDF) (step S13), and performs residual analysis of the S time-increasing TFIDF of the current corpus (the actually measured value) against the regression curve created while processing the previous corpus (step S17). A morpheme with a large positive residual is selected as a unique word, and one with a small negative residual is selected as a common word.
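The steps in the abstract (S11 to S17) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method as claimed: the abstract does not give the TFIDF formula, so the common form tf * log(N / df) is assumed, with N and df counted over all documents seen so far; the "regression curve" is assumed to be a least-squares straight line of S time-increasing TFIDF on STF; and "small negative residual" is read here as a residual at or below a negative cut-off. The function names and the `threshold` parameter are hypothetical, and morphemes are assumed to be already extracted and filtered (steps S5 and S7).

```python
import math
from collections import Counter


def cumulative_points(corpora):
    """Return, per time step, {morpheme: (STF, S_TFIDF)}.

    corpora: list of corpora in time order; each corpus is a list of
    documents; each document is a list of morphemes.
    Assumption: TFIDF = tf * log(N / df), with N and df accumulated
    over every document seen up to the current corpus.
    """
    docs_seen = 0
    df = Counter()        # number of documents containing each morpheme
    stf = Counter()       # running sum of term frequencies (STF)
    s_tfidf = Counter()   # running sum of time-increasing TFIDF
    history = []
    for corpus in corpora:
        tf = Counter()
        for doc in corpus:
            docs_seen += 1
            tf.update(doc)
            df.update(set(doc))
        for morpheme, freq in tf.items():
            stf[morpheme] += freq
            s_tfidf[morpheme] += freq * math.log(docs_seen / df[morpheme])
        history.append({m: (stf[m], s_tfidf[m]) for m in stf})
    return history


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def select_words(previous, current, threshold):
    """Residual analysis of the current points against the line fitted
    on the previous corpus. `threshold` is a hypothetical cut-off."""
    slope, intercept = fit_line(
        [p[0] for p in previous.values()],
        [p[1] for p in previous.values()],
    )
    residuals = {
        m: y - (slope * x + intercept) for m, (x, y) in current.items()
    }
    unique = sorted(m for m, r in residuals.items() if r >= threshold)
    common = sorted(m for m, r in residuals.items() if r <= -threshold)
    return unique, common, residuals
```

In use, `history = cumulative_points(corpora)` would be computed once, and `select_words(history[t - 1], history[t], threshold)` called for each new corpus t. A straight line keeps the sketch short; any other regression model could be substituted in `fit_line` without changing the residual step.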
Publication number: WO2008062910(A1) Publication date: 2008.05.29
Application number: WO2007JP73257 Filing date: 2007.11.22
Applicant: HAYASHI, HARUO Inventor: HAYASHI, HARUO
Classification: G06F17/30 Main classification: G06F17/30