Title of invention Evaluating distinctiveness of document
Abstract In natural language processing, two document sets are compared and the distinctiveness of each constituent element (such as a sentence, term, or phrase) of one document set is evaluated. Both the target document and the comparison document are divided into document segments, and for each segment a sentence vector is constructed whose components are the frequencies of the terms occurring in that segment. The sentence vectors of both documents are projected onto a projection axis, and the axis is chosen to maximize the ratio (sum of squared projected values originating from the target document)/(sum of squared projected values originating from the comparison document). The projected values are obtained by projecting the sentence vectors onto this axis, and the degree of distinctiveness of each sentence of the target document is calculated from its projected value.
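The procedure in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy sentences, the ridge term that keeps the comparison matrix invertible, the use of power iteration to find the axis, and the choice of the squared projected value as the distinctiveness score are all assumptions made for illustration. Writing S_T and S_C for the sums of outer products of the target and comparison sentence vectors, the ratio for an axis a is (a^T S_T a)/(a^T S_C a), which is maximized by the dominant eigenvector of S_C^{-1} S_T.

```python
from collections import Counter
import math


def term_vectors(segments, vocab):
    """One term-frequency vector per segment (a segment is a list of tokens)."""
    vectors = []
    for tokens in segments:
        counts = Counter(tokens)
        vectors.append([float(counts[term]) for term in vocab])
    return vectors


def scatter(vectors, dim):
    """Sum of outer products d d^T, so a^T S a is the sum of squared projections."""
    s = [[0.0] * dim for _ in range(dim)]
    for d in vectors:
        for i in range(dim):
            for j in range(dim):
                s[i][j] += d[i] * d[j]
    return s


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def projection_axis(target_vecs, comparison_vecs, dim, ridge=1e-3, iters=100):
    """Power iteration on S_C^{-1} S_T: approximates the axis maximizing the ratio."""
    s_t = scatter(target_vecs, dim)
    s_c = scatter(comparison_vecs, dim)
    for i in range(dim):
        s_c[i][i] += ridge  # assumption: ridge term keeps S_C invertible
    axis = [1.0] * dim
    for _ in range(iters):
        s_t_axis = [sum(a * b for a, b in zip(row, axis)) for row in s_t]
        axis = solve(s_c, s_t_axis)
        norm = math.sqrt(sum(v * v for v in axis))
        axis = [v / norm for v in axis]
    return axis


# Toy data (hypothetical): each segment is one tokenized sentence.
target = [["apple", "pie"], ["weather", "report"]]
comparison = [["weather", "report"], ["weather", "rain"]]
vocab = sorted({t for seg in target + comparison for t in seg})

target_vecs = term_vectors(target, vocab)
comparison_vecs = term_vectors(comparison, vocab)
axis = projection_axis(target_vecs, comparison_vecs, len(vocab))

# Assumed score: squared projected value of each target sentence.
scores = [sum(d_i * a_i for d_i, a_i in zip(d, axis)) ** 2 for d in target_vecs]
for sentence, score in zip(target, scores):
    print(" ".join(sentence), round(score, 4))
```

In this toy case the sentence "apple pie" receives the higher score because its terms never occur in the comparison document, while "weather report" also appears there and projects to nearly zero. A production version would more likely hand the generalized eigenvalue problem to a linear algebra library than use hand-written elimination.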
Publication number US2004006736(A1) Publication date 2004.01.08
Application number US20030460469 Filing date 2003.06.13
Applicant KAWATANI TAKAHIKO Inventor KAWATANI TAKAHIKO
Classification G06F17/21;G06F17/27;G06F17/28;G06F17/30;G06K9/62;(IPC1-7):G06F15/00 Main classification G06F17/21
Agency Agent
Main claim
Address