Title: Distinguishing facts from opinions using a multi-stage approach
Abstract: Facts are extracted from electronic documents by recognizing factual descriptions, using a fact-word table to match against the words of the electronic documents. The words of those factual descriptions may be tagged with the appropriate part of speech. More detailed analysis is then performed on those factual descriptions rather than on the entire electronic document, particularly on the text in the neighborhood of the fact-word matches. The analysis may involve identifying the linguistic constituents of each phrase and determining whether its role is subject or object. Exclusion rules, based in part on the linguistic constituents, may be applied to eliminate phrases unlikely to be part of facts. Scoring rules may be applied to the remaining phrases, and for each phrase whose score exceeds a threshold, the corresponding sentence part, whole sentence, paragraph, or other document portion may be presented as representing one or more facts.
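Publication number: US7668791(B2)  Publication date: 2010.02.23
Application number: US20060496650  Filing date: 2006.07.31
Applicant: MICROSOFT CORPORATION  Inventors: AZZAM SALIHA; HUMPHREYS KEVIN WILLIAM
Classification: G06N5/00  Main classification: G06N5/00
Agency:  Agent:
Independent claim:
Address: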
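
The staged flow described in the abstract (fact-word matching, part-of-speech tagging, subject/object roles, exclusion rules, thresholded scoring) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the contents of FACT_WORDS and OPINION_CUES, the toy tagger, the split of each sentence into "subject" and "object" around the matched fact word, the individual rules, and the threshold value. For simplicity the sketch treats the whole sentence as the neighborhood of a match.

```python
import re
from dataclasses import dataclass

# Hypothetical fact-word table: illustrative entries only.
FACT_WORDS = {"founded", "acquired", "born", "released", "located"}
# Hypothetical opinion cues used by the exclusion rule.
OPINION_CUES = {"think", "believe", "probably", "best", "worst"}
PRONOUNS = {"i", "we", "you", "he", "she", "it", "they"}


@dataclass
class Candidate:
    sentence: str
    tokens: list
    subject: list  # tokens before the fact word (crude stand-in for a parse)
    obj: list      # tokens after the fact word


def tag(token):
    """Toy part-of-speech tagger; a real system would use a trained tagger."""
    if token.lower() in PRONOUNS:
        return "PRON"
    if token.isdigit():
        return "NUM"
    if token[0].isupper():
        return "PROPN"  # crude: also fires on sentence-initial words
    return "X"


def find_candidates(text):
    """Stage 1: keep only sentences that contain a fact word."""
    candidates = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"[A-Za-z0-9']+", sentence)
        for i, tok in enumerate(tokens):
            if tok.lower() in FACT_WORDS:
                candidates.append(
                    Candidate(sentence, tokens, tokens[:i], tokens[i + 1:])
                )
                break  # one match per sentence is enough for this sketch
    return candidates


def is_excluded(c):
    """Stage 2: drop candidates missing a subject or object, or with opinion cues."""
    if not c.subject or not c.obj:
        return True
    return any(tok.lower() in OPINION_CUES for tok in c.tokens)


def score(c):
    """Stage 3: reward a named subject and a concrete (named or numeric) object."""
    subject_tags = {tag(t) for t in c.subject}
    object_tags = {tag(t) for t in c.obj}
    points = 0
    if "PROPN" in subject_tags:
        points += 1
    if object_tags & {"PROPN", "NUM"}:
        points += 1
    return points


def extract_facts(text, threshold=1):
    """Return sentences whose score exceeds the (assumed) threshold."""
    return [
        c.sentence
        for c in find_candidates(text)
        if not is_excluded(c) and score(c) > threshold
    ]
```

Given the text "Microsoft was founded in 1975. I think the company probably acquired the best startups. The weather is nice.", extract_facts keeps only the first sentence: the second is removed by the opinion-cue exclusion rule, and the third never becomes a candidate because it contains no fact word. Running the cheap table match first means the more detailed stages only ever see the candidate sentences, which is the point of the multi-stage design.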