Title of invention: Systems and methods for authoritativeness grading, estimation and sorting of documents in large heterogeneous document collections
Abstract: Systems and methods for determining the authoritativeness of a document based on textual, non-topical cues. The authoritativeness of a document is determined by evaluating a set of document content features contained within each document to determine a set of document content feature values, processing the set of document content feature values through a trained document textual authority model, and determining a textual authoritativeness value and/or textual authority class for each document evaluated using the predictive models included in the trained document textual authority model. Estimates of a document's textual authoritativeness value and/or textual authority class can be used to re-rank documents previously retrieved by a search, to expand and improve document query searches, to provide a more complete and robust determination of a document's authoritativeness, and to improve the aggregation of rank-ordered lists with numerically-ordered lists.
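The pipeline the abstract outlines (extract content feature values, pass them through a trained model, then re-rank retrieved documents by the resulting score) might be sketched roughly as follows. This is only an illustration under stated assumptions: the three features, the weights, the bias, and the function names `extract_features`, `authority_score`, and `rerank` are all hypothetical, and the logistic scoring is a stand-in rather than the predictive models claimed in the application.

```python
import math
import re

def extract_features(text):
    """Compute a few non-topical cues (hypothetical; not the patent's feature set)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)  # avoid division by zero on empty text
    first_person = sum(w.lower() in {"i", "me", "my"} for w in words)
    return {
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "exclamation_rate": text.count("!") / n_words,
        "first_person_rate": first_person / n_words,
    }

# Hypothetical weights standing in for a trained textual authority model.
WEIGHTS = {
    "avg_sentence_len": 0.1,
    "exclamation_rate": -5.0,
    "first_person_rate": -3.0,
}
BIAS = -1.0

def authority_score(text):
    """Map feature values to a score in (0, 1) with a logistic function."""
    feats = extract_features(text)
    z = BIAS + sum(WEIGHTS[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))

def rerank(docs):
    """Re-rank previously retrieved documents by descending authority score."""
    return sorted(docs, key=authority_score, reverse=True)

if __name__ == "__main__":
    retrieved = [
        "I love my cat! It is great!",
        "The study measured plasma concentrations in forty adult patients over twelve weeks.",
    ]
    for doc in rerank(retrieved):
        print(round(authority_score(doc), 3), doc)
```

With these illustrative weights, the longer, impersonal sentence outranks the short exclamatory one; a real system would learn its weights from labeled documents rather than fix them by hand.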
Publication number: US2003225750(A1) Publication date: 2003.12.04
Application number: US20020232709 Filing date: 2002.09.03
Applicant: XEROX CORPORATION Inventors: FARAHAT AYMAN O.; CHEN FRANCINE R.; MATHIS CHARLES R.; NUNBERG GEOFFREY D.
Classification: G06F17/27; G06F17/30; (IPC1-7): G06F17/30 Main classification: G06F17/27
Agency: Agent:
Main claim:
Address: