Title of invention SYSTEMS AND METHODS FOR GENERATING CONCEPTS FROM A DOCUMENT CORPUS
Abstract Systems and methods for generating concepts from a document corpus are disclosed. In one embodiment, a method for generating concepts from a document corpus includes retrieving a plurality of terms stored within a first lexicon. The method further includes, for each individual term stored within the first lexicon: determining a first frequency of the term within the document corpus; determining a second frequency of the term within a comparison document corpus that includes a plurality of comparison documents and is different from the document corpus; determining a difference between the first frequency and the second frequency; comparing that difference to a comparison metric; and, when the difference satisfies the comparison metric, storing the term as a concept within a second lexicon.
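The steps in the abstract can be illustrated with a minimal sketch. The abstract does not say how a frequency is measured or what the comparison metric is, so everything specific here is an assumption made for illustration: frequency is taken as occurrences divided by total tokens, the metric is a fixed threshold on the difference, terms are single words, and the names `relative_frequencies`, `generate_concepts`, and `threshold` are hypothetical rather than taken from the application.

```python
import re
from collections import Counter


def relative_frequencies(documents):
    """Map each token to (occurrences / total tokens) across the documents.

    Assumption: simple lowercase alphanumeric tokenization; the abstract
    does not define how terms are matched or counted.
    """
    tokens = []
    for doc in documents:
        tokens.extend(re.findall(r"[a-z0-9]+", doc.lower()))
    total = len(tokens)
    if total == 0:
        return {}
    return {tok: count / total for tok, count in Counter(tokens).items()}


def generate_concepts(first_lexicon, corpus, comparison_corpus, threshold=0.05):
    """Return the terms whose frequency difference meets the threshold.

    Assumption: the "comparison metric" is a fixed minimum difference.
    """
    corpus_freq = relative_frequencies(corpus)
    comparison_freq = relative_frequencies(comparison_corpus)

    second_lexicon = []
    for term in first_lexicon:
        key = term.lower()
        first_frequency = corpus_freq.get(key, 0.0)
        second_frequency = comparison_freq.get(key, 0.0)
        difference = first_frequency - second_frequency
        if difference >= threshold:
            # The term is distinctive for the corpus: keep it as a concept.
            second_lexicon.append(term)
    return second_lexicon


# Made-up example data, not drawn from the application.
corpus = [
    "The court granted the motion to dismiss.",
    "The motion was denied by the court.",
]
comparison_corpus = [
    "The weather was mild and the sky was clear.",
    "The team won the game.",
]
concepts = generate_concepts(["court", "motion", "the"], corpus, comparison_corpus)
```

In this toy data, "the" is equally frequent in both corpora, so its difference is zero and it is filtered out, while "court" and "motion" appear only in the first corpus and are kept. Other metrics, such as a frequency ratio or a statistical test, would fit the abstract's wording equally well.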
Publication number WO2016172288(A1) Publication date 2016.10.27
Application number WO2016US28558 Filing date 2016.04.21
Applicant LEXISNEXIS, A DIVISION OF REED ELSEVIER INC. Inventors ZHANG, Paul; SHARMA, Sanjay; STEINER, David; WASSON, Mark, David; SILVER, Harry, R.; WARLING, Robin
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address