Title of invention |
SYSTEMS AND METHODS FOR GENERATING CONCEPTS FROM A DOCUMENT CORPUS |
Abstract |
Systems and methods for generating concepts from a document corpus are disclosed. In one embodiment, a method for generating concepts from a document corpus includes retrieving a plurality of terms stored within a first lexicon. The method further includes, for individual terms stored within the first lexicon: determining a first frequency of the term within the document corpus, and determining a second frequency of the term within a comparison document corpus including a plurality of comparison documents, wherein the comparison document corpus is different from the document corpus. The method further includes, for individual terms within the first lexicon: determining a difference between the first frequency and the second frequency, comparing the difference to a comparison metric, and, when the difference satisfies the comparison metric, storing the term as a concept within a second lexicon. |
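The steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the abstract does not say how frequency is measured or what the comparison metric is, so the sketch assumes single-word terms, relative token frequency, and a fixed threshold on the difference. The names `relative_frequencies`, `generate_concepts`, and `threshold` are hypothetical.

```python
import re
from collections import Counter


def relative_frequencies(documents, lexicon):
    """Return, for each lexicon term, its share of all tokens in `documents`.

    Assumption: terms are single lowercase-comparable words, and "frequency"
    means relative token frequency. The abstract does not specify either.
    """
    tokens = [t for doc in documents for t in re.findall(r"[a-z]+", doc.lower())]
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on an empty corpus
    return {term: counts[term.lower()] / total for term in lexicon}


def generate_concepts(first_lexicon, corpus, comparison_corpus, threshold=0.05):
    """Build a second lexicon from terms over-represented in `corpus`.

    Assumption: the "comparison metric" is a simple threshold, satisfied when
    (first frequency - second frequency) >= threshold.
    """
    first = relative_frequencies(corpus, first_lexicon)
    second = relative_frequencies(comparison_corpus, first_lexicon)

    second_lexicon = []
    for term in first_lexicon:
        difference = first[term] - second[term]
        if difference >= threshold:
            second_lexicon.append(term)  # store the term as a concept
    return second_lexicon
```

Under these assumptions, a term that is common in both corpora (a function word such as "the") tends to produce a small difference and is filtered out once the threshold is high enough, while a term concentrated in the document corpus is kept as a concept.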
Application publication number |
WO2016172288(A1) |
Application publication date |
2016.10.27 |
Application number |
WO2016US28558 |
Filing date |
2016.04.21 |
Applicant |
LEXISNEXIS, A DIVISION OF REED ELSEVIER INC. |
Inventors |
ZHANG, Paul;SHARMA, Sanjay;STEINER, David;WASSON, Mark, David;SILVER, Harry, R.;WARLING, Robin |
Classification number |
G06F17/30 |
Main classification number |
G06F17/30 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|