Title of Invention: GENERATING DESCRIPTIVE TOPIC LABELS
Abstract: A method to generate a topic label for a set of electronic documents may include crawling, by a processor, the set of electronic documents. The method may include extracting a plurality of knowledge points from the set of electronic documents. The method may also include selecting a candidate set of knowledge points from the plurality of knowledge points based on occurrence values. The method may include calculating relatedness scores between the knowledge points in the candidate set. The method may also include calculating hierarchical relationships between the knowledge points in the candidate set. The method may further include calculating a comprehensive score for each knowledge point in the candidate set based on the relatedness scores and the hierarchical relationships. The method may include selecting, from the candidate set of knowledge points, a first candidate knowledge point with the highest comprehensive score as a topic label for the set of electronic documents.
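Publication No.: US2017103074(A1) Publication Date: 2017.04.13
Application No.: US201514880087 Filing Date: 2015.10.09
Applicant: FUJITSU LIMITED Inventors: WANG Jun; UCHINO Kanji
Classification No.: G06F17/30 Main Classification No.: G06F17/30
Agency: Agent: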
Main Claim: 1. A method comprising: crawling, by a processor, a set of electronic documents stored at least temporarily in a non-transitory storage media; extracting a plurality of knowledge points from the set of electronic documents; selecting a candidate set of knowledge points from the plurality of knowledge points based on occurrence values of the plurality of knowledge points in the set of electronic documents; calculating relatedness scores between each knowledge point in the candidate set of knowledge points; calculating hierarchical relationships between each knowledge point in the candidate set of knowledge points; calculating comprehensive scores for each knowledge point in the candidate set of knowledge points based on the relatedness scores and the hierarchical relationships; and selecting, from the set of candidate knowledge points, a first candidate knowledge point that has a highest comprehensive score as a topic label for the set of electronic documents.
Address: Kawasaki-shi JP
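
The steps recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and every concrete choice in it is an assumption made for illustration: the documents are already-crawled in-memory strings, knowledge points are matched against a hypothetical `vocabulary` of phrases, the occurrence value is a document count, relatedness is Jaccard overlap of the documents that mention two points, the hierarchical relationship is a simple subsumption heuristic, and the comprehensive score is a weighted sum. The record itself does not specify any of these formulas, and the names `label_topic`, `min_occurrence`, `hierarchy_weight`, and `subsume` are hypothetical.

```python
from itertools import combinations


def extract_knowledge_points(doc, vocabulary):
    """Return the vocabulary phrases found in one document (case-insensitive).

    Assumption: knowledge points are drawn from a fixed phrase list.
    """
    text = doc.lower()
    return {kp for kp in vocabulary if kp in text}


def label_topic(docs, vocabulary, min_occurrence=2,
                hierarchy_weight=1.0, subsume=0.8):
    """Pick one knowledge point as the topic label for `docs`.

    Returns (label, scores); label is None when no candidate survives.
    """
    # Map each knowledge point to the indices of documents that mention it.
    doc_sets = {}
    for i, doc in enumerate(docs):
        for kp in extract_knowledge_points(doc, vocabulary):
            doc_sets.setdefault(kp, set()).add(i)

    # Candidate set: points whose occurrence value (document count,
    # by assumption) reaches the threshold.
    candidates = sorted(kp for kp, ds in doc_sets.items()
                        if len(ds) >= min_occurrence)
    if not candidates:
        return None, {}

    relatedness = {kp: 0.0 for kp in candidates}
    children = {kp: 0 for kp in candidates}
    for a, b in combinations(candidates, 2):
        da, db = doc_sets[a], doc_sets[b]
        shared = len(da & db)

        # Relatedness (assumed): Jaccard overlap of the two document sets.
        jaccard = shared / len(da | db)
        relatedness[a] += jaccard
        relatedness[b] += jaccard

        # Hierarchy (assumed): the more frequent point is treated as the
        # parent when it covers at least `subsume` of the other's documents.
        if len(da) > len(db) and shared / len(db) >= subsume:
            children[a] += 1
        elif len(db) > len(da) and shared / len(da) >= subsume:
            children[b] += 1

    # Comprehensive score (assumed): total relatedness plus weighted child count.
    scores = {kp: relatedness[kp] + hierarchy_weight * children[kp]
              for kp in candidates}
    best = max(candidates, key=lambda kp: scores[kp])
    return best, scores
```

Calling `label_topic(docs, vocabulary)` with a list of document strings and a phrase list returns the highest-scoring candidate together with the full score table, so the ranking can be inspected. Raising `hierarchy_weight` favors broader knowledge points, while lowering it favors points that co-occur with many others.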