Title of invention SYSTEM AND METHOD FOR DISAMBIGUATING TEXT LABELING CONTENT OBJECTS
Abstract An improved system and method for disambiguating text strings that label content objects is provided. A text string set may be received from a user. Frequencies of co-occurring text strings in a text collection may be obtained, and a disambiguation measure may be determined for a pair of text strings that each co-occur with a text string in the text string set. The disambiguation measure may be based on a weighted Kullback-Leibler (KL) divergence of text string distributions, which maximizes the divergence value when the text string set occurs in different contexts. A disambiguation measure may be determined for each pair in a list of the most common pairs of text strings that co-occur with the text string set, and those pairs whose disambiguation measure exceeds a threshold may be output in decreasing order of disambiguation measure.
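The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not state the exact weighting, so the sketch assumes a symmetric KL divergence weighted by the product of each candidate's co-occurrence frequency with the query set. The function names, the additive smoothing constant, the `top_n` and `threshold` defaults, and the simplification of ranking single co-occurring strings first and then pairing them are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log


def cooccurrence_distribution(collection, required, vocabulary, smoothing=1e-3):
    """Smoothed distribution over `vocabulary` of the strings that co-occur
    with every string in `required`. The additive smoothing is an assumption."""
    counts = Counter()
    for labels in collection:
        if required <= labels:
            counts.update(labels - required)
    total = sum(counts[w] for w in vocabulary) + smoothing * len(vocabulary)
    return {w: (counts[w] + smoothing) / total for w in vocabulary}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[w] * log(p[w] / q[w]) for w in p)


def disambiguating_pairs(collection, query, top_n=10, threshold=0.1):
    """Return (a, b, score) tuples in decreasing order of score, keeping only
    pairs whose score exceeds `threshold`."""
    query = frozenset(query)
    matching = [labels for labels in collection if query <= labels]
    if not matching:
        return []

    # Frequencies of strings that co-occur with the query set.
    cooccurring = Counter()
    for labels in matching:
        cooccurring.update(labels - query)
    candidates = [w for w, _ in cooccurring.most_common(top_n)]

    scored = []
    for a, b in combinations(candidates, 2):
        vocabulary = set(cooccurring) - {a, b}
        if not vocabulary:
            continue
        p = cooccurrence_distribution(matching, query | {a}, vocabulary)
        q = cooccurrence_distribution(matching, query | {b}, vocabulary)
        # Assumed weighting: how often each candidate appears with the query.
        weight = (cooccurring[a] / len(matching)) * (cooccurring[b] / len(matching))
        score = weight * (kl_divergence(p, q) + kl_divergence(q, p))
        if score > threshold:
            scored.append((a, b, score))

    scored.sort(key=lambda item: item[2], reverse=True)
    return scored
```

Each element of `collection` is expected to be a set of label strings attached to one content object. For a query such as `{"jaguar"}`, a pair like ("car", "cat") would be expected to score highly when the two strings are both frequent with the query yet appear alongside largely different neighboring labels.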
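Publication number US2009327877(A1) Publication date 2009.12.31
Application number US20080164039 Filing date 2008.06.28
Applicant YAHOO! INC. Inventors SLANEY MALCOLM; WEINBERGER KILIAN QUIRIN; VAN ZWOL ROELOF
Classification G06F17/27 Main classification G06F17/27
Agency Agent
Main claim
Address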