<p>A method of extracting named entities from a large-scale document corpus is presented. The method includes identifying named entities in the corpus; forming a set of seed entities, either manually or automatically from existing resources; constructing a named entity graph to discover the same-type probability between any given pair of named entities; expanding the set of seed entities; and performing confidence propagation of the seed entities on the named entity graph.</p>
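<p>The graph construction and confidence propagation steps described above can be illustrated with a minimal Python sketch. The abstract does not specify the update rule, so everything here is an assumption made for illustration: the function names (<code>build_graph</code>, <code>propagate_confidence</code>, <code>expand_seeds</code>), the max-product update (a node takes the best value of neighbor confidence times edge probability), the clamping of seeds at confidence 1.0, and the acceptance threshold. It is not the claimed method itself.</p>

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical representation: node -> {neighbor: same-type probability}.
Graph = Dict[str, Dict[str, float]]


def build_graph(pair_probs: Dict[Tuple[str, str], float]) -> Graph:
    """Build an undirected named entity graph from pairwise same-type probabilities."""
    graph: Graph = {}
    for (a, b), prob in pair_probs.items():
        graph.setdefault(a, {})[b] = prob
        graph.setdefault(b, {})[a] = prob
    return graph


def propagate_confidence(
    graph: Graph,
    seeds: Iterable[str],
    max_iters: int = 50,
    tol: float = 1e-9,
) -> Dict[str, float]:
    """Spread seed confidence over the graph with an assumed max-product rule."""
    seed_set: Set[str] = set(seeds)
    conf = {node: 0.0 for node in graph}
    for seed in seed_set:
        conf[seed] = 1.0  # assumption: seeds are fully trusted and stay clamped

    for _ in range(max_iters):
        new_conf = dict(conf)
        largest_change = 0.0
        for node, neighbors in graph.items():
            if node in seed_set:
                continue
            # Best single path of evidence reaching this node from a neighbor.
            best = max(
                (conf.get(nbr, 0.0) * prob for nbr, prob in neighbors.items()),
                default=0.0,
            )
            largest_change = max(largest_change, abs(best - conf[node]))
            new_conf[node] = best
        conf = new_conf
        if largest_change < tol:
            break  # confidences have stopped changing
    return conf


def expand_seeds(conf: Dict[str, float], seeds: Iterable[str], threshold: float) -> Set[str]:
    """Add every entity whose propagated confidence reaches the threshold."""
    return set(seeds) | {node for node, c in conf.items() if c >= threshold}


# Toy data, invented purely for illustration.
pair_probs = {
    ("Paris", "London"): 0.9,
    ("London", "Berlin"): 0.8,
    ("Berlin", "Apple"): 0.1,
}
graph = build_graph(pair_probs)
confidence = propagate_confidence(graph, {"Paris"})
expanded = expand_seeds(confidence, {"Paris"}, threshold=0.5)
```

<p>In this toy graph, confidence decays along each edge by the same-type probability, so entities weakly linked to the seed fall below the threshold and are left out of the expanded set. Other update rules, such as a weighted average over neighbors, would fit the same description equally well.</p>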
Application publication number
WO2011134141(A1)
Application publication date
2011.11.03
Application number
WO2010CN72235
Filing date
2010.04.27
Applicant
HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.; YAO, CONG-LEI; XIONG, YUHONG; ZHENG, LI-WEI