Title of invention: Method for extracting name entities and jargon terms using a suffix tree data structure
Abstract: A method for recognizing and extracting entity names and jargon terms. An embodiment of the present invention uses a suffix tree data structure to determine frequently occurring phrases. In one embodiment, the text to be analyzed is preprocessed. The text is then separated into clauses, and a suffix tree is created for the text. The suffix tree is used to determine repetitious segments. Unrecognized text fragments that occur with high frequency have a comparatively high probability of being a name entity or jargon term. The set of repetitious segments is then filtered to obtain a set of possible entity names and jargon terms.
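The pipeline described in the abstract (preprocess, split into clauses, index suffixes, collect repetitious segments, filter) can be illustrated with a minimal sketch. This is not the patented implementation. For brevity it assumes a word-level, uncompressed suffix trie with a depth cap in place of a true suffix tree, and every name in it (`split_clauses`, `build_suffix_trie`, `repeated_segments`, `filter_candidates`, `max_len`, `min_count`, `min_len`, `known_words`) is hypothetical. The filter shown, which drops segments made up entirely of already-known words, is likewise only one assumed way to keep "unrecognized" fragments.

```python
import re


class Node:
    """One trie node: child edges keyed by word, plus an occurrence count."""
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}
        self.count = 0


def split_clauses(text):
    """Preprocess (lowercase) and split the text into clauses at punctuation."""
    parts = re.split(r"[.,;:!?]", text.lower())
    return [part.split() for part in parts if part.strip()]


def build_suffix_trie(clauses, max_len=5):
    """Insert every suffix of every clause, truncated to max_len words.

    After insertion, node.count is the number of times the word sequence
    spelled by the path from the root occurs in the text.
    """
    root = Node()
    for tokens in clauses:
        for start in range(len(tokens)):
            node = root
            for tok in tokens[start:start + max_len]:
                node = node.children.setdefault(tok, Node())
                node.count += 1
    return root


def repeated_segments(root, min_count=2, min_len=2):
    """Return {phrase: count} for phrases seen at least min_count times."""
    results = {}
    stack = [(root, ())]
    while stack:
        node, path = stack.pop()
        for tok, child in node.children.items():
            if child.count < min_count:
                continue  # no extension of a rare phrase can be frequent
            phrase = path + (tok,)
            if len(phrase) >= min_len:
                results[" ".join(phrase)] = child.count
            stack.append((child, phrase))
    return results


def filter_candidates(segments, known_words):
    """Keep segments containing at least one word outside known_words."""
    return {
        phrase: count
        for phrase, count in segments.items()
        if not all(word in known_words for word in phrase.split())
    }


text = "Acme Widget ships today. We like Acme Widget, and Acme Widget is new."
trie = build_suffix_trie(split_clauses(text))
segments = repeated_segments(trie)
candidates = filter_candidates(
    segments, known_words={"we", "like", "and", "is", "new", "ships", "today"}
)
print(candidates)
```

A production system would more likely use a compressed suffix tree built in linear time, which avoids the depth cap, and a filter based on a real lexicon rather than a hand-written word set.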
Publication number: US2003083862(A1)  Publication date: 2003.05.01
Application number: US20010017408  Filing date: 2001.10.30
Applicant: HU ZENGJIAN; ZHANG YIMIN; ZHOU JOE F.  Inventor: HU ZENGJIAN; ZHANG YIMIN; ZHOU JOE F.
Classification: G06F17/27;(IPC1-7):G06F17/27  Main classification: G06F17/27
Agency:  Agent:
Principal claim:
Address: