Title: Acronym extraction system and method of identifying acronyms and extracting corresponding expansions from text
Abstract: An acronym expansion system of the present invention receives electronic documents and extracts acronyms and their corresponding expansions. A part-of-speech tagger decomposes text into string tokens or words and tags each with its part of speech, while an acronym identifier determines whether a word is a potential acronym based on various conditions. An expansion identifier retrieves lists of words preceding and following a potential acronym to search for the expansion. The resulting word lists are examined sequentially to identify and retrieve an expansion for the potential acronym. An expansion extractor receives the potential acronym and a processed word list to retrieve the expansion of the potential acronym from that list. The extractor may utilize information from prior search iterations, and verifies an extracted expansion against a set of rules to remove spurious expansions.
Publication No.: US7236923(B1) Publication date: 2007.06.26
Application No.: US20020212914 Filing date: 2002.08.07
Applicant: ITT MANUFACTURING ENTERPRISES, INC. Inventor: GUPTA KALYAN M
Classification: G06F17/30;G06F17/27;G06F17/28 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address:
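
The pipeline outlined in the abstract (identify a potential acronym, gather the words preceding and following it, examine those lists in turn, then verify the candidate expansion) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the regex tokenizer stands in for the part-of-speech tagger, the all-uppercase test stands in for the "various conditions", matching word initials stands in for the extractor, the window of twice the acronym length is arbitrary, and all function names are hypothetical.

```python
import re


def tokenize(text):
    # Simplification: a plain regex tokenizer; the abstract describes a
    # part-of-speech tagger, which this sketch does not implement.
    return re.findall(r"[A-Za-z0-9]+", text)


def is_potential_acronym(word):
    # Assumed condition: 2 to 10 letters, all uppercase.
    return 2 <= len(word) <= 10 and word.isalpha() and word.isupper()


def extract_expansion(acronym, words):
    # Look for a contiguous run of words whose initials spell the acronym.
    # Scans from the end of the list backward.
    n = len(acronym)
    for start in range(len(words) - n, -1, -1):
        candidate = words[start:start + n]
        initials = "".join(w[0].upper() for w in candidate)
        if initials == acronym:
            return candidate
    return None


def is_valid_expansion(acronym, expansion):
    # Assumed verification rule: reject single-word expansions and
    # expansions that contain the acronym itself.
    return len(expansion) > 1 and acronym not in expansion


def find_acronyms(text):
    tokens = tokenize(text)
    results = {}
    for i, tok in enumerate(tokens):
        if not is_potential_acronym(tok) or tok in results:
            continue
        span = 2 * len(tok)  # assumed window size
        preceding = tokens[max(0, i - span):i]
        following = tokens[i + 1:i + 1 + span]
        # Examine the word lists sequentially: preceding first, then following.
        for words in (preceding, following):
            expansion = extract_expansion(tok, words)
            if expansion and is_valid_expansion(tok, expansion):
                results[tok] = " ".join(expansion)
                break
    return results


if __name__ == "__main__":
    sample = ("The part of speech (POS) tagger labels each token. "
              "NLP, or natural language processing, is broad.")
    print(find_acronyms(sample))
```

On the sample sentence, the sketch pairs `POS` with "part of speech" (found in the preceding words) and `NLP` with "natural language processing" (found in the following words). A fuller implementation along the lines of the abstract would also reuse information from earlier search iterations, which this sketch omits.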