Title of invention: Method for extracting knowledge from online documentation and creating a glossary, index, help database or the like
Abstract: A method involving computer-mediated linguistic analysis of online technical documentation to extract and catalog from the documentation knowledge essential to, for example, creating an online help database useful in providing online assistance to users in performing a task. The method comprises stripping markup tags from the documentation and linguistically analyzing and annotating the text, including the steps of morphologically and lexically analyzing the text, disambiguating between possible parts of speech for each word, and syntactically analyzing and labeling each word. The method further comprises the steps of combining the linguistically analyzed, annotated, and labeled text and the previously stripped markup information into a merged file, and mining the merged file for domain knowledge, including the steps of identifying and creating a list of technical terminology, mining the merged file for manifestations of domain primitives and maintaining a list of those manifestations in an observations file, analyzing the discourse context of each sentence or phrase in the merged file, analyzing the frequency of manifestations of domain primitives in the observations file to determine those that are important, expanding the list of key terms by searching for terms sanctioned by a domain primitive deemed important in the previous step, and searching the merged file for larger relations by searching for particular lexico-syntactic patterns involving key terms and manifestations of domain primitives previously identified. The method further comprises the steps of structuring the knowledge thus mined and building a domain catalog.
Publication number: US6212494(B1)  Publication date: 2001.04.03
Application number: US19980119166  Filing date: 1998.07.20
Applicant: APPLE COMPUTER, INC.  Inventor: BOGURAEV BRANIMIR K.
Classification: G06F17/27;(IPC1-7):G06F17/21;G06F17/30  Main classification: G06F17/27
Agency:  Agent:
Main claim:
Address:
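
The first and last stages named in the abstract (stripping markup while keeping it for a later merge, then using frequency to decide which terms matter) can be illustrated with a minimal sketch. This is not the patented method: the regular-expression tag stripper, the bigram counting that stands in for the linguistic analysis, the stopword list, and the names `strip_markup` and `count_terms` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: markup is simple angle-bracket tags, as in HTML-like documentation.
TAG_RE = re.compile(r"<[^>]+>")
# Assumption: a tiny illustrative stopword list, not a real linguistic resource.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "in"}


def strip_markup(doc):
    """Return (plain_text, tags).

    Each tag is replaced by a single space in plain_text, and recorded as an
    (offset, tag) pair so the markup could be merged back in a later step.
    """
    parts, tags = [], []
    pos = 0       # read position in the original document
    out_len = 0   # length of the plain text produced so far
    for match in TAG_RE.finditer(doc):
        chunk = doc[pos:match.start()]
        parts.append(chunk)
        out_len += len(chunk)
        tags.append((out_len, match.group()))
        parts.append(" ")
        out_len += 1
        pos = match.end()
    parts.append(doc[pos:])
    return "".join(parts), tags


def count_terms(text, min_count=2):
    """Count two-word candidate terms that occur at least min_count times.

    Bigrams containing a stopword are discarded. A crude stand-in for
    terminology identification followed by frequency analysis.
    """
    words = re.findall(r"[a-z]+", text.lower())
    bigrams = Counter(
        pair for pair in zip(words, words[1:])
        if not (pair[0] in STOPWORDS or pair[1] in STOPWORDS)
    )
    return {" ".join(pair): n for pair, n in bigrams.items() if n >= min_count}


if __name__ == "__main__":
    sample = (
        "<h1>Startup disk</h1>"
        "<p>Select the startup disk. The startup disk holds system software.</p>"
    )
    text, tags = strip_markup(sample)
    print(tags)
    print(count_terms(text))
```

Keeping the stripped tags with their offsets, rather than discarding them, mirrors the abstract's later step of combining the analyzed text with the previously stripped markup information; a fuller pipeline would replace the bigram counter with part-of-speech tagging and syntactic labeling.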