Title of invention: Processing text with domain-specific spreading activation methods
Abstract: A method for performing natural language processing of free text using domain-specific spreading activation. Embodiments of the present invention ontologize free text using an algorithm based on neurocognitive theory by simulating human recognition, semantic, and episodic memory approaches. Embodiments of the invention may be used to process clinical text for assignment of billing codes, to analyze suicide notes or legal discovery materials, and to process other collections of text. Further, embodiments of the invention may be used to more effectively search large databases, such as a database containing a large number of medical publications.
Publication number: US9477655(B2)  Publication date: 2016.10.25
Application number: US201414553562  Filing date: 2014.11.25
Applicant: Children's Hospital Medical Center  Inventors: Pestian John P.; Matykiewicz Pawel; Duch Wlodzislaw; Glauser Tracy A.; Kowatch Robert A.; Grupp-Phelan Jacqueline M.; Sorter Michael
Classification: G06F17/27; G06F17/28; G06Q50/22  Main classification: G06F17/27
Agency: Mintz Levin Cohn Ferris Glovsky and Popeo, P.C.  Agent: Mintz Levin Cohn Ferris Glovsky and Popeo, P.C.; Liberto, Esq. Muriel
Main claim: 1. One or more non-transitory electronic memory devices including computer instructions for performing a method comprising: using a central processing unit (CPU) connected via a network to a remote storage device, to process text documents stored in said memory device; identifying, using the CPU, one or more of a plurality of groups of characters of a text in the text document as corresponding to at least one of a plurality of known words; using the CPU for creating a list of the identified known words; querying a first database contained in a second memory device to obtain a set of one or more semantic concepts associated with each of the identified known words, the first database comprising associations between the plurality of known words and a plurality of semantic concepts; annotating, using the CPU, the list of identified known words with the first set of semantic concepts associated with each identified known word; querying a second database contained in a third memory device to obtain a set of one or more episodic concepts associated with the set of semantic concepts, the second database comprising associations between a plurality of episodic concepts and at least one of the plurality of known words and the plurality of semantic concepts, the plurality of episodic concepts being separate from the plurality of semantic concepts; creating, using the CPU, a semantic network having a plurality of nodes corresponding to the first and second sets of semantic and episodic concepts and weighted links between the first and second sets of semantic and episodic concepts; utilizing, using the CPU, spreading activation algorithms to refine the weighted links in the semantic network; and selecting, using the CPU, at least one of the concepts from the sets of semantic and episodic concepts based upon an associated weight for the at least one node derived from the step of utilizing spreading activation.
Address: Cincinnati, OH, US
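
The sequence of steps recited in the main claim (identify known words, look up semantic concepts, look up episodic concepts, build a weighted network, spread activation, select the highest-weighted concepts) can be illustrated with a small Python sketch. This is not the patented implementation. Everything in it is an assumption made for illustration: the in-memory dictionaries `KNOWN_WORDS`, `SEMANTIC_DB`, and `EPISODIC_DB` stand in for the claimed word list and the first and second databases, the concept names are invented, all links start with weight 1.0, and the `decay` and `steps` parameters are arbitrary. The sketch also simplifies the claim by computing node activations only; it does not update the link weights themselves.

```python
import re
from collections import defaultdict

# Hypothetical toy data standing in for the claimed word list and databases.
KNOWN_WORDS = {"cough", "fever", "wheezing"}
SEMANTIC_DB = {  # known word -> semantic concepts
    "cough": ["symptom:cough"],
    "fever": ["symptom:fever"],
    "wheezing": ["symptom:wheezing"],
}
EPISODIC_DB = {  # semantic concept -> episodic concepts
    "symptom:cough": ["dx:pneumonia", "dx:asthma"],
    "symptom:fever": ["dx:pneumonia"],
    "symptom:wheezing": ["dx:asthma"],
}


def identify_known_words(text):
    """Return the tokens of the text that match known words, in order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in KNOWN_WORDS]


def build_network(words):
    """Build an undirected network linking semantic and episodic concepts."""
    links = defaultdict(dict)
    seeds = set()
    for word in words:
        for sem in SEMANTIC_DB.get(word, []):
            seeds.add(sem)
            for epi in EPISODIC_DB.get(sem, []):
                links[sem][epi] = 1.0  # assumed uniform initial weight
                links[epi][sem] = 1.0
    return seeds, links


def spread_activation(seeds, links, decay=0.5, steps=2):
    """Propagate activation from seed nodes along normalized link weights."""
    activation = defaultdict(float)
    for seed in seeds:
        activation[seed] = 1.0
    for _ in range(steps):
        incoming = defaultdict(float)
        for node, neighbours in links.items():
            total = sum(neighbours.values())
            for neighbour, weight in neighbours.items():
                incoming[neighbour] += decay * activation[node] * weight / total
        for node, value in incoming.items():
            activation[node] += value
    return dict(activation)


def process(text, top_n=1):
    """Run the full sketch and return the top_n most activated concepts."""
    words = identify_known_words(text)
    seeds, links = build_network(words)
    activation = spread_activation(seeds, links)
    ranked = sorted(links, key=lambda c: activation.get(c, 0.0), reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    print(process("Patient presents with cough and fever."))
```

In this toy data, `dx:pneumonia` receives activation from both matched symptoms while `dx:asthma` receives it from only one, so the former ranks first. A real system would replace the dictionaries with database queries and would need to handle negation, word sense, and learned link weights, none of which the sketch attempts.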