Title of Invention METHODS FOR GENERATING NATURAL LANGUAGE PROCESSING SYSTEMS
Abstract Methods are presented for generating a natural language model. The method may comprise: ingesting training data representative of documents to be analyzed by the natural language model; generating a hierarchical data structure comprising at least two topical nodes into which the training data is to be subdivided by the natural language model; selecting a plurality of documents among the training data to be annotated; generating an annotation prompt for each document, configured to elicit an annotation indicating which node among the at least two topical nodes said document is to be classified into; receiving the annotation in response to the annotation prompt; and generating the natural language model using an adaptive machine learning process configured to determine patterns among the annotations for how the documents in the training data are to be subdivided according to the at least two topical nodes of the hierarchical data structure.
Publication No. US2016162456(A1) Publication Date 2016.06.09
Application No. US201514964517 Filing Date 2015.12.09
Applicant Munro Robert J.;Erle Schuyler D.;Walker Christopher;Luger Sarah K.;Brenier Jason;King Gary C.;Tepper Paul A.;Mechanic Ross;Gilchrist-Scott Andrew;Long Jessica D.;Robinson James B.;Callahan Brendan D.;Casbon Michelle;Sarin Ujjwal;Nair Aneesh;Basavaraj Veena;Saxena Tripti;Nunez Edgar;Hinrichs Martha G.;Most Haley;Schnoebelen Tyler J. Inventor Munro Robert J.;Erle Schuyler D.;Walker Christopher;Luger Sarah K.;Brenier Jason;King Gary C.;Tepper Paul A.;Mechanic Ross;Gilchrist-Scott Andrew;Long Jessica D.;Robinson James B.;Callahan Brendan D.;Casbon Michelle;Sarin Ujjwal;Nair Aneesh;Basavaraj Veena;Saxena Tripti;Nunez Edgar;Hinrichs Martha G.;Most Haley;Schnoebelen Tyler J.
Classification G06F17/24;G06F17/22;G06F17/28 Main Classification G06F17/24
Agency Agent
Main Claim 1. A method for generating a natural language model, the method comprising: ingesting, by a natural language platform comprising at least one processor coupled to at least one memory, training data representative of documents to be analyzed by the natural language model; generating, by the natural language platform and based on topical content within the training data, a hierarchical data structure, the hierarchical data structure comprising at least two topical nodes, wherein at least two topical nodes represent partitions organized by two or more topical themes among the topical content of the training data into which the training data is to be subdivided; selecting among the training data, by the natural language platform, a plurality of documents to be annotated; generating, by the natural language platform, at least one annotation prompt for each document among the plurality of documents to be annotated, said annotation prompt configured to elicit an annotation about said document indicating which node among the at least two topical nodes of the hierarchical data structure said document is to be classified into; causing display of, by the natural language platform, at least one annotation prompt for each document among the plurality of documents to be annotated; receiving, by the natural language platform, for each document among the plurality of documents to be annotated, the annotation in response to the displayed annotation prompt; and generating, by the natural language platform, the natural language model using an adaptive machine learning process configured to determine, among the received annotations, patterns for how the documents in the training data are to be subdivided according to the at least two topical nodes of the hierarchical data structure.
Address San Francisco CA US
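
The steps recited in the main claim (ingest, build a topical hierarchy, select documents, prompt for annotations, receive annotations, train a model) can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the sample documents, the two topic names, the simulated annotator, every function name, and the use of a small naive Bayes classifier as a stand-in for the "adaptive machine learning process" are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Step 1 (ingest): hypothetical training documents.
docs = [
    "my invoice shows a double charge",
    "refund the charge on my card",
    "package arrived late and damaged",
    "tracking says the package is lost",
    "why was my card charged twice",
    "where is my package",
]

# Step 2 (hierarchy): a root with two topical nodes (assumed topic names).
HIERARCHY = {"root": ["billing", "shipping"]}


def tokenize(text):
    return text.lower().split()


def select_for_annotation(documents, k):
    """Step 3: pick documents to annotate (simplistic: the first k)."""
    return documents[:k]


def make_prompt(doc, nodes):
    """Step 4: build a prompt asking which topical node fits the document."""
    return f"Which topic best fits this document? Options: {', '.join(nodes)}\n---\n{doc}"


def train(annotated):
    """Step 6: fit a multinomial naive Bayes model from (document, node) pairs."""
    word_counts = defaultdict(Counter)
    node_counts = Counter()
    vocab = set()
    for doc, node in annotated:
        node_counts[node] += 1
        for word in tokenize(doc):
            word_counts[node][word] += 1
            vocab.add(word)
    return word_counts, node_counts, vocab


def classify(model, doc):
    """Assign a document to the node with the highest smoothed log-probability."""
    word_counts, node_counts, vocab = model
    total = sum(node_counts.values())
    best, best_score = None, -math.inf
    for node in sorted(node_counts):
        score = math.log(node_counts[node] / total)
        denom = sum(word_counts[node].values()) + len(vocab)
        for word in tokenize(doc):
            if word in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[node][word] + 1) / denom)
        if score > best_score:
            best, best_score = node, score
    return best


nodes = HIERARCHY["root"]
selected = select_for_annotation(docs, 4)
prompts = [make_prompt(d, nodes) for d in selected]

# Step 5 (receive annotations): simulated answers from a hypothetical annotator.
simulated_answers = ["billing", "billing", "shipping", "shipping"]
annotated = list(zip(selected, simulated_answers))

model = train(annotated)
for doc in docs[4:]:
    print(doc, "->", classify(model, doc))
```

In a real system the selection step would more plausibly favor documents the current model is least certain about, and the prompts would be displayed to human annotators rather than answered from a fixed list; both are simplified here to keep the sketch self-contained.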