Title of invention: Systems and methods for sentence comparison and sentence-based search
Abstract: Systems and methods for performing logical semantic sentence comparisons and sentence-based searches. Training is performed by running an NLP pipeline on unstructured text comprising sentences and creating sentence matrix representations of the unstructured text; storing the matrix representations in an indexed database; combining the stored matrix representations; running an SVD on the combined matrix; storing the SVD components in the indexed database; and iterating again, through the output of the NLP pipeline, over the sentences of the unstructured training text to form a low-dimensional matrix conversion for each sentence, based on the calculated SVD components, for storage in the database. Subsequent query statements are run through the same process and converted into low-dimensional matrix representations using the SVD components from training; the low-dimensional query matrix is compared to the stored low-dimensional matrices to determine the closest relevant documents, which are returned to the user.
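The query step described in the abstract can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function names `project` and `closest` are hypothetical, the projection U_k^T M V_k is one assumed way of applying stored SVD components to a sentence matrix, and the Frobenius distance is an assumed comparison measure (the abstract says only that the matrices are "compared").

```python
import numpy as np

def project(sentence_matrix, u_k, v_k):
    """Assumed projection: reduce an n x n sentence matrix to k x k
    using the leading k left/right singular vectors from training."""
    return u_k.T @ sentence_matrix @ v_k

def closest(query_low, stored_low, top_n=1):
    """Rank stored low-dimensional matrices by Frobenius distance to the
    query (an assumed metric) and return the indices of the nearest ones."""
    dists = [np.linalg.norm(query_low - m) for m in stored_low]
    order = np.argsort(dists)[:top_n]
    return [int(i) for i in order]

# Toy data (hypothetical): k = 2 components over a 3-term space.
u_k = np.eye(3)[:, :2]
v_k = np.eye(3)[:, :2]
stored_low = [np.array([[1.0, 0.0], [0.0, 0.0]]),
              np.array([[0.0, 1.0], [0.0, 0.0]])]

query_matrix = np.zeros((3, 3))
query_matrix[0, 1] = 1.0
query_low = project(query_matrix, u_k, v_k)
nearest = closest(query_low, stored_low)
```

In this toy case the query projects onto the second stored matrix exactly, so `nearest` holds its index. A real system would map the returned indices back to the source documents in the indexed database.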
Publication number: US9176949(B2) Publication date: 2015.11.03
Application number: US201213543626 Filing date: 2012.07.06
Applicant: ALTAMIRA TECHNOLOGIES CORPORATION Inventors: Bullock Bennett Charles; Law Daniel A.; Hurtado Arthur D.
Classification: G06F17/27 Main classification: G06F17/27
Agency: The Marbury Law Group, PLLC Agent: The Marbury Law Group, PLLC
Main claim: 1. A computing device for creating a searchable database using logical semantic structure of sentences, comprising: a memory; a datastore; and a processor coupled to the memory, wherein the processor is configured with processor-executable instructions to perform operations comprising: receiving unstructured training text; running a natural language processor (NLP) pipeline on the unstructured training text, the unstructured training text comprising sentences; creating sentence matrix representations of the unstructured training text based on output of running the NLP pipeline on the unstructured training text, each of the sentence matrix representations corresponding to a semantic structure of an individual sentence; storing the sentence matrix representations in an indexed datastore; combining the stored sentence matrix representations in a sum to form a training matrix; performing a Singular Value Decomposition (SVD) computation on the training matrix to calculate SVD components; storing the calculated SVD components in the indexed datastore; applying the calculated SVD components to each of the sentence matrix representations to form a low-dimensional matrix representation for each of the sentences of the unstructured training text; and storing the low-dimensional matrix representations in the indexed datastore.
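The training operations recited in the claim can be sketched roughly as below. Several pieces are assumptions made purely for illustration: `sentence_matrix` is a hypothetical stand-in for the NLP pipeline output (it marks which word immediately precedes which, whereas the claim only requires a matrix corresponding to a sentence's semantic structure), the toy vocabulary is invented, a plain dictionary stands in for the indexed datastore, and the rank-k projection U_k^T M V_k is one assumed reading of "applying the calculated SVD components".

```python
import numpy as np

# Toy vocabulary (hypothetical); a real system would derive it from the corpus.
VOCAB = ["dog", "bites", "man", "cat", "sees"]
INDEX = {word: i for i, word in enumerate(VOCAB)}

def sentence_matrix(tokens):
    """Hypothetical stand-in for the NLP pipeline output:
    m[i, j] counts how often word i immediately precedes word j."""
    m = np.zeros((len(VOCAB), len(VOCAB)))
    for a, b in zip(tokens, tokens[1:]):
        m[INDEX[a], INDEX[b]] += 1.0
    return m

def train(sentences, k):
    """Build per-sentence matrices, sum them into a training matrix,
    run an SVD, and project every sentence matrix down to k x k."""
    matrices = [sentence_matrix(s) for s in sentences]
    training_matrix = np.sum(matrices, axis=0)   # combine in a sum
    u, s, vt = np.linalg.svd(training_matrix)    # SVD components
    u_k, v_k = u[:, :k], vt[:k, :].T             # keep k leading vectors
    low_dim = [u_k.T @ m @ v_k for m in matrices]
    # A dict stands in for the indexed datastore (assumption).
    return {"matrices": matrices, "u_k": u_k, "v_k": v_k, "low_dim": low_dim}

store = train(
    [["dog", "bites", "man"], ["man", "bites", "dog"], ["cat", "sees", "dog"]],
    k=2,
)
```

Each entry of `store["low_dim"]` is a 2 x 2 matrix here, compared with the 5 x 5 sentence matrices it was derived from; the choice of k trades storage and comparison cost against how much sentence structure is retained.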
Address: McLean VA US