Title of invention: Generalized data mining and analytics apparatuses, methods and systems
Abstract: The GENERALIZED DATA MINING AND ANALYTICS APPARATUSES, METHODS AND SYSTEMS (“GDMA”), in various embodiments, may identify statistical relationships among query terms by analyzing a corpus of electronic documents. Inputs may be automatically generated and/or user provided. In one embodiment, a method includes: accessing a term tensor associated with at least one term in a corpus of documents, wherein the term tensor comprises a plurality of data type vectors corresponding respectively to a plurality of term-correlated data types correlated with the at least one term in the corpus and each data type vector comprising a plurality of binned data type values with corresponding weighted occurrence values derived from the corpus; providing at least one of the plurality of term-correlated data types for selectable display; receiving at least one term-correlated data type selection; and providing data type values associated with the at least one term-correlated data type selection for display.
Publication number: US9183203(B1) Publication date: 2015.11.10
Application number: US201113252559 Filing date: 2011.10.04
Applicant: Quantifind, Inc. Inventors: Tuchman Ari; Galant Yaron; Nachbar Erich; Stockton John; Thiyagarajan Karthik
Classification: G06F17/30 Main classification: G06F17/30
Agency: Irell & Manella LLP Agent: Irell & Manella LLP
Main claim: 1. A data extraction processor-implemented method, comprising: accessing, via a processor, a term tensor associated with at least one term in a corpus of documents, wherein the term tensor comprises a plurality of data type vectors corresponding respectively to a plurality of term-correlated data types contextually correlated with the at least one term in the corpus based at least in part on distances between the at least one term and each of the term-correlated data types in the documents, and each data type vector comprising a plurality of binned data type values, the plurality of binned data type values comprising a discrete representation of term-correlated data type data associated with the each data type vector and each binned value of the plurality of binned data type values having a corresponding weighted occurrence value derived from the corpus; providing, via the processor, at least one of the plurality of term-correlated data types for selectable display; receiving, via the processor, at least one term-correlated data type selection; and providing, via the processor, a subset of the plurality of binned data type values associated with the at least one term-correlated data type selection for display.
Address: Menlo Park CA US
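
Illustrative sketch (not part of the patent record): the term tensor described in the main claim can be pictured as a mapping from each data type to a set of binned values, each carrying a weighted occurrence value. The Python below is a minimal sketch under stated assumptions, not the patented implementation. The token formats (`34yo`, `$120`), the bin widths, the `1 / (1 + distance)` weighting, and all function names are hypothetical choices made for illustration; the record only says that correlation is based at least in part on distances between the term and the data types in the documents.

```python
import re
from collections import defaultdict

# Hypothetical data types: name -> (token pattern, bin width). Assumed for illustration.
DATA_TYPES = {
    "age": (re.compile(r"^(\d+)yo$"), 10),
    "price": (re.compile(r"^\$(\d+)$"), 50),
}


def build_term_tensor(term, documents):
    """Return {data_type: {bin_start: weighted_occurrence}} for one term."""
    tensor = {name: defaultdict(float) for name in DATA_TYPES}
    for doc in documents:
        tokens = doc.lower().split()
        term_positions = [i for i, tok in enumerate(tokens) if tok == term]
        if not term_positions:
            continue  # the term does not occur in this document
        for i, tok in enumerate(tokens):
            for name, (pattern, width) in DATA_TYPES.items():
                match = pattern.match(tok)
                if match is None:
                    continue
                value = int(match.group(1))
                bin_start = (value // width) * width  # discrete bin for the value
                distance = min(abs(i - p) for p in term_positions)
                # Assumed weighting: closer values contribute more.
                tensor[name][bin_start] += 1.0 / (1 + distance)
    return {name: dict(bins) for name, bins in tensor.items()}


def available_data_types(tensor):
    """Data types with at least one bin, i.e. candidates for selectable display."""
    return sorted(name for name, bins in tensor.items() if bins)


def select_bins(tensor, data_type, top_n=3):
    """Subset of binned values for a selected data type, heaviest first."""
    bins = tensor.get(data_type, {})
    return sorted(bins.items(), key=lambda item: item[1], reverse=True)[:top_n]


if __name__ == "__main__":
    corpus = ["camera for $120 bought by 34yo", "new camera $130", "phone $500"]
    tensor = build_term_tensor("camera", corpus)
    for data_type in available_data_types(tensor):
        print(data_type, select_bins(tensor, data_type))
```

In this sketch, `available_data_types` stands in for the "providing for selectable display" step and `select_bins` for the step that returns a subset of binned values after a selection. A real system would likely use richer extraction and weighting than whitespace tokens and a reciprocal distance.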