Abstract |
The GENERALIZED DATA MINING AND ANALYTICS APPARATUSES, METHODS AND SYSTEMS (“GDMA”), in various embodiments, may identify statistical relationships among query terms by analyzing a corpus of electronic documents. Inputs may be generated automatically and/or provided by a user. In one embodiment, a method includes: accessing a term tensor associated with at least one term in a corpus of documents, wherein the term tensor comprises a plurality of data type vectors corresponding respectively to a plurality of term-correlated data types correlated with the at least one term in the corpus, each data type vector comprising a plurality of binned data type values with corresponding weighted occurrence values derived from the corpus; providing at least one of the plurality of term-correlated data types for selectable display; receiving at least one term-correlated data type selection; and providing data type values associated with the at least one term-correlated data type selection for display. |
Principal claim |
1. A data extraction processor-implemented method, comprising:
accessing, via a processor, a term tensor associated with at least one term in a corpus of documents,
wherein the term tensor comprises a plurality of data type vectors corresponding respectively to a plurality of term-correlated data types contextually correlated with the at least one term in the corpus based at least in part on distances between the at least one term and each of the term-correlated data types in the documents, and
each data type vector comprising a plurality of binned data type values, the plurality of binned data type values comprising a discrete representation of term-correlated data type data associated with that data type vector, and each binned value of the plurality of binned data type values having a corresponding weighted occurrence value derived from the corpus;
providing, via the processor, at least one of the plurality of term-correlated data types for selectable display;
receiving, via the processor, at least one term-correlated data type selection; and
providing, via the processor, a subset of the plurality of binned data type values associated with the at least one term-correlated data type selection for display. |
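For illustration only, the structure recited in the claim can be sketched in Python. This is not the patented implementation. The regex-based data type extractors, the bin widths, the inverse-token-distance weighting, and all function names are assumptions made for this sketch. The claim itself states only that correlation is based at least in part on distances and that each binned value carries a weighted occurrence value.

```python
import re
from collections import defaultdict

# Hypothetical data type extractors: one regex per term-correlated data type.
DATA_TYPE_PATTERNS = {
    "year": re.compile(r"^(19|20)\d{2}$"),
    "price_usd": re.compile(r"^\$\d+(\.\d+)?$"),
}
# Assumed bin widths used to discretize raw values.
BIN_WIDTHS = {"year": 10, "price_usd": 100}


def bin_value(data_type, token):
    """Map a raw token to the lower edge of its bin."""
    number = float(token.lstrip("$"))
    width = BIN_WIDTHS[data_type]
    return int(number // width) * width


def build_term_tensor(term, documents):
    """Return {data_type: {bin: weighted occurrence}} for one term.

    Assumption: each co-occurrence contributes 1 / (token distance),
    so values found closer to the term weigh more.
    """
    tensor = defaultdict(lambda: defaultdict(float))
    for doc in documents:
        tokens = [t.strip(".,;:") for t in doc.split()]
        term_positions = [i for i, t in enumerate(tokens) if t.lower() == term]
        for j, token in enumerate(tokens):
            for data_type, pattern in DATA_TYPE_PATTERNS.items():
                if not pattern.match(token):
                    continue
                for i in term_positions:
                    if i == j:
                        continue  # the term itself is not a correlated value
                    weight = 1.0 / abs(i - j)
                    tensor[data_type][bin_value(data_type, token)] += weight
    return {dt: dict(bins) for dt, bins in tensor.items()}


def list_data_types(tensor):
    """Data types offered for selectable display."""
    return sorted(tensor)


def top_bins(tensor, data_type, limit):
    """Subset of binned values for a selected data type, heaviest first."""
    bins = tensor[data_type].items()
    return sorted(bins, key=lambda item: item[1], reverse=True)[:limit]


corpus = [
    "The laptop cost $950 in 2019.",
    "In 2021 a laptop sold for $1200.",
]
tensor = build_term_tensor("laptop", corpus)
choices = list_data_types(tensor)        # ['price_usd', 'year']
selected = top_bins(tensor, "year", 1)   # [(2020, 0.5)]
```

In this sketch, each inner dictionary plays the role of a data type vector, `list_data_types` stands in for the selectable display step, and `top_bins` returns the subset of binned values for the selected data type.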