Title of invention: System and method for use in text analysis of documents and records
Abstract: Methods and systems are provided that enable text in various sections of data records to be separately catalogued, indexed, or vectorized for analysis in a text visualization and mining system. A text processing system receives a plurality of data records, where each data record has one or more associated attribute fields. The attribute fields containing textual information are identified, and the specific textual content of each such field is identified. An index is generated that associates the textual content with the attribute field that contains it. The index is operable for use in text processing. The plurality of data records may be located in a data table, and the textual information may be contained within cells of the data table. In another aspect, a plurality of data records is received, where at least some of the data records contain text terms. In response to selection of a first method, the first method is applied to weight the text terms of the data records in a first manner to aid in distinguishing records from each other. In response to selection of a second method, the second method is applied to weight the text terms in a second manner for the same purpose. A vector is generated to distinguish each of the data records based on the text terms weighted by either the first or the second method.
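The two aspects of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say which weighting methods are meant, so raw term frequency and TF-IDF are assumed here as the "first" and "second" methods, and every name in the code (`text_fields`, `build_field_index`, `vectorize`, `WEIGHTING`) is hypothetical. Records are modelled as dictionaries, one key per attribute field.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def text_fields(records):
    """Identify attribute fields that hold text in at least one record."""
    return sorted({field for rec in records for field, value in rec.items()
                   if isinstance(value, str) and tokenize(value)})


def build_field_index(records):
    """Index mapping (field, term) -> set of record positions.

    Keeping the field in the key associates each term with the
    attribute field that contains it.
    """
    index = defaultdict(set)
    for pos, rec in enumerate(records):
        for field in text_fields(records):
            value = rec.get(field)
            if isinstance(value, str):
                for term in tokenize(value):
                    index[(field, term)].add(pos)
    return index


# Two selectable weighting methods (assumed for illustration only).
def weight_tf(count, doc_freq, n_records):
    """Raw term frequency."""
    return float(count)


def weight_tfidf(count, doc_freq, n_records):
    """Term frequency scaled by inverse document frequency."""
    return count * math.log(n_records / doc_freq)


WEIGHTING = {"tf": weight_tf, "tfidf": weight_tfidf}


def vectorize(records, method):
    """Return (keys, vectors): one weighted vector per record."""
    index = build_field_index(records)
    keys = sorted(index)
    weigh = WEIGHTING[method]
    vectors = []
    for rec in records:
        counts = Counter()
        for field, value in rec.items():
            if isinstance(value, str):
                for term in tokenize(value):
                    counts[(field, term)] += 1
        vectors.append([weigh(counts[key], len(index[key]), len(records))
                        for key in keys])
    return keys, vectors


# Sample data table: each dict is a row, each key an attribute field.
records = [
    {"title": "text mining", "body": "mining text records", "year": 2001},
    {"title": "data tables", "body": "text in table cells", "year": 2002},
]
keys, tf_vectors = vectorize(records, "tf")
keys, tfidf_vectors = vectorize(records, "tfidf")
```

In this sketch the numeric `year` field is skipped because it holds no text, and the term "text" is indexed separately under `title` and `body`. Under the assumed TF-IDF method, a term that appears in the same field of every record receives weight zero, so it contributes nothing to distinguishing the records.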
Publication number: AU1130202(A)  Publication date: 2002.04.08
Application number: AU20020011302  Filing date: 2001.10.01
Applicant: BATTELLE MEMORIAL INSTITUTE  Inventors: VERNON L. CROW; RANDALL E. SCARBERRY; AUGUSTIN J. CALAPRISTI; NANCY E. MILLER; GRANT C. NAKAMURA; JEFFREY D. SAFFER
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address: