Abstract |
A system for processing information contained in a collection of text-based information sources employs associative and linguistic expansion of input words: associative expansion is performed first, followed by simultaneous linguistic expansion in accordance with related morphological and phonetic rules. The system automatically generates and updates a linguistic knowledge base for each language to be processed by analyzing a large body of text in that language. The system also automatically indexes the collection of text-based information sources to be searched. A method is provided for expanding a word or term in a supported language using a two-dimensional (2D) expansion matrix, which provides great flexibility, high accuracy, and low-noise output. The 2D expansion matrix includes an associative dimension, which uses thesauri, databases of saved queries, and other associated information sources in which words are related to other words by meaning and relations, and a linguistic dimension, which uses recognition grammars in which words are related to other words by combined rules for morphological and phonetic variation. |
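The two-dimensional expansion described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the system's actual implementation: the toy thesaurus, the suffix list, the single phonetic substitution, and the function names `associative_expand`, `linguistic_expand`, and `expand_2d` are all hypothetical. It shows only the ordering stated in the abstract: associative expansion first, then morphological and phonetic rules applied together to each associated term.

```python
from typing import Dict, List

# Hypothetical toy thesaurus standing in for the associative dimension.
THESAURUS: Dict[str, List[str]] = {
    "car": ["automobile", "vehicle"],
}

# Hypothetical suffix rules standing in for morphological variation.
SUFFIXES: List[str] = ["", "s"]

# Hypothetical leading-letter substitutions standing in for phonetic variation.
PHONETIC_SUBS = [("c", "k")]


def associative_expand(word: str) -> List[str]:
    """First dimension: the word itself followed by its associated terms."""
    return [word] + THESAURUS.get(word, [])


def linguistic_expand(term: str) -> List[str]:
    """Second dimension: combine phonetic and morphological rules for one term."""
    stems = [term]
    for old, new in PHONETIC_SUBS:
        if term.startswith(old):
            stems.append(new + term[len(old):])
    variants: List[str] = []
    for stem in stems:
        for suffix in SUFFIXES:
            candidate = stem + suffix
            if candidate not in variants:  # keep order, drop duplicates
                variants.append(candidate)
    return variants


def expand_2d(word: str) -> Dict[str, List[str]]:
    """Build the matrix: one row per associated term, one column per variant."""
    return {term: linguistic_expand(term) for term in associative_expand(word)}


matrix = expand_2d("car")
for term, variants in matrix.items():
    print(term, "->", variants)
```

Each key of `matrix` is a row from the associative dimension, and each list holds that row's linguistic variants. A real system would draw the rows from thesauri and saved queries and the columns from recognition grammars rather than from hard-coded tables.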