Title of invention: Methods and apparatus for information indexing and retrieval as well as query expansion using morpho-syntactic analysis
Abstract: An index generator and query expander for use in information retrieval in a corpus. A corpus is provided as an input to an inflectional analyzer, which produces a lemmatized corpus having base forms and associated inflections for each word in the original corpus. The lemmatized corpus is provided as an input to a disambiguator, which performs part-of-speech tagging and morpho-syntactic disambiguation to produce a disambiguated corpus. The disambiguated corpus is provided as an input to a derivational generator, which produces an expanded corpus having all possible valid derivatives of each word of the disambiguated corpus. The disambiguated corpus is provided as an input to a transformational analyzer, using a grammar and a metagrammar for analyzing syntactic and morphosyntactic variations to conflate and generate variants, producing an index to the corpus having a minimum of variants. Alternatively, a query expander is provided utilizing similar techniques.
Publication number: US6101492(A) Publication date: 2000.08.08
Application number: US19980109506 Filing date: 1998.07.02
Applicant: LUCENT TECHNOLOGIES INC. Inventors: JACQUEMIN, CHRISTIAN; TZOUKERMANN, EVELYNE
Classification: G06F17/30;(IPC1-7):G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address:
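
The stages named in the abstract (inflectional analysis, disambiguation, derivational generation, indexing) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicon, the derivation table, the single "noun after determiner" disambiguation rule, and all function names are hypothetical and are not taken from the patent. The grammar and metagrammar used by the transformational analyzer are not modeled; conflation is reduced here to grouping token positions under a shared lemma.

```python
from collections import defaultdict

# Hypothetical toy lexicon: surface form -> candidate (lemma, part of speech) analyses.
TOY_LEXICON = {
    "the": [("the", "DET")],
    "queries": [("query", "VERB"), ("query", "NOUN")],
    "expanded": [("expand", "VERB")],
}

# Hypothetical derivation table: lemma -> derivationally related forms.
TOY_DERIVATIONS = {"expand": ["expansion", "expander"]}


def inflectional_analysis(tokens):
    """Attach every candidate (lemma, pos) analysis to each token."""
    return [(t, TOY_LEXICON.get(t.lower(), [(t.lower(), "UNK")])) for t in tokens]


def disambiguate(analyzed):
    """Keep one analysis per token.

    Illustrative rule only: directly after a determiner prefer a NOUN
    reading; otherwise keep the first candidate.
    """
    result = []
    prev_pos = None
    for token, candidates in analyzed:
        chosen = candidates[0]
        if prev_pos == "DET":
            for candidate in candidates:
                if candidate[1] == "NOUN":
                    chosen = candidate
                    break
        result.append((token, chosen[0], chosen[1]))
        prev_pos = chosen[1]
    return result


def derivational_generation(disambiguated):
    """Map each lemma to itself plus its known derivatives, sorted."""
    expanded = {}
    for _, lemma, _ in disambiguated:
        expanded[lemma] = sorted({lemma, *TOY_DERIVATIONS.get(lemma, [])})
    return expanded


def build_index(disambiguated):
    """Conflate surface forms by lemma: lemma -> list of token positions."""
    index = defaultdict(list)
    for position, (_, lemma, _) in enumerate(disambiguated):
        index[lemma].append(position)
    return dict(index)


tokens = "the queries expanded".split()
tagged = disambiguate(inflectional_analysis(tokens))
index = build_index(tagged)
variants = derivational_generation(tagged)
```

In this sketch the same `variants` table could serve query expansion: a query term is reduced to its lemma and then replaced by the list of forms stored under that lemma.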