Title of invention STATISTICAL STEMMING
Abstract Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating suffix-rewriting rules. A method includes obtaining a plurality of canonical suffix-rewriting rules each associated with one or more words, generating a suffix tree from the words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of final suffix-rewriting rules from the nodes in the minimum colored subset. Another method includes receiving applicable and non-applicable words for a suffix-rewriting rule, generating a suffix tree from the applicable words and the non-applicable words, selecting a minimum colored subset of the nodes and leaves in the suffix tree, and generating a plurality of suffix-rewriting rules, wherein each rule corresponds to a node in the minimum colored subset with a valid status.
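The first method in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on assumptions the abstract does not confirm: the "suffix tree" is modeled as a trie over reversed words, each word is "colored" by its canonical rule, and the "minimum colored subset" is read as the topmost nodes whose subtrees contain words of a single color. All names (`Node`, `build_suffix_trie`, `select_rules`, `final_rules`, `apply_rules`) and the sample words are hypothetical.

```python
from typing import Dict, Iterator, Optional, Set, Tuple

Rule = Tuple[str, str]  # (old_suffix, new_suffix), e.g. ("ies", "y")


class Node:
    """Trie node; the path from the root spells a word suffix in reverse."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.colors: Set[Rule] = set()          # rules of all words below this node
        self.word_rule: Optional[Rule] = None   # rule of a word ending exactly here


def build_suffix_trie(examples: Dict[str, Rule]) -> Node:
    """Insert every word back to front, coloring each visited node."""
    root = Node()
    for word, rule in examples.items():
        node = root
        node.colors.add(rule)
        for ch in reversed(word):
            node = node.children.setdefault(ch, Node())
            node.colors.add(rule)
        node.word_rule = rule
    return root


def select_rules(node: Node, context: str = "") -> Iterator[Tuple[str, Rule]]:
    """Yield (suffix context, rule) for the topmost single-color nodes.

    Assumption: this is one plausible reading of "minimum colored subset".
    """
    if len(node.colors) == 1:
        yield context, next(iter(node.colors))
        return
    if node.word_rule is not None:
        # A whole word sits at a mixed-color node: give it its own rule.
        yield context, node.word_rule
    for ch, child in node.children.items():
        yield from select_rules(child, ch + context)


def final_rules(examples: Dict[str, Rule]) -> Dict[str, str]:
    """Map each selected suffix context to its rewritten form."""
    out: Dict[str, str] = {}
    for context, (old, new) in select_rules(build_suffix_trie(examples)):
        if len(context) < len(old):
            context = old  # never match less than the canonical suffix
        out[context] = context[: len(context) - len(old)] + new
    return out


def apply_rules(word: str, rules: Dict[str, str]) -> str:
    """Rewrite using the longest matching suffix; leave the word as-is if none."""
    for start in range(len(word) + 1):
        suffix = word[start:]
        if suffix in rules:
            return word[:start] + rules[suffix]
    return word


# Hypothetical training data: word -> canonical suffix-rewriting rule.
EXAMPLES: Dict[str, Rule] = {
    "cats": ("s", ""),
    "dogs": ("s", ""),
    "ponies": ("ies", "y"),
    "flies": ("ies", "y"),
    "glasses": ("es", ""),
}
RULES = final_rules(EXAMPLES)
```

With this sample data the canonical rule "s" → "" is split into narrower contexts ("ts", "gs") because the bare suffix "s" is shared by words that follow other rules. The second method in the abstract could plausibly reuse the same selection with two colors, applicable and non-applicable, keeping only the nodes colored applicable.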
Publication number US2013173250(A1) Publication date 2013.07.04
Application number US201213710055 Filing date 2012.12.10
Applicant CHEREPANOV EVGENY A.;GRUSHETSKYY OLEKSANDR;ORLOV DMITRY N.;GOOGLE INC. Inventor CHEREPANOV EVGENY A.;GRUSHETSKYY OLEKSANDR;ORLOV DMITRY N.
Classification G06F17/28 Main classification G06F17/28
Agency Agent
Principal claim
Address