AUTOMATIC GENERATION OF N-GRAMS AND CONCEPT RELATIONS FROM LINGUISTIC INPUT DATA
Abstract
A method of automatically generating a lemma dictionary from a web resource may include extracting a plurality of tokens from text-based documents within the web resource, and generating a plurality of N-grams from the plurality of tokens. The method may additionally include receiving one or more filter definitions that identify valid N-grams, and filtering the plurality of N-grams using the one or more filter definitions to generate a lemma dictionary. The method may further include generating an ontology that comprises the lemma dictionary.
Application Publication Number
WO2016077016(A1)
Application Publication Date
2016.05.19
Application Number
WO2015US55490
Application Date
2015.10.14
Applicant
ORACLE INTERNATIONAL CORPORATION
Inventors
NAUZE, FABRICE;KISSIG, CHRISTIAN;ZARAFIN, MADALINA;VILLADA-MOIRON, MARIA BEGONA;GENET, ROOS
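
The steps summarized in the abstract (token extraction, N-gram generation, filtering against filter definitions, and wrapping the result in an ontology) can be illustrated with a minimal Python sketch. This is not the applicant's implementation. Every name below (extract_tokens, generate_ngrams, no_stopword_edges, build_lemma_dictionary, build_ontology, STOPWORDS) is hypothetical, filter definitions are assumed to be simple predicates over an N-gram, and lowercase surface forms stand in for true lemmas because the abstract does not say how lemmatization is performed.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple

# Assumption: an N-gram is a tuple of tokens, and a filter definition is
# modeled as a predicate that returns True for a valid N-gram.
NGram = Tuple[str, ...]
FilterDef = Callable[[NGram], bool]

# Illustrative stopword list only; not taken from the source.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "is", "in"}


def extract_tokens(document: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", document.lower())


def generate_ngrams(tokens: List[str], max_n: int = 3) -> List[NGram]:
    """Return every contiguous N-gram of length 1..max_n."""
    ngrams: List[NGram] = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            ngrams.append(tuple(tokens[i:i + n]))
    return ngrams


def no_stopword_edges(ngram: NGram) -> bool:
    """Hypothetical filter: reject N-grams that start or end with a stopword."""
    return ngram[0] not in STOPWORDS and ngram[-1] not in STOPWORDS


def build_lemma_dictionary(
    documents: Iterable[str],
    filters: Iterable[FilterDef],
    max_n: int = 3,
) -> Counter:
    """Count the N-grams that pass every filter definition."""
    filters = list(filters)
    counts: Counter = Counter()
    for document in documents:
        for ngram in generate_ngrams(extract_tokens(document), max_n):
            if all(f(ngram) for f in filters):
                counts[" ".join(ngram)] += 1
    return counts


def build_ontology(lemma_dictionary: Counter) -> Dict[str, List[str]]:
    """Wrap the lemma dictionary in a trivial ontology structure (assumed shape)."""
    return {"lemmas": sorted(lemma_dictionary)}


if __name__ == "__main__":
    docs = ["Reset the password of the account.", "Password reset is easy."]
    lemmas = build_lemma_dictionary(docs, [no_stopword_edges])
    print(build_ontology(lemmas))
```

Modeling each filter definition as a predicate keeps the filtering step open-ended: part-of-speech patterns or frequency thresholds could be supplied as further predicates without changing build_lemma_dictionary.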