Title: Systems and methods for natural language processing including morphological analysis, lemmatizing, spell checking and grammar checking
Abstract: In some embodiments, a linguistic application exploits a linguistic knowledgebase (LKB) including, among others, lexicon data, inflection form data, and syntax data for a natural language such as English or Romanian. The application employs a set of modules including a word retriever, a form generator, and a syntax checker, which are interconnected to perform a number of higher-level text-processing operations such as synthetic and analytic annotation, lemmatizing, spell checking, and grammar checking.
Publication number: US8762130(B1); Publication date: 2014.06.24
Application number: US200912486546; Filing date: 2009.06.17
Applicant: Softwin SRL Romania; Inventors: Diaconescu Stefan Stelian; Dumitrascu Ionut Mihai; Ingineru Cristi Iulian; Bulibasa Oana-Adriana; Rizea Monica Mihaela; Paun Bianca-Daniela
Classification: G06F17/27; G06F17/28; G06F17/21; Main classification: G06F17/27
Agency: Law Office of Andrei D Popovici, PC; Agent: Law Office of Andrei D Popovici, PC
Independent claim: 1. A system comprising at least one computer configured to form:
  a linguistic knowledgebase (LKB) for a natural language, the LKB comprising a set of computer-readable lexicon declarations, a set of computer-readable inflected form declarations, and a set of computer-readable syntax rule declarations;
  a computer-implemented word retriever connected to the LKB and configured to:
    receive a first word,
    perform a lookup of an inflected form declaration of the first word in the LKB,
    in response to performing the lookup of the inflected form declaration, perform a lookup of a lexicon declaration of the first word in the LKB,
    determine a first word interpretation of the first word according to the lexicon declaration and the inflected form declaration, the first word interpretation comprising a lemma of the first word and an inflection indicator of the first word;
  a computer-implemented form generator connected to the word retriever and configured to:
    receive a second word not necessarily distinct from the first word,
    produce a first set of words, each word of the first set of words having a predetermined spelling similarity to the second word, and
    for each word of the first set of words, receive from the word retriever a second word interpretation of said each word of the first set of words;
  a computer-implemented synthetic annotator connected to the word retriever and configured to:
    receive a word sequence,
    for each word of the word sequence, receive from the word retriever a third word interpretation of said each word of the word sequence, and
    determine a synthetic annotation of the word sequence, the synthetic annotation comprising the third word interpretation of said each word of the word sequence; and
  a computer-implemented syntax checker connected to the synthetic annotator and configured to:
    receive the synthetic annotation from the synthetic annotator,
    perform a lookup of a syntax rule declaration of the word sequence in the LKB according to the synthetic annotation, and
    perform a syntactic analysis of the word sequence according to the syntax rule declaration, to determine a synthetic dependency tree of the word sequence.
Address: Bucharest, RO
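
Illustrative sketch (not part of the patent record): the word retriever, form generator, and synthetic annotator recited in claim 1 can be pictured with a toy example. Everything below is an assumption made for illustration: the dictionaries INFLECTED_FORMS and LEXICON stand in for the LKB declarations, the names Interpretation, retrieve, similar_spellings, generate_forms, and annotate are hypothetical, and "predetermined spelling similarity" is approximated here as a single character edit, which the claim does not specify. The syntax checker is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy LKB; the patent does not prescribe these structures.
# Inflected form declarations: surface form -> (lemma, inflection indicator).
INFLECTED_FORMS: Dict[str, Tuple[str, str]] = {
    "cat": ("cat", "singular"),
    "cats": ("cat", "plural"),
    "run": ("run", "base"),
    "ran": ("run", "past"),
}
# Lexicon declarations: lemma -> part of speech.
LEXICON: Dict[str, str] = {"cat": "noun", "run": "verb"}

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


@dataclass(frozen=True)
class Interpretation:
    lemma: str
    part_of_speech: str
    inflection: str


def retrieve(word: str) -> Optional[Interpretation]:
    """Word retriever: inflected-form lookup first, then lexicon lookup."""
    entry = INFLECTED_FORMS.get(word.lower())
    if entry is None:
        return None
    lemma, inflection = entry
    part_of_speech = LEXICON.get(lemma)
    if part_of_speech is None:
        return None
    return Interpretation(lemma, part_of_speech, inflection)


def similar_spellings(word: str) -> Set[str]:
    """Strings one edit away (assumed similarity measure, not the patent's)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutes = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    return (deletes | transposes | substitutes | inserts) - {word}


def generate_forms(word: str) -> Dict[str, Interpretation]:
    """Form generator: keep similar spellings the word retriever can interpret."""
    found: Dict[str, Interpretation] = {}
    for candidate in similar_spellings(word):
        interpretation = retrieve(candidate)
        if interpretation is not None:
            found[candidate] = interpretation
    return found


def annotate(words: List[str]) -> List[Tuple[str, Optional[Interpretation]]]:
    """Synthetic annotator: pair each word with its interpretation, if any."""
    return [(w, retrieve(w)) for w in words]
```

Under these assumptions, calling generate_forms on a misspelling such as "cqt" yields the single known form "cat" with its interpretation, which is the kind of candidate list a spell checker built on these modules could offer.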