Title of invention SYSTEM AND METHOD FOR TEXT CATEGORIZATION BASED ON ONTOLOGIES
Abstract A system for text categorization based on ontologies, comprising data collector software modules; a categorizer software module; and a database comprising an indexed database of documents and their categorizations, and further comprising a plurality of ontologies, each ontology comprising a plurality of hierarchical taxonomies and each hierarchical taxonomy comprising a plurality of taxons. The data collector software modules receive documents to be classified and submit them to the categorizer software module, and the categorizer performs the following steps to categorize each document: splitting the document into sentences; selecting words or phrases that are present in ontologies stored in the database server; selecting a plurality of subtrees from the ontologies based on the presence of specific subcategories in the document; determining a weight for each subcategory; pruning subcategories having a weight below a threshold; and, for each of the plurality of modified subtrees, computing a conditionality coefficient.
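The categorization steps listed in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the toy `ONTOLOGY` dictionary, the `split_sentences` and `categorize` helpers, the weight formula (a subcategory's share of all ontology matches), and the "coefficient" value (the sum of the surviving weights in a subtree). The abstract does not define how weights or the conditionality coefficient are actually computed.

```python
import re
from collections import Counter

# Hypothetical toy ontology: taxonomy -> subcategory -> trigger terms.
ONTOLOGY = {
    "sports": {
        "football": {"goal", "striker"},
        "tennis": {"serve", "racket"},
    },
    "finance": {
        "banking": {"loan", "deposit"},
        "markets": {"stock", "bond"},
    },
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def categorize(text, threshold=0.2):
    """Return, per taxonomy, the surviving subcategories and a stand-in score."""
    # Split into sentences and count words that appear in the ontology.
    hits = Counter()
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for taxonomy, subcats in ONTOLOGY.items():
            for subcat, terms in subcats.items():
                hits[(taxonomy, subcat)] += len(words & terms)

    total = sum(hits.values())
    if total == 0:
        return {}

    result = {}
    for taxonomy, subcats in ONTOLOGY.items():
        # Select the subtree: only subcategories actually present in the text.
        # Assumed weight: the subcategory's share of all ontology matches.
        weights = {
            s: hits[(taxonomy, s)] / total
            for s in subcats
            if hits[(taxonomy, s)]
        }
        # Prune subcategories whose weight falls below the threshold.
        kept = {s: w for s, w in weights.items() if w >= threshold}
        if kept:
            # Assumed stand-in for the conditionality coefficient.
            result[taxonomy] = {
                "subcategories": kept,
                "coefficient": sum(kept.values()),
            }
    return result
```

Calling `categorize(text, threshold=0.3)` on a short text returns only the taxonomies whose subcategories survive pruning; a text with no ontology terms returns an empty dictionary.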
Publication number WO2014176600(A1) Publication date 2014.10.30
Application number WO2014US35735 Filing date 2014.04.28
Applicant SOUTH EASTERN PUBLISHERS INC.;CHASHCHIN, KIRILL;ANSHUKOV, SERGEY;BARDIN, VALERY;KORDONSKY, SIMON Inventor CHASHCHIN, KIRILL;ANSHUKOV, SERGEY;BARDIN, VALERY;KORDONSKY, SIMON
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address