Title of invention: SYSTEM AND METHOD FOR TEXT CATEGORIZATION BASED ON ONTOLOGIES
Abstract: A system for text categorization based on ontologies, comprising data collector software modules; a categorizer software module; and a database comprising an indexed database of documents and their categorizations, and further comprising a plurality of ontologies, each ontology comprising a plurality of hierarchical taxonomies and each hierarchical taxonomy comprising a plurality of taxons. The data collector software modules receive documents to be classified and submit them to the categorizer software module, and the categorizer performs the following steps to categorize each document: splitting the document into sentences; selecting words or phrases that are present in ontologies stored in the database server; selecting a plurality of subtrees from the ontologies based on the presence of specific subcategories in the document; determining a weight for each subcategory; pruning subcategories having a weight below a threshold; and, for each of the plurality of modified subtrees, computing a conditionality coefficient.
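The categorizer steps listed in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the patented method: the toy `ONTOLOGY` dictionary, the function names, the regex-based sentence splitter, and the default threshold are hypothetical. The abstract does not define how a weight or a conditionality coefficient is calculated, so the sketch substitutes simple placeholders: a subcategory's weight is its share of all matched term occurrences, and the coefficient is the fraction of a subtree's subcategories that survive pruning.

```python
import re
from collections import Counter

# Hypothetical toy ontology: taxonomy root -> subcategory -> terms.
# The real system stores ontologies in a database; this dict is a stand-in.
ONTOLOGY = {
    "sports": {
        "football": ["goal", "penalty kick"],
        "tennis": ["serve", "racket"],
    },
    "finance": {
        "banking": ["loan", "interest rate"],
        "markets": ["stock", "bond"],
    },
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def match_terms(sentences, ontology):
    """Count occurrences of ontology words and phrases per (root, subcategory)."""
    counts = Counter()
    for sentence in sentences:
        lowered = sentence.lower()
        for root, subcategories in ontology.items():
            for subcategory, terms in subcategories.items():
                for term in terms:
                    pattern = r"\b" + re.escape(term) + r"\b"
                    hits = len(re.findall(pattern, lowered))
                    if hits:
                        counts[(root, subcategory)] += hits
    return counts


def categorize(text, ontology=ONTOLOGY, threshold=0.2):
    """Return the pruned subtrees with placeholder weights and coefficients."""
    counts = match_terms(split_sentences(text), ontology)
    total = sum(counts.values())
    if total == 0:
        return {}

    # Assumed weight: share of all matched term occurrences.
    weights = {key: n / total for key, n in counts.items()}

    # Prune subcategories whose weight falls below the threshold.
    subtrees = {}
    for (root, subcategory), weight in weights.items():
        if weight >= threshold:
            subtrees.setdefault(root, {})[subcategory] = weight

    # Assumed coefficient: fraction of the subtree's subcategories kept.
    return {
        root: {
            "subcategories": kept,
            "coefficient": len(kept) / len(ontology[root]),
        }
        for root, kept in subtrees.items()
    }


if __name__ == "__main__":
    sample = (
        "The striker scored a goal. Another goal came from a penalty kick. "
        "The bank raised the interest rate."
    )
    print(categorize(sample, threshold=0.5))
```

With the sample text and a threshold of 0.5, only the football subcategory survives pruning, so a single modified subtree under the sports root is returned. A production system would replace the placeholder weight and coefficient with the formulas given in the patent's claims and description.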
Publication No.: US2013212111(A1) Publication date: 2013.08.15
Application No.: US201313872022 Filing date: 2013.04.26
Applicant: CHASHCHIN KIRILL; ANSHUKOV SERGEY; BARDIN VALERY; KORDONSKY SIMON Inventor: CHASHCHIN KIRILL; ANSHUKOV SERGEY; BARDIN VALERY; KORDONSKY SIMON
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: