Title of invention: INFORMATION RETRIEVAL SYSTEM
Abstract:
PROBLEM TO BE SOLVED: To address two problems in document sorting by conventional machine learning: morphological analysis takes a long time to process, and sorting precision deteriorates because personal names are frequently detected falsely.
SOLUTION: An information retrieval system comprises: a characteristic token extraction means that associates comparison conditions between a character string and a keyword with a characteristic token, and extracts the characteristic token from a character string in a document; a non-characteristic token extraction means that extracts non-characteristic tokens by dividing any character string from which no characteristic token has been extracted into character units; a learning means that calculates, as a learning frequency associated with a category, the appearance frequency of a first token train composed of a first characteristic token and a first non-characteristic token in a document for learning; and a sorting means that sorts a document to be sorted by calculating, for each category, a sorting probability indicating the similarity between the learning frequency and the appearance frequency of a second token train composed of a second characteristic token and a second non-characteristic token in the document to be sorted.
COPYRIGHT: (C) 2009, JPO&INPIT
Publication number: JP2009098952(A)  Publication date: 2009.05.07
Application number: JP20070270253  Filing date: 2007.10.17
Applicant: MITSUBISHI ELECTRIC CORP  Inventors: KATO MAMORU; KORI MITSUNORI
Classification: G06F17/30; G06F17/21  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address:
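Illustrative sketch: the abstract above describes a four-step pipeline (characteristic token extraction, character-unit fallback, per-category frequency learning, and probability-based sorting). The Python below is a minimal sketch of that flow under stated assumptions, not the patented implementation. The rule table `RULES`, the token names, the class `Sorter`, and the helper `token_trains` are all hypothetical. It assumes a "token train" is a run of `n` consecutive tokens, and it substitutes a naive Bayes score with add-one smoothing for the "sorting probability", which the abstract does not define.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical comparison conditions: (pattern, characteristic token).
RULES = [
    (re.compile(r"invoice", re.IGNORECASE), "<INVOICE>"),
    (re.compile(r"meeting", re.IGNORECASE), "<MEETING>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def tokenize(text):
    """Emit a characteristic token where a rule matches, else one character."""
    tokens, i = [], 0
    while i < len(text):
        for pattern, token in RULES:
            m = pattern.match(text, i)
            if m and m.end() > i:
                tokens.append(token)      # characteristic token
                i = m.end()
                break
        else:
            tokens.append(text[i])        # non-characteristic token (one character)
            i += 1
    return tokens


def token_trains(tokens, n=2):
    """Assumption: a token train is a run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class Sorter:
    def __init__(self, n=2):
        self.n = n
        self.freq = defaultdict(Counter)  # category -> token train -> count
        self.docs = Counter()             # category -> number of learning documents
        self.vocab = set()                # every token train seen while learning

    def learn(self, text, category):
        """Record token-train frequencies of a learning document for a category."""
        trains = token_trains(tokenize(text), self.n)
        self.freq[category].update(trains)
        self.docs[category] += 1
        self.vocab.update(trains)

    def sort(self, text):
        """Return a probability per learned category (naive Bayes stand-in)."""
        trains = token_trains(tokenize(text), self.n)
        total_docs = sum(self.docs.values())
        scores = {}
        for cat, counts in self.freq.items():
            total = sum(counts.values())
            score = math.log(self.docs[cat] / total_docs)
            for t in trains:
                # Add-one smoothing so unseen token trains do not zero the score.
                score += math.log((counts[t] + 1) / (total + len(self.vocab)))
            scores[cat] = score
        # Convert log scores into probabilities that sum to 1.
        peak = max(scores.values())
        weights = {c: math.exp(s - peak) for c, s in scores.items()}
        norm = sum(weights.values())
        return {c: w / norm for c, w in weights.items()}


sorter = Sorter()
sorter.learn("Invoice 42 is due", "billing")
sorter.learn("Meeting at 10", "schedule")
probabilities = sorter.sort("invoice 7 due")
best_category = max(probabilities, key=probabilities.get)
```

Because the tokenizer falls back to single characters instead of running morphological analysis, no dictionary or segmenter is needed; only the rule table has to be maintained. `sort` expects at least one prior call to `learn`.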