Title of invention DOCUMENT CLASSIFICATION PROGRAM, SERVER AND METHOD BASED ON SENTENCE FEATURES AND PHYSICAL FEATURES OF DOCUMENT INFORMATION
Abstract PROBLEM TO BE SOLVED: To provide a document classification program capable of determining more accurately whether Web document information belongs to a specific category (e.g., illegal or harmful content). SOLUTION: Document information is described with sentence information and a markup language. The document classification program causes a computer to function as: document information separation means that separates the object document information to be analyzed into sentence information and markup language information; feature amount generation means that counts the number of times each character string registered in advance appears in the sentence information and in the markup language information, and generates, as a feature amount, a multidimensional vector whose elements are the numbers of appearances of the individual character strings; feature amount determination means that determines whether or not the object feature amount of the object document information falls within a specific range of the learning feature amounts obtained from a large amount of learning document information included in a specific category; and category classification means that classifies object document information determined to be true by the feature amount determination means as information included in the specific category. COPYRIGHT: (C)2012,JPO&INPIT
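The four means named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the "specific range" is defined, so the sketch assumes a per-dimension minimum and maximum learned from the training documents. The function names (`separate`, `feature_vector`, `learn_range`, `in_category`) and the registered strings used in the usage note are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class _Separator(HTMLParser):
    """Collects sentence text and markup (tags with attributes) separately."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.markup_parts = []

    def handle_starttag(self, tag, attrs):
        # Rebuild a normalized opening tag, e.g. <a href="y">.
        pieces = [tag] + [k if v is None else f'{k}="{v}"' for k, v in attrs]
        self.markup_parts.append("<" + " ".join(pieces) + ">")

    def handle_endtag(self, tag):
        self.markup_parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.text_parts.append(data)


def separate(document):
    """Document information separation: returns (sentence_text, markup_text)."""
    parser = _Separator()
    parser.feed(document)
    parser.close()
    return " ".join(parser.text_parts), " ".join(parser.markup_parts)


def feature_vector(document, text_terms, markup_terms):
    """Feature amount generation: one appearance count per registered string."""
    text, markup = separate(document)
    return ([text.count(term) for term in text_terms]
            + [markup.count(term) for term in markup_terms])


def learn_range(training_documents, text_terms, markup_terms):
    """ASSUMPTION: the 'specific range' is a per-dimension [min, max] box."""
    vectors = [feature_vector(d, text_terms, markup_terms)
               for d in training_documents]
    lows = [min(column) for column in zip(*vectors)]
    highs = [max(column) for column in zip(*vectors)]
    return lows, highs


def in_category(document, text_terms, markup_terms, learned_range):
    """Feature amount determination and category classification."""
    lows, highs = learned_range
    vector = feature_vector(document, text_terms, markup_terms)
    return all(lo <= x <= hi for lo, x, hi in zip(lows, vector, highs))
```

For example, with hypothetical registered strings `["casino", "bet"]` for the sentence information and `["<iframe", "<a"]` for the markup information, `learn_range` is called once on documents known to belong to the category, and `in_category` is then called for each document to be analyzed. Plain substring counting is a simplification: `"<a"` also matches tags such as `<abbr>`, so a real system would need stricter matching.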
Publication number JP2012043285(A) Publication date 2012.03.01
Application number JP20100185321 Filing date 2010.08.20
Applicant KDDI CORP Inventors IKEDA KAZUFUMI;YANAGIHARA TADASHI;MATSUMOTO KAZUNORI;ONO TOSHIHIRO;TAKISHIMA YASUHIRO
Classification G06F17/30;G06N3/00 Main classification G06F17/30
Agency Agent
Principal claim
Address