Title of invention: COMMODITY INFORMATION EXTRACTION RULE GENERATING METHOD, APPARATUS AND PROGRAM
Abstract: PROBLEM TO BE SOLVED: To provide a commodity information extraction rule generating method and apparatus capable of generating a commodity information extraction rule at a small maintenance cost without requiring any manual operation for creating rules or learning data. SOLUTION: An inter-page common points identifying section 25 identifies inter-page common points, that is, points at which an identical character string appears at the identical position in at least a predetermined rate of the commodity detailed information pages included in a commodity detailed information page group. For each commodity attribute, a commodity attribute value extraction point determination section 27 compares the inter-page common points identified in and around a commodity attribute value extraction candidate with a commodity attribute characteristic of that commodity attribute to determine whether the candidate is the extraction point of the attribute value of the commodity attribute. A commodity information extraction rule generating section 28 generates, for each commodity attribute, a pair consisting of the determined attribute value extraction point and the commodity attribute name as a commodity information extraction rule.
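The three sections named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how a position within a page is represented, what a commodity attribute characteristic consists of, or how wide the "periphery" of a candidate is. The sketch therefore assumes, hypothetically, that each page is an ordered mapping from a path-like location string to its text, that an attribute characteristic is a list of label keywords, and that the periphery is the single location immediately before the candidate. The names `find_common_points`, `generate_rules`, `attribute_keywords` and `min_rate` are illustrative only.

```python
from collections import Counter, defaultdict


def find_common_points(pages, min_rate=0.8):
    """Return {location: text} for locations where the same text appears
    in at least `min_rate` of the pages (assumed role of section 25)."""
    counts = defaultdict(Counter)
    for page in pages:
        for location, text in page.items():
            counts[location][text] += 1
    common = {}
    for location, counter in counts.items():
        text, n = counter.most_common(1)[0]
        if n / len(pages) >= min_rate:
            common[location] = text
    return common


def generate_rules(pages, attribute_keywords, min_rate=0.8):
    """Pair each attribute name with an extraction point
    (assumed roles of sections 27 and 28).

    A location is treated as a candidate when it is not itself a common
    point, i.e. its text varies between pages. It is accepted for an
    attribute when the common point just before it contains one of that
    attribute's keywords.
    """
    common = find_common_points(pages, min_rate)
    locations = list(pages[0])  # assumption: all pages share one ordered layout
    rules = {}
    for prev, cand in zip(locations, locations[1:]):
        if cand in common or prev not in common:
            continue
        for name, keywords in attribute_keywords.items():
            if name not in rules and any(k in common[prev] for k in keywords):
                rules[name] = cand
    return rules


# Hypothetical page group: labels repeat across pages, values differ.
pages = [
    {"/table/tr[1]/th": "Price", "/table/tr[1]/td": "1,980 yen",
     "/table/tr[2]/th": "Maker", "/table/tr[2]/td": "Acme"},
    {"/table/tr[1]/th": "Price", "/table/tr[1]/td": "540 yen",
     "/table/tr[2]/th": "Maker", "/table/tr[2]/td": "Globex"},
    {"/table/tr[1]/th": "Price", "/table/tr[1]/td": "12,800 yen",
     "/table/tr[2]/th": "Maker", "/table/tr[2]/td": "Initech"},
]
attribute_keywords = {"price": ["Price"], "manufacturer": ["Maker", "Brand"]}

rules = generate_rules(pages, attribute_keywords)
print(rules)
```

In this toy page group the label cells are identified as inter-page common points because their text is identical on every page, and the cell following each label becomes the extraction point paired with the matching attribute name. No per-site rules or training data are written by hand, which is the maintenance saving the abstract describes.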
Publication number: JP2013143021(A) Publication date: 2013.07.22
Application number: JP20120003163 Filing date: 2012.01.11
Applicant: NIPPON TELEGR & TELEPH CORP <NTT> Inventors: IIMURA YUKAKO; SHIOBARA TOSHIKO; TANAKA AKIMICHI; UCHIYAMA MASASHI
Classification: G06F17/30; G06Q30/06 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: