Title of Invention |
COMMODITY INFORMATION EXTRACTION RULE GENERATING METHOD, APPARATUS AND PROGRAM |
Abstract |
PROBLEM TO BE SOLVED: To provide a commodity information extraction rule generating method and apparatus capable of generating commodity information extraction rules at low maintenance cost, without requiring manual creation of rules or learning data. SOLUTION: An inter-page common point identifying section 25 identifies inter-page common points, that is, points at which an identical character string appears at the same position in at least a predetermined proportion of the commodity detailed information pages included in a commodity detailed information page group. For each commodity attribute, a commodity attribute value extraction point determination section 27 compares the inter-page common points identified in and around a commodity attribute value extraction candidate with the attribute characteristics of that commodity attribute, and determines whether the candidate is the extraction point for the attribute value of that commodity attribute. A commodity information extraction rule generating section 28 generates, for each commodity attribute, a pair consisting of the determined attribute value extraction point and the commodity attribute name as a commodity information extraction rule. |
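The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each page is already reduced to a list of text nodes in document order, so that a "point" is simply a list index, and it assumes the attribute characteristic is a set of label keywords matched against the common point immediately before a candidate. The names `find_common_points`, `generate_rules`, `attribute_keywords`, and `min_ratio` are hypothetical.

```python
from collections import Counter


def find_common_points(pages, min_ratio=0.8):
    """Return {position: text} for positions where one identical string
    appears in at least min_ratio of the pages.

    Assumption: min_ratio > 0.5, so at most one string can qualify
    per position.
    """
    counts = Counter()
    for page in pages:
        for position, text in enumerate(page):
            counts[(position, text)] += 1
    threshold = min_ratio * len(pages)
    return {pos: text for (pos, text), n in counts.items() if n >= threshold}


def generate_rules(pages, attribute_keywords, min_ratio=0.8):
    """Return a list of (extraction position, attribute name) pairs.

    A position is treated as an extraction candidate when it is not a
    common point. Only the common point directly before the candidate is
    inspected here (an assumption; the abstract refers more broadly to
    the candidate and its periphery).
    """
    common = find_common_points(pages, min_ratio)
    n_positions = max((len(page) for page in pages), default=0)
    rules = []
    for pos in range(1, n_positions):
        if pos in common:
            continue  # fixed template text, not an attribute value
        label = common.get(pos - 1)
        if label is None:
            continue  # no shared label next to this candidate
        for name, keywords in attribute_keywords.items():
            if any(k.lower() in label.lower() for k in keywords):
                rules.append((pos, name))
                break
    return rules
```

For three hypothetical pages of the form `["Price", <value>, "Maker", <value>]`, positions 0 and 2 are common points and positions 1 and 3 become extraction points paired with whichever attribute names list "Price" and "Maker" among their keywords.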
Publication Number |
JP2013143021(A) |
Publication Date |
2013.07.22 |
Application Number |
JP20120003163 |
Filing Date |
2012.01.11 |
Applicant |
NIPPON TELEGR & TELEPH CORP <NTT> |
Inventors |
IIMURA YUKAKO;SHIOBARA TOSHIKO;TANAKA AKIMICHI;UCHIYAMA MASASHI |
Classification |
G06F17/30;G06Q30/06 |
Main Classification |
G06F17/30 |
Agency |
|
Agent |
|
Principal Claim |
|
Address |
|