Title of invention |
INFORMATION EXTRACTION METHOD FROM STRUCTURED DOCUMENT, INFORMATION EXTRACTION PROGRAM, AND STORAGE MEDIUM STORING INFORMATION EXTRACTION PROGRAM |
Abstract |
PROBLEM TO BE SOLVED: To accurately identify the part a user wants in an HTML document that changes daily. SOLUTION: The identifier of a tag is defined as the combination of the name of the tag at the root of a subtree, the names of that tag's format attributes, and the values of those format attributes; this tag identifier also serves as the identifier of the corresponding subtree. When a tag identifier contains multiple format attributes, it is normalized by ordering the format attributes by attribute name. From the list of subtree identifiers present in the document after conversion into a tree structure, the subtree whose identifier is identical to the initially obtained subtree identifier is identified as the specified part. COPYRIGHT: (C)2004, JPO |
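The matching step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation. It assumes well-formed markup that `xml.etree.ElementTree` can parse, and the set `FORMAT_ATTRS` and the function names `tag_identifier` and `find_matching_subtrees` are hypothetical, since the abstract does not say which attributes count as format attributes.

```python
import xml.etree.ElementTree as ET

# Assumption: the abstract does not list the "format attributes";
# this set is purely illustrative.
FORMAT_ATTRS = {"align", "bgcolor", "border", "class", "color", "width"}

def tag_identifier(elem):
    """Build a normalized identifier from the tag name and its format attributes.

    Sorting the (name, value) pairs by attribute name makes the identifier
    independent of the order in which the attributes appear in the markup.
    """
    pairs = sorted((k, v) for k, v in elem.attrib.items() if k in FORMAT_ATTRS)
    return (elem.tag, tuple(pairs))

def find_matching_subtrees(root, target_id):
    """Return every subtree root whose identifier equals target_id."""
    return [e for e in root.iter() if tag_identifier(e) == target_id]

# Yesterday's page: the user selects the first table.
old = ET.fromstring(
    '<html><body>'
    '<table width="90%" bgcolor="#fff"><tr><td>Mon</td></tr></table>'
    '<table border="1"><tr><td>ad</td></tr></table>'
    '</body></html>'
)
# Today's page: content, position, and attribute order have changed.
new = ET.fromstring(
    '<html><body><p>notice</p>'
    '<table bgcolor="#fff" width="90%" id="t7"><tr><td>Tue</td></tr></table>'
    '<table border="1"><tr><td>ad</td></tr></table>'
    '</body></html>'
)

selected = old.find(".//table")          # subtree the user specified initially
target = tag_identifier(selected)        # initially obtained subtree identifier
matches = find_matching_subtrees(new, target)
```

In this sketch the target table is still found in the changed page even though a paragraph was inserted before it, its attributes appear in a different order, and a non-format `id` attribute was added. If several subtrees share the same identifier, all of them are returned; the abstract does not say how such ties are resolved.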
Application publication number |
JP2004038263(A) |
Application publication date |
2004.02.05 |
Application number |
JP20020190621 |
Filing date |
2002.06.28 |
Applicant |
NIPPON TELEGR & TELEPH CORP <NTT> |
Inventors |
MIYAMOTO MASARU;UCHIYAMA TADASHI |
Classification number |
G06F17/21;G06F12/00;G06F17/30 |
Main classification number |
G06F17/21 |
Patent agency |
|
Agent |
|
Principal claim |
|
Address |
|