Title of invention INFORMATION EXTRACTION METHOD FROM STRUCTURED DOCUMENT, INFORMATION EXTRACTION PROGRAM, AND STORAGE MEDIUM STORING INFORMATION EXTRACTION PROGRAM
Abstract PROBLEM TO BE SOLVED: To accurately specify the part desired by a user in an HTML document that is altered daily. SOLUTION: A combination of the tag name at the root of a subtree, the names of that tag's format attributes, and the values of those attributes is used as the identifier of the tag, and the identifier of the tag serves as the identifier of the corresponding subtree. When the identifier of a tag contains a plurality of format attributes, it is normalized by sorting the format attributes by attribute name. From the list of identifiers of the subtrees existing in the document converted into a tree structure, the subtree whose identifier is identical to the initially obtained subtree identifier is specified as the desired part. COPYRIGHT: (C)2004,JPO
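The identifier scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not list which attributes count as format attributes, so the set FORMAT_ATTRS below is an assumption, and the names Node, TreeBuilder, tag_identifier, find_matching and inner_text are hypothetical helpers introduced only for this illustration. The sketch parses HTML into a tree with the standard-library html.parser module, builds a normalized identifier for each subtree root, and returns the subtrees whose identifier equals a previously obtained one.

```python
from html.parser import HTMLParser

# Assumption: the abstract does not enumerate the "format attributes";
# this set is purely illustrative.
FORMAT_ATTRS = {"class", "style", "align", "bgcolor", "width", "height"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """One element of the tree; children holds Node objects and text strings."""
    def __init__(self, tag, attrs):
        self.tag = tag
        self.attrs = attrs
        self.children = []


def tag_identifier(tag, attrs):
    """Tag name plus its format attributes, normalized by sorting on name."""
    fmt = sorted((name, value or "") for name, value in attrs
                 if name in FORMAT_ATTRS)
    return (tag, tuple(fmt))


class TreeBuilder(HTMLParser):
    """Converts an HTML string into a simple tree of Node objects."""
    def __init__(self):
        super().__init__()
        self.root = Node("#document", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def subtrees(node):
    """Yield the root node of every subtree below node, in document order."""
    for child in node.children:
        if isinstance(child, Node):
            yield child
            yield from subtrees(child)


def find_matching(root, target_id):
    """Subtrees whose identifier is identical to the initially obtained one."""
    return [n for n in subtrees(root)
            if tag_identifier(n.tag, n.attrs) == target_id]


def inner_text(node):
    return "".join(c if isinstance(c, str) else inner_text(c)
                   for c in node.children)


# Identifier obtained once, from the part the user pointed at in an old page.
target = tag_identifier("span", [("style", "color:red"), ("class", "price")])

# A later version of the page: content added, attribute order changed.
new_page = ('<p>Ad</p><div class="nav">Menu</div>'
            '<span class="price" style="color:red">120 yen</span>')
for match in find_matching(parse(new_page), target):
    print(inner_text(match))
```

Because the attributes are sorted by name before comparison, the span is still found even though the new page lists class before style. In this sketch a tag with no format attributes gets an identifier consisting of the tag name alone, so several subtrees may match; how such ties are resolved is not stated in the abstract.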
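Publication number JP2004038263(A) Publication date 2004.02.05
Application number JP20020190621 Application date 2002.06.28
Applicant NIPPON TELEGR & TELEPH CORP <NTT> Inventor MIYAMOTO MASARU;UCHIYAMA TADASHI
Classification G06F17/21;G06F12/00;G06F17/30 Main classification G06F17/21
Agency Agent
Principal claim
Address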