Title of invention Extraction of datapoints from markup language documents
Abstract An extraction-rule generation and training system uses information obtained from multiple markup language documents (e.g., web pages) of similar structure to generate an extraction rule for extracting datapoints from markup language documents. Where the structures of two or more documents are not sufficiently similar, the system maintains separate extraction rules for the same datapoint and applies these rules in combination to a given markup language document to extract the datapoint.
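The idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes that an extraction rule is simply the tag path leading to a text node, that training data consists of pages paired with the known datapoint value, and that "applying rules in combination" means trying each stored rule in turn. The names `PathCollector`, `text_nodes`, `train_rules`, and `extract` are hypothetical and chosen only for this illustration.

```python
from html.parser import HTMLParser


class PathCollector(HTMLParser):
    """Record the tag path (e.g. 'html/body/span') of every text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.texts = []   # (path, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(("/".join(self.stack), text))


def text_nodes(markup):
    collector = PathCollector()
    collector.feed(markup)
    collector.close()
    return collector.texts


def train_rules(examples):
    """Assumed training step: one tag-path rule per distinct page structure.

    examples is a list of (markup, known_value) pairs for one datapoint.
    Pages with similar structure yield the same path and share a rule;
    structurally different pages contribute a separate rule.
    """
    rules = []
    for markup, value in examples:
        for path, text in text_nodes(markup):
            if text == value:
                if path not in rules:
                    rules.append(path)
                break
    return rules


def extract(markup, rules):
    """Apply the stored rules in order; return the first match or None."""
    nodes = text_nodes(markup)
    for rule in rules:
        for path, text in nodes:
            if path == rule:
                return text
    return None


# Two pages share a layout; the third uses a different one.
page_a = "<html><body><h1>Widget</h1><span>$10</span></body></html>"
page_b = "<html><body><h1>Gadget</h1><span>$25</span></body></html>"
page_c = "<html><body><div><p>Gizmo</p><b>$7</b></div></body></html>"

rules = train_rules([(page_a, "$10"), (page_b, "$25"), (page_c, "$7")])
new_page = "<html><body><div><p>Doodad</p><b>$99</b></div></body></html>"
price = extract(new_page, rules)
```

In this sketch the two similarly structured pages collapse into a single rule, while the third page adds a second rule for the same datapoint. A production system would need a more robust similarity measure than exact path equality.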
Publication number US7954053(B2) Publication date 2011.05.31
Application number US20100683190 Filing date 2010.01.06
Applicant ALEXA INTERNET Inventor ORELIND GREGER J.;JAENICKE AUGUST A.
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Main claim
Address