Title of invention: SCALABLE WEB DATA EXTRACTION
Abstract: Example embodiments relate to scalable web data extraction. In example embodiments, a joint potential function is defined for data record segments of web data extracted from a web page, where the joint potential function models data record segmentation of the web data and dependencies between pairs of data segments in the data record segments. A principal record segment and a plurality of related record segments are then identified from the data record segments, where each of the plurality of related record segments is associated with the principal record segment. A related attribute is determined for each related record segment. Finally, the joint potential function is applied to the principal record segment and each corresponding related record segment to determine a relationship label that describes a data relationship between the principal record segment and the corresponding related record segment.
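The relationship-labeling step described in the abstract can be illustrated with a minimal sketch. This is not the method claimed in the application: the label set, attribute names, weights, and the functions `pair_potential` and `label_relationships` are all hypothetical, and the toy potential covers only the pairwise dependency between a principal segment and a related segment, not the record segmentation that the joint potential function is said to model as well.

```python
from dataclasses import dataclass
from math import exp

# Hypothetical label set; the source does not enumerate relationship labels.
LABELS = ("author_of", "price_of", "unrelated")


@dataclass(frozen=True)
class Segment:
    text: str
    attribute: str  # assumed attribute names such as "title", "author", "price"


# Hypothetical weights for (principal attribute, related attribute, label).
PAIR_WEIGHTS = {
    ("title", "author", "author_of"): 2.0,
    ("title", "price", "price_of"): 1.5,
}
UNRELATED_WEIGHT = 0.5  # assumed default score for the "unrelated" label


def pair_potential(principal: Segment, related: Segment, label: str) -> float:
    """Toy log-linear potential for one (principal, related, label) triple."""
    if label == "unrelated":
        weight = UNRELATED_WEIGHT
    else:
        key = (principal.attribute, related.attribute, label)
        weight = PAIR_WEIGHTS.get(key, 0.0)
    return exp(weight)


def label_relationships(principal: Segment, related_segments: list) -> dict:
    """For each related segment, pick the label with the highest potential."""
    return {
        r.text: max(LABELS, key=lambda lab: pair_potential(principal, r, lab))
        for r in related_segments
    }


principal = Segment("Some Book Title", "title")
related = [
    Segment("Jane Doe", "author"),
    Segment("$12.99", "price"),
    Segment("Home", "nav"),
]
result = label_relationships(principal, related)
```

Here each related segment is labeled independently by maximizing a pairwise score. A joint model of the kind the abstract describes would presumably score segmentation and labels together, which this sketch does not attempt.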
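Publication number: WO2016090625(A1)  Publication date: 2016.06.16
Application number: WO2014CN93670  Filing date: 2014.12.12
Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.; YU, XIAO-FENG; XIE, JUN-QING  Inventor: YU, XIAO-FENG; XIE, JUN-QING
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Principal claim:
Address: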