Title of invention: System and method for extracting content elements from multiple Internet sources
Abstract: A system for automatically extracting data from at least one electronic document accessible through the Internet or other computer network. The system records a sequence of actions operable to electronically navigate to a target page of the electronic document, the target page including a plurality of elements each having contents and a structural definition, wherein the structural definitions interrelate the plurality of elements to specify a target pattern for a select subset of the plurality of elements. After recording the navigation path and the target pattern, the system automatically accesses the target page according to the recorded sequence. When the target page is accessed, the system automatically identifies, copies and processes selections from the plurality of elements dependent upon the target pattern.
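The record-then-replay flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a hypothetical in-memory site (`PAGES`) in place of real network access, models each recorded action as the text of a link to follow, and models the target pattern as a tuple of nested tag names. The names `PathCollector`, `replay` and `extract` are invented for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical in-memory "site" standing in for pages fetched over a network.
PAGES = {
    "/": '<html><body><a href="/quotes">Quotes</a></body></html>',
    "/quotes": (
        "<html><body><h1>Quotes</h1><table>"
        "<tr><td>ACME</td><td>12.50</td></tr>"
        "<tr><td>INIT</td><td>7.25</td></tr>"
        "</table><p>footer</p></body></html>"
    ),
}


class PathCollector(HTMLParser):
    """Collects (structural path, text) pairs and the links found on a page."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, outermost first
        self.elements = []   # list of (path tuple, text content)
        self.links = {}      # link text -> href
        self._href = None    # href of the most recently opened <a>

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        self.elements.append((tuple(self.stack), text))
        if self.stack and self.stack[-1] == "a" and self._href:
            self.links[text] = self._href


def parse(html):
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    return collector


def replay(actions, start="/"):
    """Follow the recorded link texts in order and return the target URL."""
    url = start
    for link_text in actions:
        url = parse(PAGES[url]).links[link_text]
    return url


def extract(url, pattern):
    """Return the contents of every element whose structural path matches."""
    return [text for path, text in parse(PAGES[url]).elements if path == pattern]


# A "recording": the navigation path plus the structural target pattern.
recorded = {
    "actions": ["Quotes"],
    "pattern": ("html", "body", "table", "tr", "td"),
}

target = replay(recorded["actions"])
cells = extract(target, recorded["pattern"])
print(target, cells)
```

In this sketch the pattern selects only the table cells and skips the heading and footer, because their structural paths differ. A real system would need a more tolerant pattern match, since live pages change their markup over time.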
Publication number: US2011185273(A1) Publication date: 2011.07.28
Application number: US201113005699 Filing date: 2011.01.13
Applicant: DACOSTA GERSON FRANCIS;GHASKADVI VIJAY;BHIDE RAHUL Inventor: DACOSTA GERSON FRANCIS;GHASKADVI VIJAY;BHIDE RAHUL
Classification: G06F17/00;G06F9/44;G06F9/45;G06F9/50;G06Q99/00 Main classification: G06F17/00
Agency: Agent:
Principal claim:
Address: