Title of invention: Adaptive gathering of structured and unstructured data system and method
Abstract: Content is obtained from a webpage accessed via a URI, which URI is obtained from a URI queue. The content is parsed for price and product information according to a parse map, with the resulting parse result being stored. The priority of URIs in the URI queue is adjusted based on analysis of the parse result for changes in price and product attributes and according to other criteria. The parse map may be one associated with the URI or a general-purpose parse map. The parse result may be validated by human- and machine-based systems, including by graphically labeling price and product information in the content for human confirmation or correction.
Publication number: US9466066(B2) Publication date: 2016.10.11
Application number: US201514726707 Filing date: 2015.06.01
Applicant: Indix Corporation Inventors: Kalikivayi Satyanarayana Rao; Muppalla Rajesh; Parthasarathy Sanjay
Classification: G06F17/30; G06Q30/02; G06F17/27 Main classification: G06F17/30
Agency: Schwabe, Williamson & Wyatt, P.C. Agent: Schwabe, Williamson & Wyatt, P.C.
Principal claim: 1. A computer implemented method of managing a prioritized Uniform Resource Identifier (“URI”) queue comprising: by a first computer processor, utilizing a first URI to access a first content in a first communication session with a first webserver associated with a first merchant at a first URI access time and utilizing the first URI to access a second content in a second communication session with the first webserver at a second URI access time subsequent to the first URI access time; by the first or a second computer processor, parsing the first content for first price and product attribute values, saving the result as a first parse result associated with a first product in a first computer memory, and associating the first parse result with a first URI-specific product identifier; by the first or the second computer processor, parsing the second content for second price and product attribute values, saving the result as a second parse result associated with the first product in the first computer memory, and associating the second parse result with a first merchant-specific identifier; by the first or the second computer processor, determining at least one difference between the first price and product attribute values in the first parse result and the second price and product attribute values in the second parse result; and setting a time to next check of the first URI in the prioritized URI queue at least according to the determined difference between the first price and product attribute values in the first parse result and the second price and product attribute values in the second parse result.
Address: Seattle WA US
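
Illustrative note: the last two steps of claim 1 (determining a difference between two parse results for the same URI, then setting the time to next check in the prioritized URI queue according to that difference) might be sketched as follows. This is a minimal sketch and not the patented implementation. The names `diff_parse_results`, `next_interval`, and `UriQueue`, the halve-or-double policy, the interval bounds, and the example URL are all assumptions introduced for illustration; the record itself does not specify them.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical bounds on the re-check interval, in seconds (not from the patent).
MIN_INTERVAL = 3600.0          # 1 hour
MAX_INTERVAL = 7 * 86400.0     # 7 days


def diff_parse_results(first: dict, second: dict) -> dict:
    """Return {attribute: (old, new)} for every attribute whose value differs."""
    keys = set(first) | set(second)
    return {
        k: (first.get(k), second.get(k))
        for k in keys
        if first.get(k) != second.get(k)
    }


def next_interval(current: float, changes: dict) -> float:
    """Assumed policy: halve the interval if anything changed, else double it."""
    proposed = current / 2 if changes else current * 2
    return max(MIN_INTERVAL, min(MAX_INTERVAL, proposed))


@dataclass
class UriQueue:
    """Min-heap of (next_check_time, uri); the earliest check is served first."""
    _heap: list = field(default_factory=list)

    def schedule(self, uri: str, next_check: float) -> None:
        heapq.heappush(self._heap, (next_check, uri))

    def pop_due(self, now: float):
        """Return the most overdue URI, or None if nothing is due yet."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[1]
        return None


# Example: two parse results for the same URI at successive access times.
first_result = {"price": 19.99, "color": "red"}
second_result = {"price": 17.49, "color": "red"}
second_access_time = 1_000_000.0   # hypothetical timestamp, in seconds

changes = diff_parse_results(first_result, second_result)
interval = next_interval(86400.0, changes)   # price changed, so check sooner

queue = UriQueue()
queue.schedule("https://merchant.example/item/1", second_access_time + interval)
```

Keying the heap on the next-check time is one simple way to make the queue "prioritized": a URI whose price or attributes changed recently gets a shorter interval and therefore surfaces earlier. Any other mapping from the determined difference to a next-check time would fit the same structure.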