Title of invention: System and method for recursively traversing the internet and other sources to identify, gather, curate, adjudicate, and qualify business identity and related data
Abstract: A system and a method for data discovery in accordance with an inquiry, in which multiple sources, which may be web sites or other data sources, are examined for data relevant to the inquiry. The process is performed recursively for an indeterminate number of iterations, using data and metadata from multiple sources to corroborate discovered data and metadata from other sources, until no further relevant data or sources are found, or adjudication or exception rules have been met. Discovered data and metadata are curated, adjudicated to assess reliability, synthesized, and clustered into composite records using precedence rules and provenance to determine the most reliable data sources as well as terms of use for each source. Data, metadata, and information about each search are retained and can be used for subsequent purposes, such as subsequent searches or other downstream activities.
Publication number: US9390176(B2) Publication date: 2016.07.12
Application number: US201314047837 Filing date: 2013.10.07
Applicant: THE DUN & BRADSTREET CORPORATION Inventors: Scriffignano Anthony J.; Klein Michael; Hoang Thang Q.; Rampaul Vindra; Davies Robin; Reddi Anjali
Classification: G06F17/30 Main classification: G06F17/30
Agency: Ohlandt, Greeley, Ruggiero & Perle, L.L.P. Agent: Ohlandt, Greeley, Ruggiero & Perle, L.L.P.
Main claim: 1. A computer readable non-transitory storage medium storing instructions of a computer program, which when executed by a computer system, results in performance of steps comprising:
a) providing a plurality of initial search targets based upon said inquiry of a plurality of seed sources;
b) searching a first one of the plurality of initial search targets to obtain data and metadata relevant to the inquiry;
c) repeating step (b) until all said data and metadata of the first one of the plurality of initial search targets have been collected and stored;
d) searching a next one of the plurality of initial search targets to obtain additional data and metadata relevant to the inquiry;
e) repeating step (d) until all said additional data and metadata of the next one of the plurality of initial search targets have been collected and stored;
f) repeating steps (d) and (e) until all of the plurality of initial search targets have been searched and said additional data and metadata have been collected and stored;
g) processing said data and metadata and said additional data and metadata stored in steps (c) and (e), respectively, thereby generating additional search targets;
h) repeating steps (b) through (f) for said additional search targets; and
i) repeating steps (g) and (h) until no said additional data and metadata are found; and
validating given data by executing steps comprising: comparing the data from search targets that have been searched, and selecting as valid the data from a source considered to be most reliable and usable, based on a set of precedence and usage rules.
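The loop described in steps (a) through (i), together with the final validation step, might be sketched roughly as follows. This is only an illustrative reading of the claim, not the patented implementation: the names `discover`, `validate`, `search`, `derive_targets`, `precedence`, `usable`, and `max_rounds`, the `(field, value, source)` tuple layout, and the in-memory `FAKE_WEB` data are all assumptions introduced for the example. The `max_rounds` cap stands in for the exception rules mentioned in the abstract.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical record shape: (field name, value, source that supplied it).
Finding = Tuple[str, str, str]


def discover(
    seeds: Iterable[str],
    search: Callable[[str], List[Finding]],
    derive_targets: Callable[[List[Finding]], Iterable[str]],
    max_rounds: int = 100,  # assumed safety cap, not part of the claim
) -> List[Finding]:
    """Search all pending targets, derive new targets from the results, repeat."""
    seen: Set[str] = set()
    pending: List[str] = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    collected: List[Finding] = []
    for _ in range(max_rounds):
        if not pending:  # nothing new to search, so the traversal ends
            break
        round_findings: List[Finding] = []
        for target in pending:  # steps (b) through (f)
            seen.add(target)
            round_findings.extend(search(target))
        collected.extend(round_findings)
        # Step (g): turn this round's findings into targets not yet searched.
        pending = [
            t for t in dict.fromkeys(derive_targets(round_findings))
            if t not in seen
        ]
    return collected


def validate(
    findings: Iterable[Finding],
    precedence: Dict[str, int],
    usable: Optional[Set[str]] = None,
) -> Dict[str, Tuple[str, str]]:
    """Per field, keep the value from the best-ranked source that may be used."""
    best: Dict[str, Tuple[int, str, str]] = {}
    for field, value, source in findings:
        if usable is not None and source not in usable:
            continue  # usage rules exclude this source
        rank = precedence.get(source, len(precedence))  # unknown sources rank last
        if field not in best or rank < best[field][0]:
            best[field] = (rank, value, source)
    return {f: (v, s) for f, (_, v, s) in best.items()}


# Made-up in-memory "sources" so the sketch runs without network access.
FAKE_WEB: Dict[str, List[Finding]] = {
    "registry.example": [
        ("name", "Acme Corp", "registry.example"),
        ("link", "acme.example", "registry.example"),
    ],
    "acme.example": [
        ("name", "ACME", "acme.example"),
        ("phone", "555-0100", "acme.example"),
    ],
}

findings = discover(
    ["registry.example"],
    search=lambda target: FAKE_WEB.get(target, []),
    derive_targets=lambda fs: [v for f, v, _ in fs if f == "link"],
)
record = validate(findings, precedence={"registry.example": 0, "acme.example": 1})
```

In this toy run the registry page yields a link to a second source, which is searched in the next round. The two sources disagree on the name, and the lower rank number assigned to `registry.example` decides which value is kept.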
Address: Short Hills NJ US