Title: Systems and Methods for Controlling Crawling Operations to Aggregate Information Sets With Respect to Named Entities
Abstract: Customer Insight (CI) systems in accordance with various embodiments of the invention gather information sets from multiple remote information sources and can merge the information sets to identify authoritative information describing a named entity. In several embodiments, the information sets and/or the authoritative information are identified using geographic location information associated with the information sets. In many embodiments, the CI systems identify relationship information within the merged information sets and use the relationship information to identify customers of businesses. Once identified, merged and/or authoritative information sets describing customers can be used to build customer lists, typical customer profiles, and best customer profiles. In addition, the CI system can utilize information describing customers to automatically generate advertising targeting data and online advertising campaigns.
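As an illustration of the merge step summarized above, the following is a minimal sketch, not the patented implementation. It groups information sets that share a unique identifier or that lie within a threshold distance of each other, which is the rule the main claim recites. The names `InfoSet`, `haversine_km`, `should_merge`, and `merge_sets`, the 50 m default threshold, and the choice of the haversine formula are all assumptions made for the example; the source does not specify how distance is computed or what the threshold is.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


@dataclass
class InfoSet:
    """Hypothetical record for one set of characteristic data from one source."""
    source: str
    uid: Optional[str]          # unique identifier, if the source provided one
    lat: float                  # geographic location data (degrees)
    lon: float
    text: dict = field(default_factory=dict)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def should_merge(a: InfoSet, b: InfoSet, threshold_km: float = 0.05) -> bool:
    """Merge on a matching unique identifier, else on geographic proximity."""
    if a.uid is not None and a.uid == b.uid:
        return True
    return haversine_km(a.lat, a.lon, b.lat, b.lon) <= threshold_km


def merge_sets(sets: list, threshold_km: float = 0.05) -> list:
    """Greedy single-pass grouping; a simplification that never joins two
    groups that already exist, even if a later set matches both."""
    groups = []
    for s in sets:
        for g in groups:
            if any(should_merge(s, m, threshold_km) for m in g):
                g.append(s)
                break
        else:
            groups.append([s])
    return groups
```

A production system would likely also compare text data such as names before merging nearby records, since two distinct businesses can share an address; the sketch deliberately omits that step.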
Publication No.: US2016171113(A1) Publication Date: 2016.06.16
Application No.: US201414586830 Filing Date: 2014.12.30
Applicant: Connectivity, Inc. Inventors: Fanous Emad Joseph; Booth Matthew D.
Classification: G06F17/30; G06Q30/02 Main Classification: G06F17/30
Agency: Agent:
Main Claim: 1. A method of scheduling crawling remote electronic information sources in response to identification of new pieces of characteristic data describing named entities using a customer insight system, the method comprising:
generating a user interface enabling submission of real-time information requests using a customer insight system;
scheduling crawls of remote electronic information sources using the customer insight system, where the scheduled crawls:
  continuously gather sets of characteristic data from a plurality of different types of remote electronic information sources, wherein the gathered characteristic data comprises data selected from the group comprising unique identifiers, geographic location data, and text data; and
  store the gathered characteristic data in a crawler database; and
parsing gathered characteristic data in the crawler database from specific remote electronic information sources for storage as sets of characteristic data within a feeds database using the customer insight system;
merging sets of characteristic data stored in the feeds database to create merged information sets associated with unique identifiers using the customer insight system, wherein the merged information sets are stored in the feeds database, and wherein merging sets of characteristic data stored in the feeds database to create merged information sets further comprises:
  merging sets of characteristic data in the feeds database that contain matching unique identifiers;
  merging sets of characteristic data in the feeds database that do not contain matching unique identifiers based on a comparison of geographical location data, wherein the comparison of geographic location data comprises:
    determining a distance between geographic locations contained in geographic location data included in a first set of characteristic data and a second set of characteristic data in the feeds database; and
    merging the first set of characteristic data with the second set of characteristic data to create a merged information set when the determined distance is within a threshold distance;
identifying, using the customer insight system, an addition of at least one new piece of characteristic data describing a given named entity to the merged information sets for the given named entity in the feeds database, wherein the at least one new piece of characteristic data describing the given named entity added to the merged information sets for the given named entity comprises a new piece of characteristic data identifying a different, previously unknown named entity;
generating an authoritative information set for a given named entity using characteristic data from the merged information sets for the given named entity contained within the feeds database and using the customer insight system, wherein the authoritative information set includes a single selection of characteristic data for any particular type of characteristic data of the given named entity;
storing the authoritative information set for the given named entity in a production database maintained by the customer insight system;
scheduling additional crawls of remote electronic information sources utilizing the at least one new piece of characteristic data in the feeds database describing the given named entity from the merged information sets in response to identifying the at least one new piece of characteristic data using the customer insight system;
scheduling additional crawls of remote electronic information sources utilizing the new piece of characteristic data in the feeds database describing the different, previously unknown named entity using the customer insight system;
receiving a real-time information request with respect to a specific named entity corresponding to a particular business through the generated user interface using the customer insight system;
scheduling additional crawls of remote electronic information sources utilizing attributes of the specific named entity inferred from the real-time information request using the customer insight system;
adjusting priorities of scheduled crawls of remote electronic information sources such that scheduled crawls of remote electronic information sources for information concerning the specific named entity are at a higher priority than previously scheduled additional crawls of remote electronic information sources using the customer insight system; and
generating a user interface displaying information concerning the specific named entity using the customer insight system and updating the user interface in real-time as additional information sets are merged into the information sets for the specific named entity.
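The priority-adjustment step at the end of the claim can be pictured with a small priority queue. This is a hedged sketch only: the claim does not name a data structure, and the class `CrawlScheduler`, the three priority levels, and the lazy-invalidation approach are assumptions chosen for illustration. The idea shown is that a crawl triggered by a real-time request is promoted ahead of additional crawls that were scheduled earlier.

```python
import heapq
import itertools

# Hypothetical priority levels; a lower value is crawled sooner.
REALTIME, DISCOVERY, ROUTINE = 0, 1, 2


class CrawlScheduler:
    """Min-heap of crawl targets with support for raising a target's priority."""

    def __init__(self):
        self._heap = []                    # entries: [priority, seq, target, live]
        self._seq = itertools.count()      # tie-breaker that preserves FIFO order
        self._live = {}                    # target -> its current heap entry

    def schedule(self, target, priority):
        """Schedule a crawl, or promote an already scheduled one."""
        old = self._live.get(target)
        if old is not None:
            if old[0] <= priority:
                return                     # already queued at this priority or better
            old[3] = False                 # lazily invalidate the stale entry
        entry = [priority, next(self._seq), target, True]
        self._live[target] = entry
        heapq.heappush(self._heap, entry)

    def next_crawl(self):
        """Pop the highest-priority live target, or None if nothing is queued."""
        while self._heap:
            _priority, _seq, target, live = heapq.heappop(self._heap)
            if live:
                del self._live[target]
                return target
        return None


# Usage: a real-time request for "entity-C" jumps ahead of earlier crawls.
scheduler = CrawlScheduler()
scheduler.schedule("entity-A", ROUTINE)
scheduler.schedule("entity-B", DISCOVERY)
scheduler.schedule("entity-C", ROUTINE)
scheduler.schedule("entity-C", REALTIME)   # promoted by a real-time request
order = [scheduler.next_crawl() for _ in range(4)]
```

Invalidating the stale entry in place, rather than removing it from the heap, keeps promotion at O(log n); the stale entry is simply skipped when it is eventually popped.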
Address: Burbank CA US