Main claim |
1. A computerized method of building a device identifier similarity model with online event signals, the method comprising:
receiving, at a processing circuit, a first set of network device identifiers;
identifying, by the processing circuit, an online event associated with network activity of each network device identifier of the first set;
identifying, using the processing circuit, for each network device identifier of the first set, one or more long-term browsing history events surrounding the identified online event based on the network device identifier's network activity, the long-term browsing history events corresponding to events occurring prior to a first time from the identified online event;
identifying, using the processing circuit, for each network device identifier of the first set, one or more short-term browsing history events surrounding the identified online event based on the network device identifier's network activity, the short-term browsing history events corresponding to events occurring after the first time from the identified online event;
representing, using the processing circuit, each device identifier of the first set as a vector based on feature data corresponding to each network device identifier's network activity, the feature data comprising keywords corresponding to content associated with the device identifier's network activity;
applying, using the processing circuit, abstractions on the feature data to form concepts, wherein each concept represents a category of interest;
deriving, using the processing circuit, at least one hierarchy of the feature data based on the keywords and concepts of the feature data;
expanding, using the processing circuit, the feature data based on the derived at least one hierarchy of the feature data;
applying, using the processing circuit, a clustering algorithm on each of the vectors to identify a plurality of clusters of device identifiers that share a common interest;
providing, using the processing circuit, at least one subset of network device identifiers corresponding to each of the plurality of clusters; and
generating, using the processing circuit, the device identifier similarity model based on the expanded feature data. |
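The steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation. Everything in it is an assumption made for illustration: the `HIERARCHY` table, the `DEVICES` data, the function names, the reading of "a first time from the identified online event" as a cutoff of `event_time - window`, the `long:`/`short:` feature prefixes, and the use of a greedy cosine-similarity threshold in place of the unspecified clustering algorithm.

```python
from collections import Counter
from math import sqrt

# Hypothetical keyword -> parent concept hierarchy (illustrative data only).
HIERARCHY = {
    "sneakers": "footwear",
    "boots": "footwear",
    "footwear": "apparel",
    "laptop": "computers",
    "computers": "electronics",
}

# Hypothetical browsing logs: device id -> list of (timestamp, keyword).
DEVICES = {
    "d1": [(10, "sneakers"), (95, "boots")],
    "d2": [(20, "boots"), (98, "sneakers")],
    "d3": [(5, "laptop"), (99, "laptop")],
}


def split_history(events, event_time, window):
    """Split events at cutoff = event_time - window (assumed interpretation)."""
    cutoff = event_time - window
    long_term = [kw for t, kw in events if t < cutoff]
    short_term = [kw for t, kw in events if cutoff <= t <= event_time]
    return long_term, short_term


def expand(keywords):
    """Append every ancestor concept of each keyword, walking up HIERARCHY."""
    expanded = []
    for kw in keywords:
        expanded.append(kw)
        while kw in HIERARCHY:
            kw = HIERARCHY[kw]
            expanded.append(kw)
    return expanded


def to_vector(events, event_time, window):
    """Represent one device as a sparse count vector of expanded features."""
    long_term, short_term = split_history(events, event_time, window)
    vec = Counter("long:" + kw for kw in expand(long_term))
    vec.update("short:" + kw for kw in expand(short_term))
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed_vector, [device_ids])
    for dev in sorted(vectors):
        for seed, members in clusters:
            if cosine(seed, vectors[dev]) >= threshold:
                members.append(dev)
                break
        else:
            clusters.append((vectors[dev], [dev]))
    return [members for _, members in clusters]


vectors = {dev: to_vector(ev, event_time=100, window=10) for dev, ev in DEVICES.items()}
clusters = cluster(vectors, threshold=0.5)
print(clusters)
```

In this toy data, `d1` and `d2` share no raw keyword within the same time window, so they only become similar after the hierarchy expansion adds the shared `footwear` and `apparel` concepts. In the sketch, the "similarity model" is simply the expanded vectors together with the resulting cluster assignments.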