Title: SCALABLE COMPLEX EVENT PROCESSING WITH PROBABILISTIC MACHINE LEARNING MODELS TO PREDICT SUBSEQUENT GEOLOCATIONS
Abstract: Provided is a process, including: obtaining a set of historical geolocations; segmenting the historical geolocations into a plurality of temporal bins; determining pairwise transition probabilities between a set of geographic places based on the historical geolocations; configuring a compute cluster by assigning subsets of the transition probabilities to computing devices in the compute cluster; receiving a geolocation stream indicative of current geolocations of individuals; selecting a computing device in the compute cluster in response to determining that the computing device contains transition probabilities for the received respective geolocation; selecting transition probabilities applicable to the received respective geolocation from among the subset of transition probabilities assigned to the selected computing device; and predicting a subsequent geographic place based on the selected transition probabilities.
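The binning and pairwise-transition steps in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the six-hour bin width, the choice to bin each transition by the timestamp of its origin visit, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def hour_bin(hour, bin_width=6):
    """Map an hour of day (0-23) to a temporal bin index (assumed scheme)."""
    return hour // bin_width

def pairwise_transition_probabilities(visits, bin_width=6):
    """visits: {individual_id: [(hour, place), ...]} in visit order.
    Returns {bin: {from_place: {to_place: probability}}}."""
    counts = defaultdict(lambda: defaultdict(Counter))
    for sequence in visits.values():
        # Pair each visit with the one that follows it.
        for (hour, src), (_, dst) in zip(sequence, sequence[1:]):
            counts[hour_bin(hour, bin_width)][src][dst] += 1
    probs = {}
    for b, by_src in counts.items():
        probs[b] = {}
        for src, dsts in by_src.items():
            total = sum(dsts.values())
            probs[b][src] = {dst: n / total for dst, n in dsts.items()}
    return probs

def predict_next(probs, hour, place, bin_width=6):
    """Return the most probable next place, or None if the place is unseen."""
    options = probs.get(hour_bin(hour, bin_width), {}).get(place)
    if not options:
        return None
    return max(options, key=options.get)

# Hypothetical sample data: three individuals, morning visits.
sample_visits = {
    "u1": [(8, "home"), (9, "cafe"), (10, "office")],
    "u2": [(8, "home"), (9, "cafe"), (11, "gym")],
    "u3": [(7, "home"), (9, "office")],
}
sample_probs = pairwise_transition_probabilities(sample_visits)
```

With this sample data, two of the three morning departures from "home" go to "cafe", so `predict_next(sample_probs, 8, "home")` selects "cafe".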
Publication No.: US2016328661(A1)  Publication date: 2016.11.10
Application No.: US201615147519  Filing date: 2016.05.05
Applicant: RetailMeNot, Inc.  Inventors: Reese David John; Taberner-Miller Annette M.; Acharya Sankalp; Adam Lipphei
Classification: G06N99/00; G06F17/30; G06N7/00  Main classification: G06N99/00
Agency:  Agent:
Main claim: 1. A system, comprising: one or more processors; and memory storing instructions that when executed by at least some of the processors effectuate operations comprising:
  training a machine learning model to predict subsequent geolocations of an individual, wherein training the machine learning model comprises:
    obtaining a set of historical geolocations of more than 500 individuals, the set of historical geolocations indicating geographic places visited by the individuals and the sequence with which the places were visited, the set of historical geolocations indicating a median number of places greater than or equal to three for the 500 individuals over a trailing duration of time extending more than one day in the past;
    obtaining a set of geographic places including at least 500 geographic places each corresponding to at least one of the historical geolocations;
    segmenting the historical geolocations into a plurality of temporal bins, each temporal bin having a start time and an end time, wherein segmenting comprises assigning the historical geolocations to a given temporal bin in response to determining that a corresponding historical geolocation is timestamped with a time after a respective start time and before a respective end time of the given temporal bin; and
    for each of the segments, determining pairwise transition probabilities between the set of geographic places based on the historical geolocations in the respective temporal bin to form a transition matrix, wherein:
      a first dimension of the transition matrix corresponds to a first previous geographic place,
      a second dimension of the transition matrix corresponds to a second previous geographic place preceding the first previous geographic place,
      a third dimension of the transition matrix corresponds to a subsequent geographic place, and
      values of the transition matrix correspond to respective conditional probabilities of moving from the first previous geographic place to the subsequent geographic place given that the second previous geographic place precedes the first previous geographic place in a sequence of places visited by an individual;
  configuring a real-time, next-place-prediction stream-processor compute cluster by assigning a first subset of the transition probabilities corresponding to a first computing device in the compute cluster and assigning a second subset of the transition probabilities to a second computing device in the compute cluster;
  after configuring the stream-processor compute cluster, receiving, with the stream-processor compute cluster, a geolocation stream indicative of current geolocations of individuals, wherein the geolocation stream comprises over 1,000 geolocations per hour;
  for each geolocation in the stream, within thirty minutes of receiving the respective geolocation, predicting a subsequent geolocation and acting on the prediction, wherein predicting a subsequent geolocation and acting on the prediction comprises:
    selecting a computing device in the compute cluster in response to determining that the computing device contains transition probabilities for the received respective geolocation;
    with the selected computing device, selecting transition probabilities applicable to the received respective geolocation from among the subset of transition probabilities assigned to the selected computing device;
    predicting a subsequent geographic place based on the selected transition probabilities;
    selecting content based on the predicted subsequent geographic place; and
    sending the selected content to a mobile computing device that sent the received respective geolocation.
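The three-dimensional transition matrix and the assignment of subsets to computing devices recited in the claim can be sketched as follows. This is one possible reading, not the patented implementation: the sparse dictionary representation, the CRC32-based assignment keyed on the first previous place, and all function names are assumptions. Temporal bins are omitted for brevity; in the claim, one such matrix is formed per bin.

```python
import zlib
from collections import Counter, defaultdict

def second_order_probabilities(sequences):
    """Sparse transition matrix keyed by (first_previous, second_previous).
    Each value maps a subsequent place to P(next | prev1, prev2)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev2, prev1, nxt in zip(seq, seq[1:], seq[2:]):
            counts[(prev1, prev2)][nxt] += 1
    return {
        key: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for key, row in counts.items()
    }

def shard_for(place, num_nodes):
    """Assumed routing rule: a stable hash of the most recent place."""
    return zlib.crc32(place.encode("utf-8")) % num_nodes

def configure_cluster(probs, num_nodes):
    """Assign each row of the matrix to exactly one (simulated) node."""
    nodes = [dict() for _ in range(num_nodes)]
    for (prev1, prev2), row in probs.items():
        nodes[shard_for(prev1, num_nodes)][(prev1, prev2)] = row
    return nodes

def predict(nodes, prev2, prev1):
    """Select the node holding rows for prev1, then the most probable next place."""
    node = nodes[shard_for(prev1, len(nodes))]
    row = node.get((prev1, prev2))
    return max(row, key=row.get) if row else None

# Hypothetical visit sequences.
sample_sequences = [
    ["home", "cafe", "office"],
    ["home", "cafe", "office"],
    ["home", "cafe", "gym"],
    ["gym", "cafe", "home"],
]
sample_matrix = second_order_probabilities(sample_sequences)
sample_nodes = configure_cluster(sample_matrix, num_nodes=2)
```

Keying the assignment on the first previous place means an incoming geolocation can be routed to a single node without consulting the others, which is the property the claim's device-selection step relies on. In this sample, someone who arrived at "cafe" from "home" is predicted to go to "office", while someone who arrived from "gym" is predicted to go "home".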
Address: Austin TX US