Title of invention: PARALLEL FREQUENT SEQUENTIAL PATTERN DETECTING
Abstract: Techniques for parallel frequent sequential pattern detection are provided. A sequence database is split into separate datasets, and each node is given a specific dataset in which it resolves the frequent items based on counts. Then, each node groups its frequent items into “n” (varying) length sequences representing sequential patterns present in the original sequence database. The nodes process in parallel with one another and collectively produce a complete set of the sequential patterns defined in the original sequence database.
Publication number: US2016070763(A1) Publication date: 2016.03.10
Application number: US201314361132 Filing date: 2013.05.31
Applicant: ZHAO Lijun;TERADATA US, INC. Inventors: Wang Yu;Liu Yuyang;Liu Huijun;Zhao Lijun;Wu Wenjie
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim: 1. A method implemented and programmed within a non-transitory computer-readable storage medium and processed by machine, the machine configured to execute the method, comprising: (a) obtaining, at the machine, a subsequence for each sequence in a sequence database and group the subsequence with a first item; (b) redistributing, at the machine, the subsequences to nodes of a parallel processing networking by a prefix value; (c) counting, at each node and in parallel, a specific prefix with a predefined length and maintaining at each node a high frequency prefix and its postfix; (d) generating, at each node and in parallel, new prefixes that combine the specific prefix and specific subsequences of its postfix; (e) iterating, at each node and in parallel, (c) and (d) until no new prefixes are generated or until a given prefix length exceeds a specified value; and (f) outputting, by the machine, all the prefixes.
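Illustrative sketch (not part of the patent record): steps (a) through (f) of the claim can be read as a prefix-projection scheme, and the Python below is one possible reading under stated assumptions, not the applicants' implementation. All names (`first_item_projections`, `mine_partition`, `parallel_frequent_patterns`, `min_support`, `max_len`, `n_nodes`) are hypothetical. It assumes that "high frequency" means a support count of at least `min_support` distinct sequences, that each postfix is taken after the first occurrence of an item, and that a thread pool can stand in for the nodes of the parallel processing network.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def first_item_projections(seq_id, seq):
    """Step (a): emit one (prefix, seq_id, postfix) record per distinct item.

    Assumption: the postfix is whatever follows the item's first occurrence.
    """
    seen = set()
    for i, item in enumerate(seq):
        if item not in seen:
            seen.add(item)
            yield (item,), seq_id, tuple(seq[i + 1:])


def mine_partition(records, min_support, max_len):
    """Steps (c)-(e) on one node: count prefixes, keep frequent ones, extend."""
    frequent = {}
    length = 1
    while records and length <= max_len:
        # Step (c): group the records by prefix and count distinct sequences.
        by_prefix = defaultdict(list)
        for prefix, seq_id, postfix in records:
            by_prefix[prefix].append((seq_id, postfix))
        records = []
        for prefix, rows in by_prefix.items():
            support = len({seq_id for seq_id, _ in rows})
            if support < min_support:
                continue  # low-frequency prefix: drop it and its postfixes
            frequent[prefix] = support
            # Step (d): combine the prefix with each item of its postfix.
            for seq_id, postfix in rows:
                seen = set()
                for i, item in enumerate(postfix):
                    if item not in seen:
                        seen.add(item)
                        records.append(
                            (prefix + (item,), seq_id, postfix[i + 1:])
                        )
        length += 1  # step (e): stop once the length exceeds max_len
    return frequent


def parallel_frequent_patterns(sequences, min_support, max_len, n_nodes=2):
    # Step (b): redistribute records so each first item lives on one node.
    partitions = [[] for _ in range(n_nodes)]
    node_of = {}
    for seq_id, seq in enumerate(sequences):
        for record in first_item_projections(seq_id, seq):
            first = record[0][0]
            node = node_of.setdefault(first, len(node_of) % n_nodes)
            partitions[node].append(record)
    # Threads stand in for nodes here; a real deployment would use
    # separate machines or processes.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        results = list(
            pool.map(lambda p: mine_partition(p, min_support, max_len),
                     partitions)
        )
    # Step (f): output all prefixes found on all nodes.
    merged = {}
    for node_result in results:
        merged.update(node_result)
    return merged
```

Because every extension of a prefix keeps the same first item, partitioning by first item means no node needs data held by another node during the iteration, so the per-node results can simply be merged at the end.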
Address: Haidian District CN