Main claim |
1. A method implemented and programmed within a non-transitory computer-readable storage medium and processed by machine, the machine configured to execute the method, comprising:
(a) obtaining, at the machine, a subsequence for each sequence in a sequence database and grouping the subsequence with a first item; (b) redistributing, at the machine, the subsequences to nodes of a parallel processing network by a prefix value; (c) counting, at each node and in parallel, a specific prefix with a predefined length and maintaining at each node a high frequency prefix and its postfix; (d) generating, at each node and in parallel, new prefixes that combine the specific prefix and specific subsequences of its postfix; (e) iterating, at each node and in parallel, (c) and (d) until no new prefixes are generated or until a given prefix length exceeds a specified value; and (f) outputting, by the machine, all the prefixes. |
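
Steps (a) through (f) can be illustrated with a minimal single-process sketch. It is not the claimed implementation: the nodes are simulated by plain dictionaries that are processed one after another, "high frequency" is assumed to mean a support count of at least `min_support`, and a "subsequence" is assumed to be the postfix that follows the first occurrence of an item. All function and parameter names (`first_item_projections`, `node_for`, `mine_node`, `mine`, `min_support`, `max_length`, `num_nodes`) are hypothetical.

```python
from collections import defaultdict
from zlib import crc32


def first_item_projections(sequence):
    """Yield (prefix, postfix) for the first occurrence of each distinct item."""
    seen = set()
    for i, item in enumerate(sequence):
        if item not in seen:
            seen.add(item)
            yield (item,), tuple(sequence[i + 1:])


def node_for(prefix, num_nodes):
    """Assumed partitioning rule: hash the first item of the prefix."""
    return crc32(repr(prefix[0]).encode()) % num_nodes


def mine_node(projected, min_support, max_length):
    """Steps (c) to (e) on one node: count, keep frequent prefixes, extend."""
    frequent = {}
    current = projected
    while current:
        # (c) each postfix comes from a distinct sequence, so the number
        # of postfixes stored under a prefix is that prefix's support.
        kept = {p: posts for p, posts in current.items()
                if len(posts) >= min_support}
        frequent.update({p: len(posts) for p, posts in kept.items()})
        # (d) combine each frequent prefix with items from its postfixes.
        extended = defaultdict(list)
        for prefix, posts in kept.items():
            if len(prefix) >= max_length:
                continue  # (e) stop growing at the length limit
            for postfix in posts:
                for ext, rest in first_item_projections(postfix):
                    extended[prefix + ext].append(rest)
        current = extended  # (e) iterate until nothing new is generated
    return frequent


def mine(sequences, min_support=2, max_length=3, num_nodes=2):
    # (a) + (b) project every sequence by first item and route to a node.
    nodes = [defaultdict(list) for _ in range(num_nodes)]
    for seq in sequences:
        for prefix, postfix in first_item_projections(seq):
            nodes[node_for(prefix, num_nodes)][prefix].append(postfix)
    # Sequential stand-in for the parallel per-node work.
    result = {}
    for node in nodes:
        result.update(mine_node(node, min_support, max_length))
    return result  # (f) all frequent prefixes with their support counts


if __name__ == "__main__":
    database = [("a", "b", "c"), ("a", "c"), ("b", "c")]
    print(mine(database))
```

Because every prefix grown from a given first item stays on the node that received that item, the sketch needs no communication between nodes after the initial redistribution, and the result does not depend on `num_nodes`.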