Abstract |
<p>The invention belongs to the technical field of genetic engineering and provides a method and a system for reducing time complexity in short-sequence assembly. The method comprises the following steps: receiving sequencing sequences; cutting each received sequencing sequence, one sequence at a time, with a base-by-base sliding window to obtain short strings of fixed base length together with the left and right connection relations of those short strings; and storing the sequence value of each short string, its left and right connection relations, and its connection counts as one node of a de Bruijn graph. The nodes of the de Bruijn graph are stored in a hash table in which the hash key is the sequence value and the hash value is the node. Because the de Bruijn graph is stored in a hash table, updating the connection relations of a node reduces to looking up the node and updating the connection counts of the bases adjoining it on the left and right. Searching for a node, adding a node, and updating the connection relations of a node can therefore each be completed in O(1) time. This lowers the time complexity of short-sequence assembly and makes it possible to assemble the short sequences of a large genome.</p> |
Applicants |
SHENZHEN HUADA GENE INSTITUTE;LI, RUIQIANG;ZHU, HONGMEI;LI, SONGGANG;WANG, JUN;YANG, HUANMING;WANG, JIAN |
Inventors |
LI, RUIQIANG;ZHU, HONGMEI;LI, SONGGANG;WANG, JUN;YANG, HUANMING;WANG, JIAN |
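
The steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `Node` and `build_graph`, the parameter `k`, and the use of a Python `dict` as the hash table are all assumptions made for the example. It also assumes that reads contain only the bases A, C, G and T, and that `k` is fixed, so that hashing a short string costs constant time and each lookup or update is O(1) on average.

```python
from dataclasses import dataclass, field

BASES = "ACGT"  # assumption: reads contain only these four bases


def _zero_counts():
    """One connection counter per base, all starting at zero."""
    return dict.fromkeys(BASES, 0)


@dataclass
class Node:
    """Hypothetical de Bruijn graph node: a short string plus its neighbours."""
    seq: str
    left: dict = field(default_factory=_zero_counts)   # counts of preceding bases
    right: dict = field(default_factory=_zero_counts)  # counts of following bases


def build_graph(reads, k):
    """Slide a window of length k over each read and record each short string.

    The dict plays the role of the hash table: key = sequence value,
    value = node. Each lookup, insertion, and counter update is O(1)
    on average for a fixed k.
    """
    graph = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            node = graph.get(kmer)
            if node is None:            # first time this short string is seen
                node = graph[kmer] = Node(kmer)
            if i > 0:                   # a base precedes this window
                node.left[read[i - 1]] += 1
            if i + k < len(read):       # a base follows this window
                node.right[read[i + k]] += 1
    return graph
```

For example, `build_graph(["ACGTAC", "CGTACG"], 3)` yields four nodes (`ACG`, `CGT`, `GTA`, `TAC`). The node `CGT` is followed by `A` in both reads, so its right-connection count for `A` is 2. Keeping the connection counts inside the node means that no separate edge structure has to be searched when a connection is updated.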