Abstract |
A method for indexing a reference genome is provided. The method includes selecting a reference genome to index, calculating a first minimum index region size, assigning a first position number to a first index region of the reference genome, assigning a second position number to a second index region of the reference genome, and storing the associations between the first and second position numbers and their respective index regions in a hash table. The size of the first index region can be greater than or equal to the first minimum index region size. The second index region can overlap the first index region by at least one base. The first minimum index region size can be calculated based on the size of the reference genome. In other embodiments of the present teachings, a method for mapping a sequence read to a reference genome is provided, wherein the sequence read is compared to the index regions stored in the hash table and is then mapped to, and aligned against, a location on the reference genome. Systems configured to carry out the methods are also provided.
|
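The indexing and mapping steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. Several details are assumptions made only for illustration: the abstract gives no formula for the minimum index region size, so the sketch uses the smallest k for which 4**k is at least the genome length; the hash table is keyed by the region's base sequence with a list of position numbers as the value; and the final step is a simple exact-match check rather than a full alignment. The function names `min_index_region_size`, `build_index`, and `map_read` are hypothetical.

```python
from collections import defaultdict


def min_index_region_size(genome_length: int) -> int:
    """Assumed heuristic (not from the source): smallest k with 4**k >= genome_length."""
    k = 1
    while 4 ** k < genome_length:
        k += 1
    return k


def build_index(genome: str, region_size: int, step: int = 1) -> dict:
    """Hash table mapping each index-region sequence to its position numbers.

    Consecutive regions start `step` bases apart; step < region_size
    guarantees that neighbouring regions share at least one base.
    """
    if step < 1 or step >= region_size:
        raise ValueError("step must satisfy 1 <= step < region_size")
    index = defaultdict(list)
    for pos in range(0, len(genome) - region_size + 1, step):
        index[genome[pos:pos + region_size]].append(pos)
    return dict(index)


def map_read(read: str, genome: str, index: dict, region_size: int) -> list:
    """Look up each region-sized seed of the read in the hash table, then
    keep candidate locations where the whole read matches exactly
    (a stand-in for the alignment step)."""
    hits = set()
    for offset in range(len(read) - region_size + 1):
        seed = read[offset:offset + region_size]
        for pos in index.get(seed, []):
            start = pos - offset
            if start >= 0 and genome[start:start + len(read)] == read:
                hits.add(start)
    return sorted(hits)


if __name__ == "__main__":
    genome = "ACGTACGTTAGC"  # toy reference, for illustration only
    k = min_index_region_size(len(genome))
    index = build_index(genome, k)
    print(k, map_read("GTTAG", genome, index, k))
```

Keying the table by region sequence makes the read lookup a constant-time dictionary access per seed. A real mapper would replace the exact-match check with a gapped alignment and would tolerate mismatches.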