Abstract |
Genomic sequencing is implemented for high-throughput applications, including those that use short reads. In one example, whole-genome sequencing involves a method in which a subset of fragments of a target genome is selected at random, and each selected fragment is replicated into clones. The clones are ordered into clone contigs based on sets of overlapping clones, and potential read overlaps are determined from clone read data. The method can also involve reading local assemblies of contigs from regions smaller than a clone length, assembling the local assemblies into read sets, combining the assembled read sets into clone-sized regions, assembling each clone-sized region, and then assembling the clone-sized regions into clone contigs. Overlapping sets of clones and their ordering can be determined computationally from read data, using a high depth of clone coverage to provide a large number of boundaries at which the assemblies can be segmented into overlapping regions of pooled reads.
|
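The idea of determining overlapping sets of clones computationally from read data can be illustrated with a minimal Python sketch. It is not the method of the disclosure. It assumes that two clones overlap when their pooled reads share enough k-mers, and it only groups clones into candidate contigs (connected components of the overlap graph) without ordering them within a contig. The names `group_clones`, `clone_signature`, `k`, and `min_shared`, and the toy sequences, are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def kmers(read, k):
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def clone_signature(reads, k):
    """Union of k-mers over all reads pooled for one clone."""
    sig = set()
    for read in reads:
        sig |= kmers(read, k)
    return sig


def group_clones(clone_reads, k=8, min_shared=3):
    """Group clones into candidate contigs.

    clone_reads maps a clone id to its list of reads. Two clones are
    treated as overlapping when they share at least min_shared k-mers
    (an illustrative threshold, not one taken from the source).
    """
    sigs = {cid: clone_signature(reads, k) for cid, reads in clone_reads.items()}
    parent = {cid: cid for cid in clone_reads}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge every pair of clones whose k-mer sets overlap enough.
    for a, b in combinations(sorted(clone_reads), 2):
        if len(sigs[a] & sigs[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for cid in clone_reads:
        groups.setdefault(find(cid), []).append(cid)
    return sorted(sorted(g) for g in groups.values())


# Toy data: clones A and B come from overlapping windows of one made-up
# sequence; clone C is unrelated.
genome = "ACGTTGCATGCCGATAGCTTAGGCTAACGTTAGCCGATTACGGATCCA"
example = {
    "A": [genome[0:16], genome[8:24]],
    "B": [genome[12:28], genome[20:36]],
    "C": ["TTTTTTTTTTTTTTTT"],
}
contigs = group_clones(example)
print(contigs)
```

In this toy input, clones A and B share the five 8-mers that span their common 12-base window, so they fall into one group, while C stays alone. A real pipeline would also need to handle sequencing errors, repeats, and reverse-complement reads, all of which this sketch ignores.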