Title of invention: ANCESTRAL-SPECIFIC REFERENCE GENOMES AND METHODS OF CONSTRUCTING
Abstract: Ancestry has a significant impact on the major and minor alleles found at each nucleotide position within the genome. Due to mechanisms of inheritance, ancestral-specific information contained within the genome is conserved within members of an ancestry. For this reason, individuals within a specific ancestry are more likely to share alleles in their genomes with other members of the same ancestry. Functionally, the combination of alleles at all positions within a group of individuals defines that group as having a common ancestry. Moreover, the aggregation of differences between alleles at all positions distinguishes one ancestry from another. The genomic similarities and differences between ancestries provide a mechanism to generate reference genomes that are specific to each ancestry. Reference genomes that are specific to an ancestry can be used to increase the accuracy of whole genome sequencing, DNA-based diagnostics, and therapeutic marker discovery, and in a variety of real-world DNA-based applications. Provided herein are methods for constructing an ancestral-specific reference genome database.
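Publication number: US2017017757(A1) Publication date: 2017.01.19
Application number: US201615249408 Filing date: 2016.08.27
Applicant: Inova Health System Inventors: Vockley Joseph; Niederhuber John
Classification: G06F19/22; C40B30/02; G06F19/14 Main classification: G06F19/22
Agency: Agent: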
Main claim: 1. A method for constructing an ancestral-specific reference genome, the method comprising: a) obtaining a familial genome data set comprising DNA sequences from members of a family; b) comparing the DNA sequences within the familial genome data set to obtain a corrected familial genome data set; c) preparing a first composite familial genome data set from the corrected familial genome data set; d) repeating steps a-c for a second, third or more families to obtain a second, third or more composite familial genome data sets; e) evaluating the first, second, third or more composite familial genome data sets for single nucleotide polymorphisms (SNPs) and/or haplotypes and assigning statistical significance to the SNPs and/or haplotypes; f) grouping the first, second, third or more composite familial genome data sets based on single nucleotide polymorphisms (SNPs) and/or haplotypes that are statistically significant; and g) preparing the ancestral-specific reference genome by compiling the SNPs and/or haplotypes shared by a group of composite familial genome data sets with the same ancestry.
Address: Falls Church VA US
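
Illustrative sketch (not part of the patent record): the flow of claim 1, steps a through g, can be pictured with a toy Python example. This is a minimal sketch under strong assumptions, not the applicants' method. It assumes pre-aligned, equal-length sequences; it substitutes a per-position majority vote for the family-based correction of steps b and c; and it substitutes a greedy Hamming-distance grouping for the statistical significance testing of steps e and f. The function names and the `max_distance` parameter are hypothetical.

```python
from collections import Counter


def majority_allele(column):
    """Return the most frequent allele in one aligned column."""
    return Counter(column).most_common(1)[0][0]


def composite_family_genome(member_sequences):
    """Steps b-c (simplified stand-in): per-position majority vote
    across the aligned sequences of one family's members."""
    length = len(member_sequences[0])
    if any(len(seq) != length for seq in member_sequences):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(majority_allele(col) for col in zip(*member_sequences))


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def group_composites(composites, max_distance):
    """Steps e-f (simplified stand-in): greedy grouping. A composite joins
    the first group whose founding member is within max_distance mismatches;
    otherwise it starts a new group. No significance test is performed."""
    groups = []
    for seq in composites:
        for group in groups:
            if hamming(seq, group[0]) <= max_distance:
                group.append(seq)
                break
        else:
            groups.append([seq])
    return groups


def ancestral_reference(group):
    """Step g (simplified): major allele at each position within a group."""
    return "".join(majority_allele(col) for col in zip(*group))


# Step d: repeat the composite construction for several toy families.
families = [
    ["ACGTAC", "ACGTAC", "ACCTAC"],
    ["ACGTAA", "ACGTAA", "ACGTAC"],
    ["TTGTCC", "TTGTCC", "TAGTCC"],
    ["ACGTAA", "ACGTAA", "ACGAAA"],
]
composites = [composite_family_genome(f) for f in families]
groups = group_composites(composites, max_distance=1)
references = [ancestral_reference(g) for g in groups]
```

In this toy data the first, second, and fourth families fall into one group and the third into another, so two reference sequences result. A real pipeline would work on variant calls rather than raw strings and would need a proper statistical model for steps e and f.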