Principal claim |
1. A method of detecting a copy number variation (CNV), comprising the following steps:
obtaining reads from at least one part of a nucleic acid molecule of a sample;
determining, based on the obtained reads, uniquely-mapped reads aligned to a genomic reference sequence;
dividing the genomic reference sequence into a plurality of windows, and calculating the number of uniquely-mapped reads falling into each of the plurality of windows;
subjecting the number of uniquely-mapped reads falling into each of the plurality of windows to a GC correction and to a correction based on an expected number of uniquely-mapped reads adjusted by a control set, to obtain a corrected number of uniquely-mapped reads;
calculating a significance value of the difference between two numerical populations, each consisting of the corrected numbers of uniquely-mapped reads falling into windows on one of the two sides of a demarcation point, the demarcation point being a starting point or an ending point of each of the plurality of windows, to thereby select the demarcation point having a smaller significance value as a candidate CNV breakpoint;
calculating a significance value of the difference between two numerical populations, each consisting of the corrected numbers of uniquely-mapped reads falling into windows contained within one of two sequences, one sequence ranging from a given candidate CNV breakpoint to an adjacent upstream candidate CNV breakpoint, and the other sequence ranging from the given candidate CNV breakpoint to an adjacent downstream candidate CNV breakpoint; and
removing, in each iteration, the candidate CNV breakpoint having the least significance, recalculating the significance value for the two candidate CNV breakpoints adjacent to the removed candidate CNV breakpoint, and iterating cyclically until the significance values of all candidate CNV breakpoints are less than a termination threshold value, to thereby determine the CNV breakpoint. |
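The candidate selection and iterative removal steps recited above can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several points are assumptions made only for illustration: the claim does not name a statistical test, so a two-sided normal approximation on the difference in means stands in for the significance value; the function names `p_value`, `candidate_breakpoints`, and `refine_breakpoints`, and the parameters `flank`, `alpha`, and `threshold`, are hypothetical; the input `counts` is taken to be the per-window read counts after GC and control-set correction; and the sequence start and end are treated as the outermost neighbours of the first and last candidates. For brevity the sketch recomputes every significance value in each round, whereas the claim recalculates only the two candidates adjacent to the removed one.

```python
import math
from statistics import mean, variance


def p_value(left, right):
    """Two-sided p-value for a difference in means (normal approximation).

    Assumption: the claim only requires "a significance value"; this
    particular test is a stand-in chosen for illustration.
    """
    if len(left) < 2 or len(right) < 2:
        return 1.0  # too few windows to judge: treat as not significant
    se = math.sqrt(variance(left) / len(left) + variance(right) / len(right))
    if se == 0.0:
        return 1.0 if mean(left) == mean(right) else 0.0
    z = abs(mean(left) - mean(right)) / se
    return math.erfc(z / math.sqrt(2))


def candidate_breakpoints(counts, flank=5, alpha=0.01):
    """Pick window boundaries whose p-value is a local minimum below alpha.

    Boundary i separates windows [i - flank, i) from [i, i + flank).
    """
    pvals = {
        i: p_value(counts[i - flank:i], counts[i:i + flank])
        for i in range(flank, len(counts) - flank + 1)
    }
    return [
        i for i, p in pvals.items()
        if p < alpha
        and p <= pvals.get(i - 1, 1.0)
        and p <= pvals.get(i + 1, 1.0)
    ]


def refine_breakpoints(counts, candidates, threshold=1e-6):
    """Repeatedly drop the least significant candidate (largest p-value)
    until every remaining candidate has a p-value below the threshold."""
    bps = sorted(candidates)
    while bps:
        edges = [0] + bps + [len(counts)]
        # Compare the segment upstream of each candidate with the one downstream.
        pvals = [
            p_value(counts[edges[k]:edges[k + 1]],
                    counts[edges[k + 1]:edges[k + 2]])
            for k in range(len(bps))
        ]
        worst = max(range(len(bps)), key=pvals.__getitem__)
        if pvals[worst] < threshold:
            break  # all remaining candidates pass the termination threshold
        del bps[worst]
    return bps
```

Under these assumptions, `refine_breakpoints(counts, candidate_breakpoints(counts))` returns the indices of the window boundaries retained as breakpoints. A spurious candidate lying inside a region of uniform depth has a large p-value against its neighbouring segments and is removed in the first rounds.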