Title of invention: METHODS OF PREDICTING AND DETERMINING MUTATED mRNA SPLICE ISOFORMS
Abstract: Mutations that affect mRNA splicing often produce multiple mRNA isoforms containing different exon structures. Definition of an exon and its inclusion in mature mRNA relies on joint recognition of both acceptor and donor splice sites. The instant methodology predicts cryptic and exon skipping isoforms in mRNA produced by splicing mutations from the combined information contents and the distribution of the splice sites and other regulatory binding sites defining these exons. In its simplest form, the total information content of an exon, Ri,total, is the sum of the information contents of its corresponding acceptor and donor splice sites, adjusted for the self-information of the exon length. Differences between Ri,total values of mutant versus normal exons are consistent with the relative abundance of these exons in distinct processed mRNAs. Predictions of splicing mutations based on Ri,total are highly concordant with published expression data demonstrating alterations in the structures and relative abundance of the mRNA transcripts derived from these mutations.
Publication number: US2014199698(A1) Publication date: 2014.07.17
Application number: US201414154905 Filing date: 2014.01.14
Applicant: Rogan Peter Keith;Mucaki Eliseos John Inventor: Rogan Peter Keith;Mucaki Eliseos John
Classification: C12Q1/68 Main classification: C12Q1/68
Agency: Agent:
Independent claim: 1. A method for assessing changes in expression level of a gene having an mRNA splice-altering mutation, said mutation being located within a sequence window circumscribing an exon and one or more intronic sequences of said gene, said one or more intronic sequences being adjacent to said exon, said method comprising the steps of: (a) computing and identifying changes in individual information contents of potential donor and acceptor splice sites at each nucleotide position by computing product of the information theory-based position weight matrices and a unitary position matrix of each sequence, (b) defining potential exons by selecting every pair combination of acceptor and donor splice sites in the sequence window, and determining the gap surprisal value based on distance in nucleotides between sites comprising a pair combination, wherein the gap surprisal value is calculated for each potential exon length based on frequency of said length in the genome as the inverse log2 of said frequency, (c) computing the total information content, Ri,total, of a potential exon as the sum of the corresponding individual information contents of the acceptor and donor pair, corrected by adding the gap surprisal of an exon whose length is the distance between the donor and acceptor pair, (d) comparing the Ri,total values of all potential mRNA splice isoforms of the wild-type gene and the same values after the wild-type gene sequence is mutated to determine whether the mutation alters the abundance of the mRNA isoforms containing the exon, wherein the splice isoform with the largest Ri,total value is predicted to be the most abundant splice isoform, and the splice isoform with the smallest Ri,total value is the least abundant isoform, and (e) extracting mRNAs or proteins from at least one cell expressing said gene to determine the most abundant mRNA splice isoform of said gene, thus allowing the assessing of changes in expression level of said gene.
Address: London CA
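
Computational steps (a) through (d) of claim 1 can be illustrated with a minimal Python sketch. It is not the applicants' implementation: the function names, the toy weight matrix, the site strengths, and the exon-length frequencies are all hypothetical values chosen for illustration. One sign convention is also an assumption: the gap surprisal is taken as -log2 of the length frequency and applied as a penalty, so that exons of rare lengths receive a lower Ri,total. The claim wording ("corrected by adding the gap surprisal") does not by itself settle the sign.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Unitary position matrix: one row per position, 1 at the observed base."""
    return [[1 if b == base else 0 for base in BASES] for b in seq]

def site_information(weights, seq):
    """Step (a): Ri of one site, the element-wise product of the
    position weight matrix and the one-hot matrix, summed (bits)."""
    return sum(w * u
               for w_row, u_row in zip(weights, one_hot(seq))
               for w, u in zip(w_row, u_row))

def gap_surprisal(length, length_freq):
    """Step (b): -log2 of the frequency of this exon length (bits)."""
    return -math.log2(length_freq[length])

def ri_total(ri_acceptor, ri_donor, length, length_freq):
    """Step (c): acceptor Ri plus donor Ri, corrected for exon length.
    Assumption: the surprisal is subtracted, i.e. treated as a penalty."""
    return ri_acceptor + ri_donor - gap_surprisal(length, length_freq)

def rank_exons(acceptors, donors, length_freq):
    """Steps (b)-(d): pair every acceptor with every downstream donor and
    sort the potential exons by Ri,total, strongest first.
    acceptors and donors map nucleotide position -> Ri in bits."""
    exons = []
    for a_pos, a_ri in acceptors.items():
        for d_pos, d_ri in donors.items():
            length = d_pos - a_pos
            if length > 0 and length in length_freq:
                score = ri_total(a_ri, d_ri, length, length_freq)
                exons.append((score, a_pos, d_pos))
    return sorted(exons, reverse=True)

# Hypothetical two-position weight matrix (columns A, C, G, T), in bits.
toy_weights = [[1.0, -2.0, 0.5, 0.0],
               [-1.0, 0.0, 2.0, 0.5]]

# Hypothetical exon-length frequencies: surprisals of 2 and 3 bits.
length_freq = {100: 0.25, 150: 0.125}

# Hypothetical wild type: natural donor at 100, weak cryptic donor at 150.
wild_type = rank_exons({0: 8.0}, {100: 9.0, 150: 5.0}, length_freq)

# Hypothetical mutation weakening the natural donor from 9.0 to 2.0 bits.
mutant = rank_exons({0: 8.0}, {100: 2.0, 150: 5.0}, length_freq)

print(wild_type[0])  # strongest predicted exon before the mutation
print(mutant[0])     # strongest predicted exon after the mutation
```

With these toy numbers the natural exon ranks first in the wild type, while after the mutation the exon ending at the cryptic donor has the larger Ri,total and would be the predicted most abundant isoform. Step (e), the laboratory confirmation, is outside the scope of the sketch.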