Title of invention Method to identify protein sequences that fold into a known three-dimensional structure
Abstract A computer-assisted method for identifying protein sequences that fold into a known three-dimensional structure. The inventive method attacks the inverse protein folding problem by finding target sequences that are most compatible with profiles representing the structural environments of the residues in known three-dimensional protein structures. The method starts with a known three-dimensional protein structure and determines three key features of each residue's environment within the structure: (1) the total area of the residue's side-chain that is buried by other protein atoms and is therefore inaccessible to solvent; (2) the fraction of the side-chain area that is covered by polar atoms (O, N) or water; and (3) the local secondary structure. Based on these parameters, each residue position is categorized into an environment class. In this manner, a three-dimensional protein structure is converted into a one-dimensional environment string, which represents the environment class of each residue in the folded protein structure. A 3D structure profile table is then created containing score values that represent the frequency of finding each of the 20 common amino acids at each position of the environment string. These frequencies are determined from a database of known protein structures and aligned sequences. The method determines the most favorable alignment of a target protein sequence to the residue positions defined by the environment string, and determines a "best fit" alignment score, Sij, for the target sequence. Each target sequence may then be further characterized by a ZScore, which is the number of standard deviations by which Sij for the target sequence exceeds the mean alignment score for other target sequences of similar length.
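The alignment and ZScore steps described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the function names `best_alignment_score` and `z_score`, the linear gap penalty, the use of a local (Smith-Waterman-style) dynamic-programming alignment, the use of the population standard deviation, and the toy profile values. The profile is modeled as one dictionary per environment-string position, mapping an amino-acid letter to a score.

```python
from statistics import mean, pstdev


def best_alignment_score(profile, sequence, gap_penalty=2.0):
    """Return the best local alignment score of `sequence` against `profile`.

    `profile` is a list with one dict per structural position; each dict maps
    a one-letter amino-acid code to a score (hypothetical layout).
    A linear gap penalty is assumed; the patent may use a different scheme.
    """
    n_cols = len(sequence)
    prev = [0.0] * (n_cols + 1)
    best = 0.0
    for position_scores in profile:
        curr = [0.0] * (n_cols + 1)
        for j in range(1, n_cols + 1):
            residue = sequence[j - 1]
            # Align this residue with this profile position.
            match = prev[j - 1] + position_scores.get(residue, 0.0)
            # Skip a profile position or a sequence residue (gap).
            skip_position = prev[j] - gap_penalty
            skip_residue = curr[j - 1] - gap_penalty
            curr[j] = max(0.0, match, skip_position, skip_residue)
            best = max(best, curr[j])
        prev = curr
    return best


def z_score(score, background_scores):
    """Number of standard deviations `score` lies above the background mean.

    `background_scores` stands in for the alignment scores of other target
    sequences of similar length.
    """
    sigma = pstdev(background_scores)
    if sigma == 0:
        raise ValueError("background scores have zero spread")
    return (score - mean(background_scores)) / sigma


# Toy, made-up profile with three positions (not real profile data).
toy_profile = [
    {"A": 1.0, "G": -1.0},
    {"L": 2.0, "A": -0.5},
    {"K": 1.5},
]
toy_score = best_alignment_score(toy_profile, "ALK")
toy_z = z_score(toy_score, [1.0, 3.0])
```

In this sketch a sequence whose ZScore is well above those of unrelated sequences of similar length would be flagged as a candidate for adopting the profiled fold; the cutoff to apply is not specified here.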
Publication number US5436850(A) Publication date 1995.07.25
Application number US19940218685 Filing date 1994.03.28
Applicant THE REGENTS OF THE UNIVERSITY OF CALIFORNIA Inventors EISENBERG, DAVID;BOWIE, JAMES U.;LUTHY, ROLAND
Classification C07K1/00;G01N33/68;G06F17/50;(IPC1-7):G06F19/00;C12Q1/68 Main classification C07K1/00
Agency Agent
Main claim
Address