Abstract |
<p>This method enables computational analysis and manipulation of DNA and protein sequence data such as that found in large public databases. The method allows systematic searches of such data to identify portions of sequences that code for key intermolecular surfaces or regions of specific protein targets. In a first example, two amino acid sequences are input (steps 1, 2) to an iterative procedure (steps 4-6). A frame size is selected, expressed as a number of sequence elements. The procedure then compares pairs of frames, one from each sequence, to identify intramolecular and intermolecular regions on the basis of relationships between amino acids according to a predetermined coding scheme. The probability of existence of each region within the coding scheme is then evaluated, and those regions for which the probability is greater than a predetermined threshold are discarded. The procedure outputs the remaining regions.</p> |
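
The frame-comparison procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the coding scheme or the probability model, so both are assumptions here: `related` is a hypothetical scheme that pairs hydrophobic with hydrophilic residues, and the probability of a region is approximated by a binomial tail over the frame. The names `find_regions` and `binomial_tail` are illustrative only, and the sketch covers just the two-sequence (intermolecular) comparison.

```python
from math import comb

# Hypothetical coding scheme (an assumption, not taken from the source):
# two residues are "related" when one is hydrophobic and the other hydrophilic.
HYDROPHOBIC = set("AVLIMFWC")
HYDROPHILIC = set("RKDENQHST")


def related(a: str, b: str) -> bool:
    """Return True if residues a and b are related under the assumed scheme."""
    return (a in HYDROPHOBIC and b in HYDROPHILIC) or (
        a in HYDROPHILIC and b in HYDROPHOBIC
    )


def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p); an assumed stand-in probability model."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def find_regions(seq1: str, seq2: str, frame_size: int, threshold: float):
    """Compare every pair of frames, one from each sequence.

    Returns a list of (start1, start2, related_count, probability) tuples for
    frame pairs whose probability does not exceed the threshold. Frame pairs
    with a probability greater than the threshold are discarded.
    """
    if frame_size <= 0 or frame_size > min(len(seq1), len(seq2)):
        return []

    # Chance that a randomly chosen residue pair is related, estimated
    # from the composition of the two input sequences.
    total_pairs = len(seq1) * len(seq2)
    p = sum(related(a, b) for a in seq1 for b in seq2) / total_pairs

    regions = []
    for i in range(len(seq1) - frame_size + 1):
        frame1 = seq1[i : i + frame_size]
        for j in range(len(seq2) - frame_size + 1):
            frame2 = seq2[j : j + frame_size]
            # Count positions where the aligned residues are related.
            k = sum(related(a, b) for a, b in zip(frame1, frame2))
            prob = binomial_tail(frame_size, k, p)
            if prob <= threshold:
                regions.append((i, j, k, prob))
    return regions


if __name__ == "__main__":
    # Toy sequences chosen for illustration only.
    for region in find_regions("GGGAVLIGGG", "GGGRKDEGGG", 4, 0.05):
        print(region)
```

A real implementation would substitute the coding scheme and probability evaluation actually defined by the method; the double loop over frame positions is the only part taken directly from the description above.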