Abstract |
<p>Identifying clusters of protein binding sites in a nucleotide sequence under analysis. A computerized system determines likelihood parameters for a plurality of known protein binding sites. The likelihood parameter for each protein binding site represents a likelihood that the protein binding site will occur in a nucleotide sequence under analysis relative to a likelihood that the protein binding site will occur in a random nucleotide sequence of a substantially equivalent composition. Selected protein binding sites are grouped as a function of their respective likelihood parameters to determine a likelihood score, which is compared to a predetermined threshold. The selected protein binding sites in the nucleotide sequence are identified as one or more clusters if the likelihood score exceeds the predetermined threshold.</p> |
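The scoring scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes each known binding site is modeled by a position weight matrix (`pwm`) with no zero probabilities, that the "random sequence of substantially equivalent composition" is approximated by the base frequencies of the sequence under analysis, and that sites are grouped by proximity (`max_gap`) with the likelihood score taken as the sum of per-site log-likelihood ratios. All function and parameter names are hypothetical.

```python
import math
from collections import Counter


def background_frequencies(seq):
    """Base composition of the sequence under analysis (assumed random model)."""
    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def log_likelihood_ratio(window, pwm, background):
    """log P(window | site model) minus log P(window | random, same composition).

    Assumes every pwm entry is > 0 (e.g. pseudocounts were applied).
    """
    return sum(
        math.log(pwm[i][base] / background[base])
        for i, base in enumerate(window)
    )


def scan(seq, pwm, background, min_llr=0.0):
    """Return (start, llr) for every window whose ratio exceeds min_llr."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        llr = log_likelihood_ratio(seq[start:start + width], pwm, background)
        if llr > min_llr:
            hits.append((start, llr))
    return hits


def find_clusters(hits, max_gap, threshold):
    """Group nearby hits; keep groups whose summed score exceeds threshold."""
    clusters = []
    group = []

    def flush():
        # A group is reported as a cluster only if its likelihood score
        # (here: the sum of its members' ratios) exceeds the threshold.
        if group and sum(score for _, score in group) > threshold:
            clusters.append(list(group))

    for pos, llr in sorted(hits):
        if group and pos - group[-1][0] > max_gap:
            flush()
            group.clear()
        group.append((pos, llr))
    flush()
    return clusters
```

In use, `scan` would be run once per known binding site model, the resulting hits pooled, and `find_clusters` applied to the pooled list. The values chosen for `max_gap` and `threshold` are left to the caller, since the abstract does not specify them.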