Abstract |
Using a whole byte to represent a monomer in a biological sequence is not the most efficient means of permanent storage. The invention relates to the compression of biological sequence data for electronic storage by utilising a sub-byte datatype for the storage or manipulation of biological sequence data in a programming language or a database. For nucleotide sequences, for example, each monomer can be represented with 2 bits, since 2 bits are enough to distinguish the four standard bases. |
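
As an illustration of the 2-bit idea for nucleotide sequences, the following is a minimal sketch, not the implementation claimed by the invention. It assumes an alphabet restricted to A, C, G and T, and the mapping A=0, C=1, G=2, T=3, the names `pack` and `unpack`, and the choice to put the first base in the high bits of each byte are all hypothetical choices made for this example.

```python
# Hypothetical 2-bit code table; this particular mapping is an assumption.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a nucleotide string into bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)  # round up to whole bytes
    for i, base in enumerate(seq):
        # Shift each 2-bit code into its slot; the first base of every
        # group of four occupies the two highest bits of the byte.
        out[i // 4] |= CODE[base] << (6 - 2 * (i % 4))
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    return "".join(
        BASES[(data[i // 4] >> (6 - 2 * (i % 4))) & 0b11]
        for i in range(length)
    )


packed = pack("GATTACA")
restored = unpack(packed, 7)
```

Here seven bases occupy two bytes instead of seven. Because the last byte may be padded with zero bits, the sequence length has to be stored alongside the packed data, and a scheme like this would need extra handling for ambiguity codes such as N.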