Independent claim |
1. Method to manage raw genomic data in a privacy preserving manner in a biobank, said raw genomic data comprising a plurality of aligned short reads of a reference DNA sequence, each short read having a position in the reference DNA sequence and comprising at least a plurality of nucleotides, the position and a cigar string, said method comprising an encryption and storage stage, carried out by a certified institution (CI), comprising the steps of:
encrypting, for each short read, the position with an order preserving encryption algorithm,
encrypting, for each short read, the cigar string with a symmetric encryption algorithm,
encrypting the nucleotides with a stream cipher algorithm,
storing all the encrypted data in the biobank together with a patient identification,
the management of the raw genomic data comprising an access stage to the raw genomic data comprising the steps of:
receiving a request by the biobank, from a medical unit (MU), comprising a patient identification and at least one specific range of nucleotides, each range being defined by a lower and an upper bound and comprising at least one short read having a maximum length, each range comprising a first and a second value allowing the range to be determined, the first value being either the encrypted lower bound of the specific range or an encrypted adjusted lower bound defining an adjusted range in which the lower bound is included based on the maximum length of a short read, and the second value being the encrypted upper bound of the specific range, said first and second values having been encrypted by the medical unit (MU) with the order preserving encryption algorithm,
in case that the first value is the encrypted lower bound, determining the encrypted adjusted lower bound in which the encrypted lower bound is included based on the maximum length of a short read,
retrieving by the biobank at least one short read having an encrypted position between the encrypted adjusted lower bound and the encrypted upper bound,
transmitting the at least one short read to a key manager (MK),
decrypting the first and second values by the key manager (MK),
in case that the first value is the encrypted adjusted lower bound, determining by the key manager (MK) the lower bound from the adjusted lower bound and the maximum length of a short read,
transmitting the lower and upper bounds to the biobank by the key manager (MK),
masking, by the biobank, the nucleotides of the retrieved short read outside the range defined by the lower and upper bounds,
providing the selectively masked short read for further analysis to the medical unit (MU). |
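
The access stage recited above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the claim: the names `toy_ope`, `StoredRead`, `serve_request` and `MAX_READ_LEN` are hypothetical; `toy_ope` is an insecure linear map standing in for a real order preserving encryption algorithm; the adjusted lower bound is assumed to be `lower - MAX_READ_LEN + 1`; the cigar string and the stream cipher are omitted, so nucleotides are held in clear and every read is treated as a gap-free match.

```python
from dataclasses import dataclass

MAX_READ_LEN = 5   # hypothetical maximum short-read length
TOY_KEY = (7, 13)  # hypothetical secret for the toy map below


def toy_ope(x: int) -> int:
    """Toy order-preserving map (NOT secure): strictly increasing in x."""
    a, b = TOY_KEY
    return a * x + b


def toy_ope_decrypt(c: int) -> int:
    """Inverse of toy_ope; stands in for decryption by the key manager."""
    a, b = TOY_KEY
    return (c - b) // a


@dataclass
class StoredRead:
    enc_pos: int       # order-preserving ciphertext of the start position
    nucleotides: str   # stream-cipher encrypted in the claim; clear here


def mask_read(pos: int, nucleotides: str, lower: int, upper: int) -> str:
    """Replace every nucleotide outside [lower, upper] with 'N'."""
    return "".join(
        n if lower <= pos + i <= upper else "N"
        for i, n in enumerate(nucleotides)
    )


def serve_request(biobank: list, lower: int, upper: int) -> list:
    # Medical unit: encrypt the adjusted lower bound and the upper bound.
    # Assumption: a read of length <= MAX_READ_LEN can overlap `lower`
    # only if it starts at or after lower - MAX_READ_LEN + 1.
    enc_adj_lower = toy_ope(lower - MAX_READ_LEN + 1)
    enc_upper = toy_ope(upper)

    # Biobank: select reads by comparing ciphertexts only.
    hits = [r for r in biobank if enc_adj_lower <= r.enc_pos <= enc_upper]

    # Key manager: decrypt both values and recover the true lower bound.
    true_lower = toy_ope_decrypt(enc_adj_lower) + MAX_READ_LEN - 1
    true_upper = toy_ope_decrypt(enc_upper)

    # Masking. Simplification: the read position is decrypted here, which
    # the claim does not state; it is done only to keep the sketch short.
    return [
        mask_read(toy_ope_decrypt(r.enc_pos), r.nucleotides,
                  true_lower, true_upper)
        for r in hits
    ]


# Four reads starting at reference positions 0, 3, 8 and 14.
BIOBANK = [
    StoredRead(toy_ope(0), "ACGTA"),
    StoredRead(toy_ope(3), "TACGG"),
    StoredRead(toy_ope(8), "CCATG"),
    StoredRead(toy_ope(14), "GGGTT"),
]

print(serve_request(BIOBANK, lower=6, upper=9))  # ['NNNGG', 'CCNNN']
```

In this example the requested range is positions 6 to 9. The read starting at position 3 begins before the range but overlaps it, which is why retrieval uses the adjusted lower bound (here 2) rather than the lower bound itself; masking then hides the nucleotides at positions 3 to 5 and 10 to 12.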