Title of Invention |
System and Method for Shifting Dates in the De-Identification of Datasets |
Abstract |
A system and method are provided for date shifting with randomized intervals to de-identify a dataset from a source database containing information identifiable to individuals. The dataset is retrieved, comprising a plurality of entries or records containing personal identifying information. Date quasi-identifiers that could potentially be used to identify a patient are identified in the dataset for the entries. Date events in the date quasi-identifiers and connected dates in the dataset are consolidated. The date events are moved relative to an anchor date in a longitudinal sequence of the date events. De-identification of the entries in the dataset, including the date quasi-identifiers, is performed to meet a risk metric defining the risk of re-identifying patients associated with the records. |
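As an illustration of the date-shifting step described in the abstract, the following is a minimal sketch, not the patented method. It assumes one patient's dates are handled at a time, treats the earliest date as the anchor, collapses duplicate dates as a simple stand-in for consolidation, and perturbs each interval by a bounded random number of days. The function name `shift_dates` and the parameter `max_jitter_days` are hypothetical.

```python
import random
from datetime import date, timedelta


def shift_dates(events, max_jitter_days=3, seed=None):
    """Return a mapping {original_date: shifted_date} for one patient.

    Assumptions (not taken from the source record): the earliest date
    acts as the anchor, and every interval between consecutive dates is
    perturbed by a random offset in [-max_jitter_days, +max_jitter_days].
    """
    rng = random.Random(seed)
    # Collapse duplicate dates so same-day events stay on the same day.
    ordered = sorted(set(events))
    if not ordered:
        return {}

    # Move the anchor itself by a random offset.
    anchor = ordered[0]
    offset = rng.randint(-max_jitter_days, max_jitter_days)
    mapping = {anchor: anchor + timedelta(days=offset)}

    # Rebuild the sequence from randomized intervals. Each new interval
    # is at least one day, so the longitudinal order is preserved.
    for prev, cur in zip(ordered, ordered[1:]):
        gap = (cur - prev).days
        new_gap = max(1, gap + rng.randint(-max_jitter_days, max_jitter_days))
        mapping[cur] = mapping[prev] + timedelta(days=new_gap)
    return mapping


if __name__ == "__main__":
    visits = [date(2015, 5, 22), date(2015, 1, 10), date(2015, 1, 12)]
    for original, shifted in sorted(shift_dates(visits, seed=0).items()):
        print(original, "->", shifted)
```

Rebuilding the sequence from intervals, instead of adding an independent offset to each date, is what keeps the shifted events in their original order.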
Publication Number |
US2015339496(A1) |
Publication Date |
2015.11.26 |
Application Number |
US201514720009 |
Filing Date |
2015.05.22 |
Applicant |
University of Ottawa;Privacy Analytics |
Inventor |
EL EMAM Khaled;ARBUCKLE Luk;EZE Ben;GREEN Geoffrey |
Classification |
G06F21/62;G06F17/30 |
Main Classification |
G06F21/62 |
Patent Agency |
|
Patent Agent |
|
Independent Claim |
1. A method of dataset de-identification, the method comprising:
retrieving a dataset having a plurality of entries containing personal identifying information; identifying date quasi-identifiers in the dataset for each of the plurality of entries; performing consolidation of a plurality of date events in the date quasi-identifiers and connected dates in the dataset; performing de-identification of the plurality of entries in the dataset including the date quasi-identifiers; performing risk analysis of the de-identified dataset to determine a risk metric; iteratively performing de-identification of the date quasi-identifiers until a defined risk threshold is met relative to the determined risk metric; and storing the de-identified dataset. |
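The iterate-until-threshold step of the claim can be sketched as a loop that applies progressively stronger de-identification and re-measures risk after each pass. This is a toy sketch under stated assumptions, not the claimed method: the record does not say how risk is computed or how dates are transformed on each iteration. Here risk is assumed to be the reciprocal of the smallest group of records sharing a date value, and each pass coarsens dates from day to month to year. All names (`LEVELS`, `generalize`, `max_risk`, `deidentify_until_threshold`) are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical generalization levels, ordered from finest to coarsest.
LEVELS = ("day", "month", "year")


def generalize(d, level):
    """Coarsen a date to the given level and return it as a string."""
    if level == "day":
        return d.isoformat()
    if level == "month":
        return f"{d.year:04d}-{d.month:02d}"
    return f"{d.year:04d}"


def max_risk(values):
    """Toy risk metric: 1 / size of the smallest group sharing a value."""
    counts = Counter(values)
    return 1.0 / min(counts.values())


def deidentify_until_threshold(dates, threshold):
    """Coarsen dates level by level until the toy risk meets the threshold.

    Returns (level, generalized_values, risk). If even the coarsest level
    does not meet the threshold, that level's result is returned and the
    caller should compare the returned risk with the threshold.
    """
    if not dates:
        raise ValueError("dates must not be empty")
    for level in LEVELS:
        generalized = [generalize(d, level) for d in dates]
        risk = max_risk(generalized)
        if risk <= threshold:
            break
    return level, generalized, risk


if __name__ == "__main__":
    sample = [date(2015, 1, 3), date(2015, 1, 17),
              date(2015, 2, 5), date(2015, 2, 9)]
    print(deidentify_until_threshold(sample, threshold=0.5))
```

With the four sample dates and a threshold of 0.5, the day level leaves every record unique (risk 1.0), so the loop moves on to the month level, where each value is shared by two records (risk 0.5).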
Address |
Ottawa ON CA |