Title of invention: Re-identification risk in de-identified databases containing personal information
Abstract: A system and method are provided for assessing the re-identification risk of a dataset de-identified from a source database containing information identifiable to individuals. The de-identified dataset, comprising a plurality of records, is retrieved from a storage device. A selection of variables is received from a user, the selection made from a plurality of variables present in the dataset, wherein the variables are potential identifiers of personal information. A selection of a risk threshold acceptable for the dataset is received from a user. A selection of a sampling fraction is received, wherein the sampling fraction defines the size of the dataset relative to an entire population. A number of records from the plurality of records is determined for each equivalence class in the de-identified dataset for each of the selected variables. A re-identification risk is calculated using the selected sampling fraction. It is then determined whether the re-identification risk meets the selected risk threshold.
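The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not state a risk formula, so the sketch assumes that a record's risk is the sampling fraction divided by the size of its equivalence class (that is, the reciprocal of the estimated population class size), and it reports the worst case over all classes. The function names, the sample records, and the variable names `age` and `sex` are hypothetical.

```python
from collections import Counter


def equivalence_class_sizes(records, variables):
    """Count the records that share the same values on the selected variables."""
    return Counter(tuple(record[v] for v in variables) for record in records)


def max_reidentification_risk(records, variables, sampling_fraction):
    """Worst-case risk estimate over all equivalence classes.

    Assumption (not taken from the source): a class of size f in the dataset
    corresponds to about f / sampling_fraction individuals in the population,
    so the risk for a record in that class is sampling_fraction / f.
    """
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling_fraction must be in (0, 1]")
    sizes = equivalence_class_sizes(records, variables)
    if not sizes:
        raise ValueError("dataset contains no records")
    smallest_class = min(sizes.values())
    return min(1.0, sampling_fraction / smallest_class)


def meets_threshold(risk, risk_threshold):
    """True when the estimated risk does not exceed the acceptable threshold."""
    return risk <= risk_threshold


# Hypothetical de-identified records; 'age' and 'sex' are the selected variables.
records = [
    {"age": "30-39", "sex": "F", "region": "K1A"},
    {"age": "30-39", "sex": "F", "region": "K1A"},
    {"age": "40-49", "sex": "M", "region": "K2B"},
    {"age": "40-49", "sex": "M", "region": "K2B"},
    {"age": "40-49", "sex": "M", "region": "K1A"},
]
selected = ["age", "sex"]
risk = max_reidentification_risk(records, selected, sampling_fraction=0.1)
acceptable = meets_threshold(risk, risk_threshold=0.2)
```

Under this assumption, a smaller sampling fraction lowers the estimated risk, because each record in the dataset stands in for more individuals in the population.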
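Publication number: US2010077006(A1) Publication date: 2010.03.25
Application number: US20090564687 Filing date: 2009.09.22
Applicant: UNIVERSITY OF OTTAWA Inventors: EL EMAM KHALED; DANKAR FIDA
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: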