Title: Personally identifiable information detection
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for privacy protection. In one aspect, a method includes accessing personally identifiable information (PII) type definitions that characterize PII types; identifying PII type information included in content of a web page, the PII type information being information matching at least one PII type definition; identifying secondary information included in the content of the web page, the secondary information being information that is predefined as being associated with PII type information; determining a risk score from the PII type information and the secondary information; and classifying the web page as a personal information exposure risk if the risk score meets a confidentiality threshold, wherein the personal information exposure risk is indicative of the web page including personally identifiable information.
Publication number: US9015802(B1) Publication date: 2015.04.21
Application number: US201314024943 Filing date: 2013.09.12
Applicant: Google Inc. Inventors: Muthusrinivasan Muthuprasanna; Haahr Paul; Cutts Matthew D.
Classification: G06F21/00; G06F21/62; H04L29/06 Main classification: G06F21/00
Agency: Fish & Richardson P.C. Agent: Fish & Richardson P.C.
Independent claim: 1. A method performed by data processing apparatus, the method comprising: accessing, by a data processing apparatus, personally identifiable information (PII) type definitions that characterize PII types; identifying, by the data processing apparatus, PII type information included in content of a web page, the PII type information being information matching at least one PII type definition; identifying a sub-portion of content of the web page, the sub-portion of content being content within a window that includes the PII type information and additional content and excluding other content of the web page; identifying, by the data processing apparatus, secondary information included in the sub-portion of content of the web page, the secondary information being content that matches information that is predefined as being associated with PII type information; determining a risk score from the PII type information and the secondary information; and classifying the web page as a personal information exposure risk if the risk score meets a confidentiality threshold, wherein the personal information exposure risk is indicative of the web page including personally identifiable information.
Address: Mountain View CA US
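
Illustrative sketch: the steps recited in claim 1 (match PII type definitions, take a window of content around each match, look for predefined secondary information inside that window, compute a risk score, compare it to a confidentiality threshold) might look roughly like the following in Python. This is not the patented implementation. The record does not disclose how definitions are expressed, how large the window is, or how the score is computed, so the regular expressions, the term lists, the weights, the 60-character window, and the 0.8 threshold are all assumptions chosen for illustration, as are the names `risk_score` and `is_exposure_risk`.

```python
import re

# Hypothetical PII type definitions: type name -> (pattern, base weight).
# The claim only says definitions "characterize PII types"; regexes are an assumption.
PII_TYPE_DEFINITIONS = {
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 0.6),
    "credit_card": (re.compile(r"\b(?:\d{4}[- ]){3}\d{4}\b"), 0.6),
}

# Hypothetical secondary information predefined as associated with each PII type.
SECONDARY_TERMS = {
    "ssn": ("social security", "ssn", "date of birth"),
    "credit_card": ("expiration", "cvv", "card number"),
}

SECONDARY_WEIGHT = 0.2            # assumed score added per secondary term found
WINDOW_CHARS = 60                 # assumed window size on each side of a match
CONFIDENTIALITY_THRESHOLD = 0.8   # assumed threshold


def risk_score(content: str) -> float:
    """Return the highest score over all PII matches in the page content."""
    best = 0.0
    for pii_type, (pattern, weight) in PII_TYPE_DEFINITIONS.items():
        for match in pattern.finditer(content):
            # Sub-portion of content: a window around the match,
            # excluding the rest of the page.
            start = max(0, match.start() - WINDOW_CHARS)
            end = min(len(content), match.end() + WINDOW_CHARS)
            window = content[start:end].lower()
            # Count secondary terms that appear inside the window only.
            hits = sum(term in window for term in SECONDARY_TERMS[pii_type])
            best = max(best, min(1.0, weight + SECONDARY_WEIGHT * hits))
    return best


def is_exposure_risk(content: str) -> bool:
    """Classify the page as an exposure risk if the score meets the threshold."""
    return risk_score(content) >= CONFIDENTIALITY_THRESHOLD
```

Under these assumed weights, a bare number that merely has the shape of a Social Security number scores below the threshold, while the same number next to terms such as "Social Security" or "SSN" crosses it. Restricting the secondary search to the window means that related terms elsewhere on the page do not raise the score.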