Title of invention Method and apparatus for detecting sensitive content in a document
Abstract One embodiment of the present invention provides a system that detects sensitive content in a document. In doing so, the system receives a document, identifies a set of terms in the document that are candidate sensitive terms, and generates, from the identified terms, a combination of terms that is associated with a semantic meaning. Next, the system performs searches through a corpus based on the combination of terms and determines the hit counts returned for each term in the combination and for the combination as a whole. The system then determines whether the combination of terms is sensitive based on the hit count for the combination and the hit counts for the individual terms in the combination, and generates a result that indicates which portions of the document contain sensitive combinations.
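The hit-count comparison described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the patent: the in-memory `corpus` list stands in for a searchable corpus, `hit_count` and `sensitive_combinations` are hypothetical helper names, and the decision rule (the ratio of the combination's hits to the hits of its rarest term, compared against a `threshold`) is only one plausible way to combine the two kinds of hit counts. The abstract does not state the actual rule.

```python
from itertools import combinations


def hit_count(corpus, terms):
    """Count corpus documents that contain every term.

    Hypothetical stand-in for the hit count a search engine would return.
    """
    wanted = {t.lower() for t in terms}
    return sum(1 for doc in corpus if wanted <= set(doc.lower().split()))


def sensitive_combinations(candidates, corpus, size=2, threshold=0.6):
    """Flag term combinations whose joint hits are high relative to single-term hits.

    Assumed rule (not from the patent): flag a combination when
    combination_hits / min(individual_hits) >= threshold.
    """
    flagged = []
    for combo in combinations(sorted(candidates), size):
        combo_hits = hit_count(corpus, combo)
        term_hits = [hit_count(corpus, [t]) for t in combo]
        rarest = min(term_hits)
        if rarest == 0:
            continue  # a term that never appears gives no usable ratio
        ratio = combo_hits / rarest
        if ratio >= threshold:
            flagged.append((combo, combo_hits, term_hits, ratio))
    return flagged
```

A caller would pass the candidate sensitive terms extracted from the document together with the corpus to search. Each returned tuple carries the combination, its hit count, the per-term hit counts, and the ratio, so the result can be mapped back to the portions of the document that contain the flagged terms.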
Publication number US8271483(B2) Publication date 2012.09.18
Application number US20080208091 Filing date 2008.09.10
Applicant STADDON JESSICA N.;CHOW RICHARD;DE PAIVA VALERIA;GOLLE PHILIPPE J. P.;FANG JI;KING TRACY HOLLOWAY;PALO ALTO RESEARCH CENTER INCORPORATED Inventor STADDON JESSICA N.;CHOW RICHARD;DE PAIVA VALERIA;GOLLE PHILIPPE J. P.;FANG JI;KING TRACY HOLLOWAY
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim
Address