Title of invention METHOD OF IDENTIFYING PERSONAL DATA OF OPEN SOURCES OF UNSTRUCTURED INFORMATION
Abstract FIELD: information technology. SUBSTANCE: personal data are identified using linguistic techniques implemented by a data collection server, a linguistic processing server and an application server. The disclosed method includes creating a task from open-source crawl parameters entered through an administrator's automated workstation. The method further includes: loading texts, either by crawling open sources or by receiving texts from an external system; selecting links from the loaded texts and adding them to the addresses for further crawling; extracting text and converting binary files to text format; breaking down the text prepared for analysis and identifying entities; selecting personal data entities in the text; identifying personal data; and identifying facts of personal data in the text (entities identified at the previous step that are associated with persons). EFFECT: high relevance of results when identifying personal data in open information sources and in text files of the most common formats. 7 cl, 3 dwg
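The sequence of steps in the abstract (select links, break the text down, find personal data entities, associate them with persons as facts) can be illustrated with a minimal sketch. The abstract does not disclose the actual linguistic techniques, so the regular expressions below are simple stand-ins, and every name (`select_links`, `split_sentences`, `identify_facts`, `Fact`) is hypothetical rather than taken from the patent.

```python
import re
from dataclasses import dataclass

# Assumption: simple regular expressions stand in for the patent's
# undisclosed linguistic techniques.
LINK_RE = re.compile(r"https?://[^\s\"'<>]+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")
# Naive person detector: two adjacent capitalized words.
PERSON_RE = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


@dataclass(frozen=True)
class Fact:
    """A personal data entity associated with a person (hypothetical type)."""
    person: str
    kind: str
    value: str


def select_links(text):
    """Select links from loaded text so they can be queued for further crawling."""
    return LINK_RE.findall(text)


def split_sentences(text):
    """Break text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def identify_facts(text):
    """Pair each person with the personal data entities found in the same sentence."""
    facts = []
    for sentence in split_sentences(text):
        persons = PERSON_RE.findall(sentence)
        entities = [("email", m) for m in EMAIL_RE.findall(sentence)]
        entities += [("phone", m) for m in PHONE_RE.findall(sentence)]
        for person in persons:
            for kind, value in entities:
                facts.append(Fact(person, kind, value))
    return facts


if __name__ == "__main__":
    sample = (
        "Our editor Ivan Petrov can be reached at ivan.petrov@example.com "
        "or +7 495 123-45-67. See http://example.com/page for details."
    )
    print(select_links(sample))
    for fact in identify_facts(sample):
        print(fact)
```

Co-occurrence within one sentence is used here only as the simplest possible association rule; a real system along the lines of the abstract would presumably rely on morphological and syntactic analysis instead.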
Publication number RU2549515(C2) Publication date 2015.04.27
Application number RU20130140109 Filing date 2013.08.29
Applicant OBSHCHESTVO S OGRANICHENNOJ OTVETSTVENNOST'JU "MEDIALOGIJA" Inventor KHUSNOJAROV FARIT FARITOVICH
Classification G06F17/30;G06F17/20 Main classification G06F17/30
Agency Agent
Main claim
Address