Title of invention: METHOD FOR THE AUTOMATED ANALYSIS OF TEXT DOCUMENTS
Abstract: The invention relates to the automated analysis of text documents. It can be used to develop new systems, and to improve existing ones, that check text documents for phrases or portions of text taken from other documents. It broadens the range of available technical means by providing a comparatively fast and versatile method for detecting expressions, phrases, or even passages in a document that originate from other documents. The method for the automated analysis of text documents comprises: converting all electronic reference document files into a predetermined format while identifying meaningful fragments, referred to as clauses, in each document; saving the converted electronic reference document files in a database; converting each electronic analysis document file into a predetermined format; detecting matches between clauses identified in an electronic analysis document file and clauses identified in the electronic reference document files; counting the relative number of clauses in the electronic analysis document file that coincide with corresponding clauses in each of the electronic reference document files; and comparing these relative numbers of matches with a predetermined threshold value in order to detect passages from any of the reference documents in the electronic analysis document file.
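The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how clauses are identified, what the predetermined format is, or what the relative number is measured against. The sketch therefore assumes that clauses are delimited by punctuation, that the format is lowercased text with collapsed whitespace, that the relative number is the share of the analysis document's clauses found in a given reference document, and that the database is an in-memory dictionary. All function names and the default threshold of 0.3 are hypothetical.

```python
import re
from typing import Dict, List, Set

# Assumption: clause boundaries are approximated by punctuation and line breaks.
_CLAUSE_BOUNDARY = re.compile(r"[.!?;:,\n]+")


def extract_clauses(text: str) -> List[str]:
    """Convert text to a simple normalized form and split it into clauses."""
    clauses = []
    for raw in _CLAUSE_BOUNDARY.split(text):
        # Assumed "predetermined format": lowercase, single spaces between words.
        normalized = " ".join(raw.lower().split())
        if normalized:
            clauses.append(normalized)
    return clauses


def build_reference_db(references: Dict[str, str]) -> Dict[str, Set[str]]:
    """Store the clause set of every reference document (in-memory stand-in for a database)."""
    return {name: set(extract_clauses(text)) for name, text in references.items()}


def relative_matches(analysis_text: str, db: Dict[str, Set[str]]) -> Dict[str, float]:
    """For each reference document, return the share of analysis clauses that also occur in it."""
    clauses = extract_clauses(analysis_text)
    if not clauses:
        return {name: 0.0 for name in db}
    return {
        name: sum(1 for clause in clauses if clause in ref_clauses) / len(clauses)
        for name, ref_clauses in db.items()
    }


def detect_sources(analysis_text: str, db: Dict[str, Set[str]], threshold: float = 0.3) -> List[str]:
    """Return the reference documents whose relative match count reaches the threshold."""
    scores = relative_matches(analysis_text, db)
    return sorted(name for name, score in scores.items() if score >= threshold)
```

In this sketch, an analysis text of two clauses, one of which appears verbatim in a reference document, scores 0.5 against that document and is flagged at the default threshold. Exact matching on normalized clauses keeps the lookup fast, but it misses reworded passages.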
Publication number: WO2013073999(A8); Publication date: 2014.08.28
Application number: WO2012RU00945; Filing date: 2012.11.16
Applicant: OBSHCHESTVO S OGRANICHENNOY OTVETSTVENNOST'YU "TSENTR INNOVATSIY NATAL'I KASPERSKOY"; Inventors: LAPSHIN, VLADIMIR ANATOL'YEVICH; PSHEKHOTSKAYA, YEKATERINA ALEKSANDROVNA; PEROV, DMITRIY VSEVOLODOVICH
Classification: G06F17/20; Main classification: G06F17/20
Agency: ; Agent:
Main claim:
Address: