Title of invention: Method and apparatus for removing redundant information from digital documents
Abstract: A method and apparatus for constructing new documents from a group of existing ones by removing redundant information (images, text paragraphs) from retrieved multimedia documents. Each document consists of two main parts stored in different databases: the first part contains the text paragraphs, and the second part contains the images and drawings related to those paragraphs. An information reduction methodology first examines the text paragraphs of each document related to a specific topic and removes redundant information, such as identical or similar paragraphs, while keeping pointers useful for a future reconstruction of the original documents. The remaining text paragraphs and the set of pointers are used to compose the first version of a new document. The invention also examines all the images related to the set of original documents and removes identical or similar images, again keeping pointers that could assist a future reconstruction of the original documents. Finally, the invention merges the text paragraphs and images to create the first-stage new document.
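The text-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the similarity measure (Python's `difflib.SequenceMatcher`), the `threshold` value of 0.9, and the names `reduce_paragraphs` and `reconstruct` are all assumptions introduced here for illustration only. The sketch keeps one copy of each identical or near-identical paragraph and records a pointer from every original paragraph position to the copy that was kept.

```python
from difflib import SequenceMatcher


def reduce_paragraphs(documents, threshold=0.9):
    """Remove identical or similar paragraphs across documents.

    documents: dict mapping a document id to its list of paragraph strings.
    threshold: assumed similarity cutoff (hypothetical, not from the patent).
    Returns (kept, pointers), where kept is the list of retained paragraphs
    and pointers maps (doc_id, position) to an index into kept.
    """
    kept = []
    pointers = {}
    for doc_id, paragraphs in documents.items():
        for pos, text in enumerate(paragraphs):
            match = None
            # Look for an already-kept paragraph that is similar enough.
            for i, candidate in enumerate(kept):
                if SequenceMatcher(None, text, candidate).ratio() >= threshold:
                    match = i
                    break
            if match is None:
                kept.append(text)
                match = len(kept) - 1
            # The pointer allows a later reconstruction of the original layout.
            pointers[(doc_id, pos)] = match
    return kept, pointers


def reconstruct(doc_id, kept, pointers):
    """Rebuild one document's paragraph sequence from the kept paragraphs."""
    positions = sorted(pos for (d, pos) in pointers if d == doc_id)
    return [kept[pointers[(doc_id, pos)]] for pos in positions]


if __name__ == "__main__":
    docs = {
        "a": ["The cat sat on the mat.", "Dogs bark loudly."],
        "b": ["The cat sat on the mat.", "Birds fly south in winter."],
    }
    kept, pointers = reduce_paragraphs(docs)
    print(len(kept), "paragraphs kept")
    print(reconstruct("b", kept, pointers))
```

Note that in this sketch a paragraph that was merely similar to a kept one is reconstructed as the kept representative, not as its exact original wording. The same pointer scheme could in principle be applied to images, given some image similarity measure, which the abstract does not specify.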
Publication number: US7017113(B2) Publication date: 2006.03.21
Application number: US20020314189 Filing date: 2002.12.05
Applicant: THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY OF THE AIR FORCE Inventors: BOURBAKIS NICHOLAS G.; BOREK STANLEY E.
Classification: G06F17/00; G06F17/27; G06F17/30 Main classification: G06F17/00
Agency: Agent:
Main claim:
Address: