Title of invention: Relabelling of tokenized symbols in fontless structured document image representations
Abstract: A processor is provided with a first set of digital information that includes a first structured representation of a document. From the first set of digital information, the processor produces a second set of digital information that includes a second structured representation of the document. The second structured representation is a lossless representation and includes a set of tokens and a set of positions. At least one token in the set of tokens has an associated semantic label, which may be a character code associated with various font types in the second structured representation of the document. The semantic label may be obtained and stored in the second structured representation of the document by a computer program. The first and second representations may be resolution-dependent structured representations and have, respectively, first and second characteristic resolutions. The first representation, but not the second, is provided in digital form to an untrusted recipient. The recipient requests a search for particular content of the second representation, including semantic labels. A highlighted version of the first representation of the document is then provided to the recipient.
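The token-and-position structure described in the abstract can be illustrated with a minimal sketch. Everything named here (`Token`, `Position`, `TokenizedDocument`, `search`, the box format, the sample data) is a hypothetical illustration and not taken from the patent; the sketch assumes each semantic label is a single character and that positions are stored in reading order. It also omits the step of mapping matches back onto the first representation at its own resolution.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical format


@dataclass(frozen=True)
class Token:
    """One distinct symbol shape; `label` is its optional semantic label."""
    token_id: int
    width: int
    height: int
    label: Optional[str] = None  # e.g. a character code; None if unlabelled


@dataclass(frozen=True)
class Position:
    """One placement of a token on the page (assumed reading order)."""
    token_id: int
    x: int
    y: int


@dataclass
class TokenizedDocument:
    tokens: Dict[int, Token]
    positions: List[Position]

    def _box(self, p: Position) -> Box:
        t = self.tokens[p.token_id]
        return (p.x, p.y, t.width, t.height)

    def search(self, query: str) -> List[List[Box]]:
        """For each run of labels matching `query`, return the boxes to highlight."""
        if not query:
            return []
        labels = [self.tokens[p.token_id].label for p in self.positions]
        n = len(query)
        hits: List[List[Box]] = []
        for start in range(len(labels) - n + 1):
            if all(lab == ch for lab, ch in zip(labels[start:start + n], query)):
                hits.append([self._box(p) for p in self.positions[start:start + n]])
        return hits


# Hypothetical sample: the word "cat" placed twice on one line.
demo = TokenizedDocument(
    tokens={0: Token(0, 8, 12, "c"), 1: Token(1, 8, 12, "a"), 2: Token(2, 8, 12, "t")},
    positions=[Position(0, 0, 0), Position(1, 10, 0), Position(2, 20, 0),
               Position(0, 40, 0), Position(1, 50, 0), Position(2, 60, 0)],
)
matches = demo.search("at")
```

In this sketch, `matches` holds one list of boxes per occurrence of the query. A server holding the second representation could use such boxes to mark up the first representation before sending it to the recipient.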
Publication number: US2001043349(A1) Publication date: 2001.11.22
Application number: US20010884418 Filing date: 2001.06.18
Applicant: XEROX CORPORATION Inventors: BOBROW DANIEL G.;HUTTENLOCHER DANIEL P.;RUCKLIDGE WILLIAM J.;BROWN JOHN SEELY
Classification: G06F3/12;G06F13/00;G06F17/21;G06K15/02;G06T11/00;G06T11/60;(IPC1-7):B41B1/00 Main classification: G06F3/12
Agency: Agent:
Main claim:
Address: