Title: Method for aligning a text image to a transcription of the image
Abstract: A method for establishing a relationship between a text image and a transcription associated with the text image uses conventional image processing techniques to identify one or more geometric attributes, or image parameters, of each of a sequence of regions of the text image. The transcription labels in the transcription are analyzed to determine a comparable set of parameters for the transcription label sequence. A matching operation then matches the respective parameters of the two sequences to identify image regions that correspond to transcription regions. The result is an output data structure that, at a minimum, identifies image locations of interest to a subsequent operation that processes the text image. The output data structure may also pair each of the image locations of interest with a transcription location, in effect producing a set of labeled image locations. In one embodiment, the sequence of word locations and their observed lengths in the text image is determined. The transcription is analyzed to identify words, and transcription word lengths are computed using an estimated image character width of glyphs in the text image. The sequence of observed image word lengths is then matched to the sequence of computed transcription word lengths using a dynamic programming algorithm that finds a best path through a two-dimensional lattice of nodes and transitions between nodes, where the transitions represent pairs of sequences of zero or more word lengths. The output data structure contains entries, each of which pairs a transcription word with a matching image word location.
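The word-length embodiment described in the abstract can be illustrated with a minimal Python sketch. It is not the patented algorithm: it assumes a simplified lattice with only three transition types (match one image word to one transcription word, skip an image word, skip a transcription word), whereas the abstract allows transitions over pairs of sequences of zero or more word lengths. The function names, the relative-difference cost, and the `skip_penalty` parameter are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple


def transcription_word_lengths(words: List[str], char_width: float) -> List[float]:
    """Estimate each transcription word's image length (assumed model:
    number of characters times an estimated image character width)."""
    return [len(w) * char_width for w in words]


def align_word_lengths(
    image_lengths: List[float],
    transcript_lengths: List[float],
    skip_penalty: float = 1.0,  # hypothetical cost of leaving a word unmatched
) -> List[Tuple[int, int]]:
    """Find a lowest-cost path through an (n+1) x (m+1) lattice and return
    the matched (image_index, transcript_index) pairs along that path."""
    n, m = len(image_lengths), len(transcript_lengths)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back: List[List[Optional[Tuple[int, int]]]] = [
        [None] * (m + 1) for _ in range(n + 1)
    ]
    cost[0][0] = 0.0

    def relax(i: int, j: int, ni: int, nj: int, step: float) -> None:
        # Keep the cheapest way of reaching node (ni, nj).
        candidate = cost[i][j] + step
        if candidate < cost[ni][nj]:
            cost[ni][nj] = candidate
            back[ni][nj] = (i, j)

    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            if i < n and j < m:
                a, b = image_lengths[i], transcript_lengths[j]
                # Relative length difference, guarded against division by zero.
                relax(i, j, i + 1, j + 1, abs(a - b) / max(a, b, 1e-9))
            if i < n:
                relax(i, j, i + 1, j, skip_penalty)  # image word unmatched
            if j < m:
                relax(i, j, i, j + 1, skip_penalty)  # transcription word unmatched

    # Trace the best path back from the final node, keeping diagonal steps.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        if pi == i - 1 and pj == j - 1:
            pairs.append((pi, pj))
        i, j = pi, pj
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    words = ["the", "quick", "fox"]
    expected = transcription_word_lengths(words, char_width=10.0)
    observed = [31.0, 12.0, 49.0, 29.0]  # second region is a spurious blob
    for img_idx, txt_idx in align_word_lengths(observed, expected):
        print(f"image word {img_idx} <-> {words[txt_idx]!r}")
```

In this example the spurious 12-pixel region is skipped and the remaining three image words are paired with "the", "quick", and "fox" in order, which mirrors the output data structure of paired transcription words and image word locations that the abstract describes.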
Publication number: US5689585(A)  Publication date: 1997.11.18
Application number: US19950431004  Filing date: 1995.04.28
Applicant: XEROX CORPORATION  Inventors: BLOOMBERG, DAN S.; NILES, LESLIE T.; KOPEC, GARY E.; CHOU, PHILIP ANDREW
Classification: G06K9/20; G06K9/72; (IPC1-7): G06K9/72  Main classification: G06K9/20
Agency:  Agent:
Principal claim:
Address: