Principal claim |
1. A method of neighbor embedding comprising the steps of:
(a) creating a training set of pairs, each having a first element, FLn,m, and a second element, H*n,m;
(b) creating a set of randomized k-d trees from said training set by using said first elements, FLn,m, as leaf nodes;
(c) receiving at least one low resolution text image, P;
(d) generating a high resolution image, T, from said at least one low resolution text image, P, using said training set and the result of step (b); and
(e) generating a text file by inputting said high resolution image, T, into an OCR engine;
wherein the step of creating a training set comprises:
a. receiving a user-defined set of text documents, Sn;
b. generating, for each Sn, a high resolution image, Hn;
c. generating, for each Sn, a low resolution image, Ln;
d. selecting a user-definable patch dimension, K, where K is a user-definable small positive integer;
e. partitioning said low resolution image, Ln, into a set of K×K patches, Ln,m, each having K×K pixel values;
f. setting J=NK, where N is a user-definable small positive integer;
g. partitioning said high resolution image, Hn, into a set of J×J patches, Hn,m, each having J×J pixel values, such that said K×K patch Ln,m and said J×J patch Hn,m both correspond to a same region in Sn;
h. generating a feature vector, FLn,m, having K×K components, for each said K×K patch, Ln,m; and
i. generating a J×J high resolution patch, H*n,m, for each said J×J patch, Hn,m, by subtracting a mean pixel value of said corresponding K×K patch, Ln,m, from each of said number of pixel values in said J×J patch, Hn,m. |
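For illustration only, sub-steps e through i of the training-set creation can be sketched in plain Python as follows. The claim does not say how the feature vector FLn,m is computed from a K×K patch, so the sketch assumes it is the flattened patch with its mean pixel value subtracted. The names `extract_patch` and `make_training_pairs` are hypothetical, the patches are assumed to be non-overlapping, and images are assumed to be row-major lists of grayscale values with Hn exactly N times the size of Ln.

```python
from typing import List, Tuple

Image = List[List[float]]  # row-major grayscale image (assumed representation)


def extract_patch(img: Image, top: int, left: int, size: int) -> Image:
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def make_training_pairs(low: Image, high: Image, K: int, N: int
                        ) -> List[Tuple[List[float], Image]]:
    """Build (FLn,m, H*n,m) pairs from one low/high resolution image pair."""
    J = N * K  # sub-step f: J = NK
    if len(high) != N * len(low) or len(high[0]) != N * len(low[0]):
        raise ValueError("high resolution image must be N times the low one")

    pairs = []
    for r in range(len(low) // K):          # non-overlapping tiling (assumption)
        for c in range(len(low[0]) // K):
            # sub-steps e and g: patches covering the same region of Sn
            low_patch = extract_patch(low, r * K, c * K, K)
            high_patch = extract_patch(high, r * J, c * J, J)

            flat = [v for row in low_patch for v in row]
            mean = sum(flat) / len(flat)

            # sub-step h: K*K-component feature vector.
            # ASSUMPTION: mean-subtracted pixels; the claim leaves this open.
            feature = [v - mean for v in flat]

            # sub-step i: subtract the LOW resolution patch mean from every
            # pixel of the J x J high resolution patch
            residual = [[v - mean for v in row] for row in high_patch]

            pairs.append((feature, residual))
    return pairs


# Usage: one 2x2 low resolution patch (mean 3) and its 4x4 counterpart.
low_img = [[0.0, 2.0], [4.0, 6.0]]
high_img = [[10.0] * 4 for _ in range(4)]
pairs = make_training_pairs(low_img, high_img, K=2, N=2)
```

Storing H*n,m as a residual against the low resolution mean suggests that, at reconstruction time, the mean of the matching input patch would be added back; the claim text shown here does not spell that step out.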