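Title of invention Method and arrangement for statistical error reduction in character recognition devices (Verfahren und Anordnung zur statistischen Fehlerreduzierung in Zeichenerkennungsvorrichtungen)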
Abstract <p>1,236,455. Character recognition. INTERNATIONAL BUSINESS MACHINES CORP. 30 Aug., 1968 [8 Sept., 1967], No. 41418/68, Heading G4R. Word classifying apparatus for use with a character reader comprises: an error word generator for generating, from an input word, the error words into which the reader might change the input word when reading, together with the probability of each change; a ratio calculator for calculating the ratio of the frequency of usage of an error word as a legitimate word to the probability of an input word being changed into it; and classifying means for classifying each input word and error word in accordance with the output of the ratio calculator. A confusion pair file 12 holds a series of pairs of letters which a character reader is likely to confuse, i.e. to recognize the first of the pair as the second, each pair being accompanied by the probability P of the confusion. Each name in a file 10 of common names is taken in turn, and each of its letters in turn is compared against the first letter of each pair from file 12. On equality, an error name is generated from the common name by replacing the matching letter with the second letter of the pair. The error name is compared at 16 with the names in a file 18 to determine whether it is a legitimate name in its own right; if it is, a ratio calculator 20 calculates the ratio N_L/N_E, where N_L, read from file 18, is the number of occurrences of the error name as a legitimate name in a population, and N_E is the number of times the error name would be produced in mistake for the common name. N_E is obtained by multiplying the probability P of letter confusion, from file 12, by the number N_C of occurrences of the common name in the population, from file 10.
In order to use the above results to replace some names from a character reader by statistically more likely names before feeding them to an output, and to mark all output names either "accept" or "reject", the error names are sent to a file 26 via a register 24, each error name being followed by the corresponding common name from file 10 if replacement of the former by the latter will be required. Each common name from file 10 is also sent. The names are accompanied by "replace" and "accept/reject" tag bits set by a classifier 22 under control of the ratio etc. to indicate: (a) where a common name has a corresponding error name that is not a legitimate name in its own right, that the error name is to be replaced by the common name and the output marked "accept"; (b) where a common name has a corresponding error name that is a legitimate name in its own right, that the error name is to be replaced by the common name and the output marked "accept" if the ratio is less than or equal to 0.05, replaced by the common name and the output marked "reject" if the ratio is greater than 0.05 but less than or equal to 1, the output marked "reject" (without replacement) if the ratio is over 1 and less than 20, and the output marked "accept" (without replacement) if the ratio is greater than or equal to 20; (c) that a common name is to be marked "accept" at the output. A name from a character reader is compared at 30 with each name from file 26 in turn (excluding those included in the file only as names to be changed to) until equality, when the name in file 26 giving the equality, or the file name following it (according to the former name's "replace" tag bit), is passed to an output register 36 together with its "accept/reject" tag bit. If no name in file 26 matches the name from the character reader, the latter name is passed to register 36 and marked "accept" or "reject" according to whether or not it equals a name in an uncommon name file 38, as determined by comparisons at 30.</p>
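The procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented hardware: the names `Entry`, `classify`, `build_table` and `read_name` are hypothetical, dictionaries stand in for files 10, 12, 18, 26 and 38, and storing the replacement name inside each entry replaces the "following name in file 26" arrangement. The abstract does not say what happens when an error name coincides with a common name or is generated from two different common names; the sketch assumes the first entry written wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    output: str   # name to pass to the output register
    accept: bool  # accept/reject tag bit


def classify(ratio):
    """Return (replace, accept) for an error name that is itself legitimate."""
    if ratio <= 0.05:
        return True, True    # replace by common name, mark "accept"
    if ratio <= 1:
        return True, False   # replace by common name, mark "reject"
    if ratio < 20:
        return False, False  # keep error name, mark "reject"
    return False, True       # keep error name, mark "accept"


def build_table(common, legit, confusions):
    """Build the lookup table (file 26 in the abstract).

    common:     {common name: N_C}              (file 10)
    legit:      {legitimate name: N_L}          (file 18)
    confusions: {(read_as_first, second): P}    (file 12), P > 0 assumed
    """
    # Case (c): every common name is output unchanged and marked "accept".
    table = {name: Entry(name, True) for name in common}
    for name, n_c in common.items():
        for i, letter in enumerate(name):
            for (first, second), p in confusions.items():
                if letter != first:
                    continue
                err = name[:i] + second + name[i + 1:]
                if err in legit:
                    # Case (b): ratio N_L / N_E with N_E = P * N_C.
                    replace, accept = classify(legit[err] / (p * n_c))
                else:
                    # Case (a): not a legitimate name, always replace.
                    replace, accept = True, True
                # Assumption: an existing entry is never overwritten.
                table.setdefault(err, Entry(name if replace else err, accept))
    return table


def read_name(name, table, uncommon):
    """Return (output name, accept flag) for a name from the character reader."""
    entry = table.get(name)
    if entry is not None:
        return entry.output, entry.accept
    # No match in the table: accept only if the name is in the uncommon-name file 38.
    return name, name in uncommon
```

With made-up counts such as `common = {"CARL": 1000}`, `legit = {"EARL": 500}` and `confusions = {("C", "E"): 0.01}`, the error name EARL has N_E = 0.01 × 1000 = 10 and a ratio of 50, so it is kept as EARL and marked "accept" rather than replaced by CARL.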
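Publication number DE1774782(A1) Publication date 1972.01.20
Application number DE19681774782 Filing date 1968.09.05
Applicant INTERNATIONAL BUSINESS MACHINES CORP. Inventors FOSDICK, THERON; HAMBURGEN, ARTHUR; BAILY HENNIS, ROBERT
Classification G06K9/72 Main classification G06K9/72
Agency Agent
Principal claim
Address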