Title of invention: Method for the automatic recognition of spoken words (Verfahren zum automatischen Erkennen gesprochener Worte)
Abstract: 981,383. Identifying spoken words. INTERNATIONAL BUSINESS MACHINES CORPORATION. Aug. 28, 1961 [Aug. 29, 1960], No. 30960/61. Heading G4R. In a system for the recognition of spoken words, means are provided to derive an electric signal representing the sound, together with circuits responsive to a number of selected properties of the signal which vary over the duration of the word, and further circuits, controlled by the time of operation of the first circuits, which identify particular characteristics in the sound.
A system arranged to recognise the spoken digits "zero" to "nine" consists of a microphone 20 (Fig. 2), an amplifier 21 and six detector circuits 24-29 to which the signal is applied. The voicing detector 24 responds to an asymmetric characteristic found in the vocal cord sounds of speech. These sounds generally represent the vowel sounds as opposed to the frictional and other consonant sounds. The circuits 25-27 respond to specific vowel characteristics to distinguish particular words. Circuit 25 gives an output when the vowel sound of "one" is present but not when "nine" is present. Circuit 26 responds to the sound "four" but not "three", and circuit 27 distinguishes "two" from "seven" by giving an output only when "seven" is present. Two further circuits 28, 29 respond respectively to strong frictional sounds (such as "s", hard "t" and "x") and weak frictional sounds (such as "f", "v" and soft "t").
The circuits 24-29 are each connected to relays in the "sound increment sequency register" 16. The relay contacts are interconnected as shown in Fig. 3 to obtain further signals: "weak friction early" (k2), "strong friction early" (k3), "voicing and friction" (k4), "weak friction late" (k5) and "strong friction late" (k6). Early and late indicate that the frictional sound comes before or after the voice sound. Contacts of the relays K1-K11 are connected in a network (Fig. 4) to indicate the presence of particular combinations representing the ten digits. "Zero", for example, gives a voicing and friction signal which comes from the "z" sound; relays K1 and K4 then give an output on the "zero" line in Fig. 4. Other digit words are identified in a similar way.
Circuits 24-29: The voicing detector 24 measures the difference between the peak of the positive envelope of the word signals and the peak of the negative envelope. The signals are generally complex waves rather like damped oscillations. The signals are applied to a phase shifting circuit which passes all frequencies of interest. This consists of a transistor 50 having a network consisting of an adjustable resistor 60 and capacitor 61. The output is applied via a transformer 63 to oppositely poled diodes, each having a capacitor 68, 73 and coupled to a junction point 70 through resistors. A voice signal produces an out-of-balance condition between the two capacitors 68, 73 and a corresponding signal output at terminal 70. The "m" and "n" sounds, called "machine vowel sounds", give a balanced signal and no output at terminal 70. Adjustment of the resistor alters the response to different voicing sounds and may be used to distinguish between "three" and "four", the former giving a positive response and the latter a negative one. With another adjustment "one" and "nine" can be distinguished in the same way. By further adjustment a pulse of one polarity may be followed by an opposite pulse in response to particular conditions. These responses can be identified by suitable circuits; for example, a multivibrator can be set by the pulse of the first polarity and its output used to enable, for a predetermined period, a gate for the second pulse.
The circuit 27 distinguishing "two" from "seven" comprises a high pass filter 100 (Fig. 10) and a low pass filter 102, the outputs being applied through oppositely poled diodes to integrating circuits. The outputs are additively combined in resistor 122. The outputs for "two" and "seven" are of opposite polarity.
The circuit 28, shown in Fig. 8, consists of a high pass filter 80 (passing signals over 5000 cycles per second), the output of which is applied through an adjustable resistor 81 and a diode 82 to an integrating capacitor 84. A threshold device may be connected to respond to strong friction signals.
Circuit 29, which detects weak friction sounds, is shown in Fig. 9. The input signals are applied to a high gain clipper amplifier 87 to obtain a series of rectangular pulses which trigger a multivibrator 88 to give a series of short pulses, one for each zero crossing of the input signal. The rectifying and integrating circuit 90, 91, 93, 94 serves to measure the number of zero crossings occurring in a certain time period. An output of a certain value, detected by a threshold device, indicates a weak friction sound.
Double vowel words: The system may be extended to recognise double voice sound words by switching the first part of a word signal to a first register and, after the detection of a machine syllable, switching the second part to a second register. The outputs of the two registers are combined to identify the word. The syllable detector may respond to the occurrence of a second voice sound signal.
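As an illustration only, the principle of the weak friction detector (circuit 29) can be sketched in software: count the zero crossings of the signal in a time period and compare the rate with a threshold. This is a digital analogue written for clarity, not the analogue circuit of the abstract. The function names, the sample rate and the threshold value are hypothetical and do not come from the source.

```python
import math
from typing import Sequence


def zero_crossings(samples: Sequence[float]) -> int:
    """Count sign changes between consecutive samples.

    Plays the role of the clipper and multivibrator stage, which emits
    one short pulse per zero crossing of the input signal.
    """
    count = 0
    prev = samples[0] if samples else 0.0
    for x in samples[1:]:
        # A crossing occurs when the sign differs between neighbours.
        if (prev < 0 <= x) or (prev >= 0 > x):
            count += 1
        prev = x
    return count


def is_weak_friction(samples: Sequence[float],
                     sample_rate: float,
                     threshold_per_second: float) -> bool:
    """Return True when the zero-crossing rate reaches the threshold.

    The threshold is an assumed tuning parameter standing in for the
    threshold device mentioned in the abstract.
    """
    if not samples:
        return False
    duration = len(samples) / sample_rate
    return zero_crossings(samples) / duration >= threshold_per_second


def tone(freq: float, sample_rate: float, n: int) -> list:
    """Hypothetical test signal: a sine wave with a small phase offset."""
    return [math.sin(2 * math.pi * freq * i / sample_rate + 0.1)
            for i in range(n)]


if __name__ == "__main__":
    rate = 8000.0
    low = tone(100.0, rate, 8000)    # slowly varying, vowel-like stand-in
    high = tone(3000.0, rate, 8000)  # rapidly varying, friction-like stand-in
    print(is_weak_friction(low, rate, 2000.0))
    print(is_weak_friction(high, rate, 2000.0))
```

A pure tone is used here only as a stand-in input; real frictional sounds are noise-like, and a suitable threshold would have to be found by experiment.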
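Publication number: DE1422040(A1)  Publication date: 1971.09.30
Application number: DE19611422040  Filing date: 1961.08.28
Applicant: INTERNATIONAL BUSINESS MACHINES CORP.  Inventor: DERSCH, WILLIAM C.
Classification: G10L15/00; G10L15/02  Main classification: G10L15/00
Agency:  Agent:
Main claim:
Address: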