Title of invention
Abstract 1,255,834. Speech recognition. STANDARD TELEPHONES & CABLES Ltd. 30 April, 1969, No. 21952/69. Heading G4R. In a speech recognition system, a signal representing an acoustic input is decomposed into analogue feature signals on parallel channels. These are transformed into digital time-ordered event markers (specifying feature occurrences) on parallel channels, and further such markers (specifying occurrences of sequences of the feature occurrence markers) are generated. Binary information representing both the features and the sequences of features is stored in a fixed predetermined sequence. The amplified speech waveform, after logarithmic compression 102, is fed to high-pass filter 103, low-pass filter 104 and total energy detector 105, each including rectification and smoothing. Balance circuit 106 raises an output if the ratio of high to low frequency energy exceeds a first threshold and drops it if the ratio falls below a second (lower) threshold, has a second output that behaves likewise for the inverse ratio, and has a "both" output raised when the high and low frequency energy contents are in balance. The total energy detector 105 raises an output if the total energy exceeds a first threshold, and drops it if the total energy falls below a second (lower) threshold. The four outputs mentioned feed respective feature time-continuity filters FTCF 1, 2, 3, 4, the "both" output from 106 being inhibited at NOR 3 by "silence" from 105. Each FTCF raises its output when its input has been up for a predetermined time and drops it when its input has been down for a predetermined time, and includes a delay so that the FTCFs present their outputs simultaneously.
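The two-threshold detectors and the time-continuity filters described above can be illustrated in discrete time. The following is a minimal sketch, not the patented analogue circuitry: the function names `hysteresis` and `time_continuity_filter`, the sampled representation, and the parameters `upper`, `lower`, `rise_hold` and `fall_hold` are all assumptions made for illustration, and the delay that lets the FTCFs present their outputs simultaneously is omitted.

```python
from typing import Iterable, List


def hysteresis(values: Iterable[float], upper: float, lower: float) -> List[bool]:
    """Two-threshold detector: raise above `upper`, drop below `lower`, else hold.

    Hypothetical discrete-time stand-in for the balance and total-energy outputs.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    state = False
    out = []
    for v in values:
        if v > upper:
            state = True
        elif v < lower:
            state = False
        # Between the two thresholds the previous state is held.
        out.append(state)
    return out


def time_continuity_filter(bits: Iterable[bool], rise_hold: int, fall_hold: int) -> List[bool]:
    """Change the output only after the input has disagreed with it long enough.

    `rise_hold` / `fall_hold` are assumed sample counts standing in for the
    "predetermined times" in the abstract.
    """
    state = False
    run = 0  # consecutive samples on which the input differs from the output
    out = []
    for raw in bits:
        b = bool(raw)
        if b != state:
            run += 1
            needed = rise_hold if b else fall_hold
            if run >= needed:
                state = b
                run = 0
        else:
            run = 0
        out.append(state)
    return out


if __name__ == "__main__":
    energy = [0.1, 0.6, 0.4, 0.2, 0.4]
    print(hysteresis(energy, upper=0.5, lower=0.3))
    raw = [True, False, True, True, True, False, False, False]
    print(time_continuity_filter(raw, rise_hold=2, fall_hold=3))
```

In this sketch a single isolated high sample is ignored by the filter, which is the intended effect of requiring the input to persist before the output follows it.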
The duration-significant pulses from the FTCFs go to respective ternary event detectors TED 1, 2, 3, 4, each of which has three outputs which respectively deliver a pulse (a) at the start of the input pulse, (b) if the input pulse lasts less than a predetermined time, (c) if the input pulse lasts longer than the predetermined time, both of the last two outputs delivering a pulse if the input pulse duration is close to the predetermined time. Each of a collection of elementary sequence elements (not shown) receives two inputs from the TEDs (or from other elements) and produces a pulse on a first output if the first input is pulsed before the second, or on a second output if the second input is pulsed before the first. The element remembers which input (if any) was pulsed last, so as to produce the appropriate output if the other input is pulsed next, but pulses can be ORed into a "sequence-breaker" input to reset the element and break the sequence. Outputs of the elementary sequence elements are stored to control recognition logic via a plugboard. A controller (not shown) responds to gates 10, 11, 12, fed from TED 4, as follows. A "beginning of gap" signal from gate 10, not followed within a predetermined time by a signal from gate 11, causes degating in Fig. 4 by a "freeze level" and enabling of system output. However, the latter is prevented if a "start" bi-stable was not set by an "end of long gap" signal (normally produced at word start) from gate 12, and an error indication is given instead (indicating too much noise preceding the word, or that the speaker started speaking before the system "unfroze" from the last operation). In any event the system is subsequently reset automatically (or this may be done manually). Lamps are used for display at a number of points. The speech input may be accepted, rejected or a repeat requested, on the basis of a likelihood assessment of the features fed to the plugboard.
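The ternary event detector and the elementary sequence element described above can be sketched as follows. This is an illustrative model under stated assumptions, not the circuit of the patent: the names `ternary_events` and `SequenceElement`, the sample-based durations, the `tolerance` parameter used to model "close to the predetermined time", and the choice to emit the short/long classification at the end of the input pulse are all hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def ternary_events(bits: Iterable[bool], threshold: int, tolerance: int = 0) -> List[Tuple[int, str]]:
    """Return (sample_index, kind) events with kind in {"start", "short", "long"}.

    Assumption: "short"/"long" are emitted when the input pulse ends, and both
    are emitted when the duration is within `tolerance` of `threshold`.
    """
    events: List[Tuple[int, str]] = []
    prev = False
    start = 0
    # A trailing False sentinel closes any pulse still high at the end.
    for i, raw in enumerate(list(bits) + [False]):
        b = bool(raw)
        if b and not prev:
            start = i
            events.append((i, "start"))
        elif prev and not b:
            duration = i - start
            near = abs(duration - threshold) <= tolerance
            if duration < threshold or near:
                events.append((i, "short"))
            if duration > threshold or near:
                events.append((i, "long"))
        prev = b
    return events


class SequenceElement:
    """Two-input order detector with a sequence-breaker reset (hypothetical model)."""

    def __init__(self) -> None:
        self.last: Optional[str] = None  # which input was pulsed last, if any

    def pulse(self, which: str) -> Optional[str]:
        """Feed a pulse on input "a", input "b", or the "break" input."""
        if which == "break":
            self.last = None
            return None
        if which not in ("a", "b"):
            raise ValueError("input must be 'a', 'b' or 'break'")
        out = None
        if which == "b" and self.last == "a":
            out = "a_then_b"  # first output: first input pulsed before second
        elif which == "a" and self.last == "b":
            out = "b_then_a"  # second output: second input pulsed before first
        self.last = which
        return out


if __name__ == "__main__":
    pulses = [0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1]
    print(ternary_events(pulses, threshold=3))
    element = SequenceElement()
    print([element.pulse(x) for x in ["a", "b", "a", "break", "b", "a"]])
```

In the second printed list, the pulse on "b" that follows the break produces no output, because the reset has discarded the memory of the earlier "a".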
Publication No. FR2047104(A5) Publication date 1971.03.12
Application No. FR19700015630 Filing date 1970.04.29
Applicant INTER STANDARD ELECTRIC Inventor
Classification G06F3/16;G10L15/00;G10L15/28;(IPC1-7):G10L1/00 Main classification G06F3/16
Agency Agent
Principal claim
Address