Visual speech detection using facial landmarks, Application No. US201313839655 - 传众 Patent Search

Title of invention	Visual speech detection using facial landmarks
Abstract	A data processing apparatus for detecting a probability of speech based on video data is disclosed. The data processing apparatus may include at least one processor, and a non-transitory computer-readable storage medium including instructions executable by the at least one processor, where execution of the instructions by the at least one processor causes the data processing apparatus to execute a visual speech detector. The visual speech detector may be configured to receive a coordinate-based signal. The coordinate-based signal may represent movement or lack of movement of at least one facial landmark of a person in a video signal. The visual speech detector may be configured to compute a probability of speech of the person based on the coordinate-based signal.
Publication No.	US9190061(B1)	Publication date	2015.11.17
Application No.	US201313839655	Filing date	2013.03.15
Applicant	Google Inc.	Inventor	Shemer Mikhal
Classification	G10L15/25;G10L25/78;G06K9/78	Main classification	G10L15/25
Agency	Brake Hughes Bellermann LLP	Agent	Brake Hughes Bellermann LLP
Main claim	1. A data processing apparatus for detecting a probability of speech based on video data, the data processing apparatus comprising: at least one processor; a non-transitory computer-readable storage medium including instructions executable by the at least one processor, wherein execution of the instructions by the at least one processor causes the data processing apparatus to execute: a visual speech detector configured to receive a coordinate-based signal, the coordinate-based signal representing movement or lack of movement of at least one facial landmark of a person in a video signal; the visual speech detector configured to calculate a short-term value representing short-term characteristics of the coordinate-based signal and a long-term value representing long-term characteristics of the coordinate-based signal, the visual speech detector configured to compute a probability of speech of the person based on a comparison of the short-term value and the long-term value, wherein, when the short-term value is greater than the long-term value, the visual speech detector computes the probability of speech as a value indicating that speech has occurred.
Address	Mountain View CA US
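
The comparison described in claim 1 (a short-term value against a long-term value of a coordinate-based signal) can be illustrated with a minimal sketch. This is not the patented implementation: the record does not say how the two values are computed, so the sketch assumes both are moving averages of absolute frame-to-frame movement. The function name `speech_probabilities`, the window sizes, and the binary 0.0/1.0 output are all hypothetical choices made for illustration.

```python
from collections import deque
from typing import Iterable, List


def speech_probabilities(
    signal: Iterable[float],
    short_window: int = 5,    # assumed short-term window, in frames
    long_window: int = 30,    # assumed long-term window, in frames
) -> List[float]:
    """Return a per-frame speech probability (0.0 or 1.0).

    `signal` is a hypothetical 1-D coordinate-based signal, for example the
    distance between upper-lip and lower-lip landmarks in each video frame.
    """
    short_buf = deque(maxlen=short_window)
    long_buf = deque(maxlen=long_window)
    probs: List[float] = []
    prev = None
    for value in signal:
        # Movement (or lack of movement) of the landmark since the last frame.
        movement = 0.0 if prev is None else abs(value - prev)
        prev = value
        short_buf.append(movement)
        long_buf.append(movement)
        # Assumption: both values are plain moving averages of movement.
        short_term = sum(short_buf) / len(short_buf)
        long_term = sum(long_buf) / len(long_buf)
        # Per the claim: short-term > long-term indicates that speech occurred.
        probs.append(1.0 if short_term > long_term else 0.0)
    return probs


if __name__ == "__main__":
    still = [10.0] * 20                                   # closed, motionless mouth
    talking = [10.0 + 2.0 * (i % 2) for i in range(10)]   # oscillating lip distance
    print(speech_probabilities(still + talking))
```

Under these assumptions a motionless signal never yields a speech indication, because the two averages stay equal, while a sudden burst of lip movement raises the short-term average above the long-term one.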