摘要 |
Low-latency gesture detection is described, for example, to compute a gesture class from a live stream of image frames of a user making a gesture, for example, as part of a natural user interface controlling a game system or other system. In examples, machine learning components are trained to learn gesture primitives and at test time, are able to detect gestures using the learned primitives, in a fast, accurate manner. For example, a gesture primitive is a latent (unobserved) variable describing features of a subset of frames from a sequence of frames depicting a gesture. For example, the subset of frames has many fewer frames than a sequence of frames depicting a complete gesture. In various examples gesture primitives are learnt from instance level features computed by aggregating frame level features to capture temporal structure. In examples frame level features comprise body position and body part articulation state features. |