Abstract
In one embodiment, a computer system stores speech data for a plurality of speakers, where the speech data includes a plurality of feature vectors and, for each feature vector, an associated sub-phonetic class. The computer system then builds, based on the speech data, an artificial neural network (ANN) for modeling speech of a target speaker in the plurality of speakers, where the ANN is configured to discriminate between instances of sub-phonetic classes uttered by the target speaker and instances of sub-phonetic classes uttered by other speakers in the plurality of speakers.
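The discrimination objective described above can be illustrated with a minimal sketch of how training targets might be derived from the stored speech data. The abstract does not specify the ANN's output layout, so the layout used here is an assumption: with K sub-phonetic classes, output nodes 0 to K-1 stand for a class uttered by the target speaker and nodes K to 2K-1 stand for the same classes uttered by any other speaker. The record format and the names `Sample`, `build_training_targets`, and `one_hot` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical record: (speaker_id, feature_vector, sub_phonetic_class_index)
Sample = Tuple[str, Sequence[float], int]


def build_training_targets(
    samples: Sequence[Sample],
    target_speaker: str,
    num_classes: int,
) -> List[Tuple[List[float], int]]:
    """Pair each feature vector with an output-node index for the target speaker's ANN.

    Assumed layout (not stated in the abstract):
      nodes 0 .. K-1   -> class k uttered by the target speaker
      nodes K .. 2K-1  -> class k uttered by any other speaker
    """
    pairs: List[Tuple[List[float], int]] = []
    for speaker_id, features, class_idx in samples:
        if not 0 <= class_idx < num_classes:
            raise ValueError(f"class index {class_idx} is outside 0..{num_classes - 1}")
        # Same sub-phonetic class, different node depending on who uttered it.
        offset = 0 if speaker_id == target_speaker else num_classes
        pairs.append((list(features), offset + class_idx))
    return pairs


def one_hot(index: int, size: int) -> List[float]:
    """Expand a node index into a one-hot target vector of the given size."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


# Toy data: two speakers, three sub-phonetic classes, 2-dimensional features.
samples = [
    ("alice", [0.1, 0.2], 0),
    ("bob", [0.3, 0.4], 0),
    ("alice", [0.5, 0.6], 2),
]
targets = build_training_targets(samples, target_speaker="alice", num_classes=3)
```

Under this assumed layout, the same sub-phonetic class maps to different output nodes depending on whether the target speaker or another speaker uttered it, which is what lets a classifier trained on these targets separate the two cases. Any standard multi-class training procedure could consume the resulting pairs.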