Main claim |
1. A system for localizing landmarks on face images, the system comprising:
an input for receiving a face image; an output for presenting landmarks identified by the system; and a plurality of neural network levels coupled in a cascade from the input to the output; wherein each neural network level produces an estimate of landmarks that is more refined than an estimate of landmarks of a previous neural network level, wherein the plurality of neural network levels comprise:
at least three cascaded neural network levels for predicting inner points defining landmarks within a face of the face image, the at least three cascaded neural network levels including the following in order from input to output:
a first bounding box estimator that receives the face image as input and produces a first cropped face image as output, the first cropped face image estimating a location of the face within the face image for purposes of estimating inner points,
a first initial prediction module that receives the first cropped face image as input and produces a first landmarked face image as output, the first landmarked face image containing an initial prediction of inner points within the face image, and
for each of the landmarks to be predicted, a component refinement module that receives the first landmarked face image as input and produces a landmarked component image as output, the landmarked component image containing a refined estimate of inner points defining the landmark, and
two cascaded neural network levels for predicting outer points defining a contour of the face of the face image, the two cascaded neural network levels including the following in order from input to output:
a second bounding box estimator that receives the face image as input and produces a second cropped face image as output, the second cropped face image estimating a location of the face within the face image for purposes of estimating outer points, and
a second initial prediction module that receives the second cropped face image as input and produces a second landmarked face image as output, the second landmarked face image containing a prediction of outer points within the face image. |
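
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: every name here (`LandmarkCascade`, `crop`, `to_image_coords`, the `_stub_*` functions, the landmark names, and the `(x0, y0, x1, y1)` box convention) is hypothetical, and simple arithmetic stubs stand in for the trained neural network levels. A "landmarked image" is modeled, as an assumption, as an image paired with a list of points.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]       # (x, y)
Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1); hypothetical convention
Image = List[List[float]]         # grayscale rows; stand-in for a real image type


def crop(image: Image, box: Box) -> Image:
    """Cut the box region out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def to_image_coords(points: List[Point], box: Box) -> List[Point]:
    """Shift crop-relative points back into full-image coordinates."""
    x0, y0, _, _ = box
    return [(x + x0, y + y0) for x, y in points]


@dataclass
class LandmarkCascade:
    # Each field is a callable standing in for one neural network level.
    inner_box: Callable[[Image], Box]
    inner_init: Callable[[Image], Dict[str, List[Point]]]
    refiners: Dict[str, Callable[[Image, List[Point]], List[Point]]]
    outer_box: Callable[[Image], Box]
    outer_init: Callable[[Image], List[Point]]

    def inner_points(self, image: Image) -> Dict[str, List[Point]]:
        box = self.inner_box(image)                 # level 1: bounding box
        coarse = self.inner_init(crop(image, box))  # level 2: initial prediction
        refined = {}
        for name, pts in coarse.items():            # level 3: one refiner per landmark
            pts_img = to_image_coords(pts, box)
            refined[name] = self.refiners[name](image, pts_img)
        return refined

    def outer_points(self, image: Image) -> List[Point]:
        box = self.outer_box(image)                 # level 1: separate bounding box
        contour = self.outer_init(crop(image, box)) # level 2: contour prediction
        return to_image_coords(contour, box)

    def localize(self, image: Image):
        return self.inner_points(image), self.outer_points(image)


# --- Stubs (assumptions, not trained networks) ---
def _center_box(margin: int) -> Callable[[Image], Box]:
    def estimate(image: Image) -> Box:
        h, w = len(image), len(image[0])
        return (margin, margin, w - margin, h - margin)
    return estimate


def _stub_inner_init(face: Image) -> Dict[str, List[Point]]:
    h, w = len(face), len(face[0])
    return {"left_eye": [(0.25 * w, 0.25 * h)], "mouth": [(0.5 * w, 0.75 * h)]}


def _stub_refine(image: Image, points: List[Point]) -> List[Point]:
    # Placeholder "refinement": nudge each point by half a pixel.
    return [(x + 0.5, y + 0.5) for x, y in points]


def _stub_outer_init(face: Image) -> List[Point]:
    h, w = len(face), len(face[0])
    return [(0.0, 0.5 * h), (0.5 * w, float(h)), (float(w), 0.5 * h)]


image = [[0.0] * 100 for _ in range(100)]
cascade = LandmarkCascade(
    inner_box=_center_box(20),
    inner_init=_stub_inner_init,
    refiners={"left_eye": _stub_refine, "mouth": _stub_refine},
    outer_box=_center_box(10),
    outer_init=_stub_outer_init,
)
inner, contour = cascade.localize(image)
print(inner)
print(contour)
```

The sketch keeps the inner-point and outer-point branches separate, each with its own bounding box estimator, which mirrors the claim's recitation of a first and a second estimator operating on the same input face image.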