Abstract |
PROBLEM TO BE SOLVED: To construct a discriminator that detects a target, such as a specific object or event, in video with high accuracy by learning from diverse learning data. SOLUTION: A learning device 1 uses text representing the speech in input video data to extract video sections corresponding to a keyword that indicates the detection target of the discriminator to be constructed, or to a keyword related to that keyword, and constructs the discriminator using the extracted video sections as positive-sample learning data. When the constructed discriminator is not sufficiently accurate, the learning device 1 repeats the construction process after adding to and modifying the learning data. To make the added learning data diverse, the learning device 1 takes three kinds of video section from the input video data (randomly selected sections, sections with high audio and visual similarity to the positive samples, and sections detected by an already-trained discriminator for a semantically similar keyword) and mixes them at a predetermined ratio into the current learning data as positive examples. |
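
The mixing step described in the abstract can be illustrated with a minimal sketch. It is not the implementation of the learning device 1. The function name `mix_additional_samples`, the pool names, the 2:5:3 ratio, and the representation of a video section as a `(start, end)` pair in seconds are all assumptions introduced here for illustration. The sketch only shows how a fixed budget of additional sections might be split across the three candidate sources at a predetermined ratio.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: a video section as (start_sec, end_sec).
Section = Tuple[float, float]


def mix_additional_samples(
    pools: Dict[str, Sequence[Section]],
    ratios: Dict[str, float],
    n_total: int,
    seed: int = 0,
) -> List[Section]:
    """Draw about n_total sections, splitting the budget across pools by ratio."""
    rng = random.Random(seed)
    total = sum(ratios.values())
    picked: List[Section] = []
    for name, pool in pools.items():
        # Share of the budget for this source, capped by what the pool holds.
        quota = min(len(pool), round(n_total * ratios[name] / total))
        picked.extend(rng.sample(list(pool), quota))
    return picked


# Illustrative candidate pools (made-up time ranges, not real data).
pools = {
    "random": [(float(i), float(i + 5)) for i in range(0, 100, 5)],
    "similar": [(200.0 + i, 205.0 + i) for i in range(0, 50, 5)],
    "related": [(400.0 + i, 405.0 + i) for i in range(0, 50, 5)],
}
# Assumed "predetermined ratio" of random : similar : related sections.
ratios = {"random": 2, "similar": 5, "related": 3}

extra = mix_additional_samples(pools, ratios, n_total=10)
```

In an iterative setup, `extra` would be appended to the current learning data before the discriminator is rebuilt. How the similarity-based and related-keyword pools are populated is outside the scope of this sketch.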