Abstract |
Video material is divided into temporal segments. Each segment is examined to determine whether its soundtrack contains speech sufficient for analysis; if so, metadata are generated based on analysis of the speech. If not, the segment is analysed by comparing its frames with those of stored segments that already have metadata assigned to them. The segment under consideration is then assigned the stored metadata associated with one or more stored segments that are similar.
|
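
The decision flow described in the abstract can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the `Segment` type, the word-count threshold `MIN_WORDS`, the keyword rule in `metadata_from_speech`, the per-frame feature vectors, and the mean-frame distance in `segment_distance` are all hypothetical stand-ins chosen to keep the example self-contained. The abstract does not specify how speech sufficiency or frame similarity is measured.

```python
from dataclasses import dataclass, field
from typing import List

MIN_WORDS = 5  # assumed threshold for "speech sufficient for analysis"


@dataclass
class Segment:
    # One feature vector per frame (hypothetical representation, non-empty).
    frames: List[List[float]]
    # Recognised speech for the segment; empty if none was detected.
    transcript: str = ""
    metadata: List[str] = field(default_factory=list)


def has_sufficient_speech(seg: Segment) -> bool:
    """Assumed test: the transcript holds at least MIN_WORDS words."""
    return len(seg.transcript.split()) >= MIN_WORDS


def metadata_from_speech(seg: Segment) -> List[str]:
    """Assumed keyword rule: unique words longer than four characters."""
    keywords: List[str] = []
    for raw in seg.transcript.split():
        word = raw.strip(".,;:!?").lower()
        if len(word) > 4 and word not in keywords:
            keywords.append(word)
    return keywords


def mean_frame(frames: List[List[float]]) -> List[float]:
    """Average the frame vectors component by component."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def segment_distance(a: Segment, b: Segment) -> float:
    """Assumed similarity measure: mean absolute difference of mean frames."""
    ma, mb = mean_frame(a.frames), mean_frame(b.frames)
    return sum(abs(x - y) for x, y in zip(ma, mb)) / len(ma)


def annotate(seg: Segment, stored: List[Segment],
             max_distance: float = 0.1) -> List[str]:
    """Use speech analysis when possible, otherwise borrow stored metadata."""
    if has_sufficient_speech(seg):
        seg.metadata = metadata_from_speech(seg)
    else:
        merged: List[str] = []
        for other in stored:
            if segment_distance(seg, other) <= max_distance:
                for tag in other.metadata:
                    if tag not in merged:
                        merged.append(tag)
        seg.metadata = merged
    return seg.metadata


stored = [
    Segment(frames=[[0.1, 0.2], [0.1, 0.2]], metadata=["beach"]),
    Segment(frames=[[0.9, 0.9]], metadata=["studio"]),
]
silent = Segment(frames=[[0.12, 0.2]])
spoken = Segment(frames=[[0.5, 0.5]],
                 transcript="The minister announced a new budget today.")
silent_tags = annotate(silent, stored)
spoken_tags = annotate(spoken, stored)
```

In this sketch the silent segment falls back to frame comparison and inherits the metadata of the one stored segment within `max_distance`, while the spoken segment is annotated from its transcript alone and the stored segments are never consulted.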