Title of invention |
Text detection in video |
Abstract |
Techniques of detecting text in video are disclosed. In some embodiments, a portion of video content can be identified as having text. Text within the identified portion of the video content can be identified. A category for the identified text can be determined. In some embodiments, a determination is made as to whether the video content satisfies at least one predetermined condition, and the portion of video content is identified as having text in response to a determination that the video content satisfies the predetermined condition(s). In some embodiments, the predetermined condition(s) comprises at least one of a minimum level of clarity, a minimum level of contrast, and a minimum level of content stability across multiple frames. In some embodiments, additional information corresponding to the video content is determined based on the identified text and the determined category. |
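The abstract names three predetermined conditions (clarity, contrast, and content stability across frames) without defining how they are measured. The following is a minimal sketch under assumed definitions: contrast as the luma range, clarity as the mean absolute difference between horizontally adjacent pixels, and stability as the mean absolute difference between consecutive frames. All function names and threshold values are hypothetical, and the sketch requires all three conditions, whereas the abstract requires only at least one.

```python
def contrast(gray):
    """Luma range of a grayscale frame (assumed contrast measure)."""
    values = [v for row in gray for v in row]
    return max(values) - min(values)


def sharpness(gray):
    """Mean absolute difference between horizontally adjacent pixels
    (assumed stand-in for the 'clarity' condition)."""
    diffs = [abs(row[x + 1] - row[x]) for row in gray for x in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two same-sized frames."""
    total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def satisfies_conditions(frames, min_contrast=50, min_sharpness=5, max_frame_diff=4):
    """Return True if every frame is sharp and contrasty enough and
    consecutive frames differ by at most max_frame_diff on average.
    Thresholds are illustrative, not taken from the patent."""
    if not frames:
        return False
    stable = all(mean_abs_diff(a, b) <= max_frame_diff
                 for a, b in zip(frames, frames[1:]))
    return stable and all(
        contrast(f) >= min_contrast and sharpness(f) >= min_sharpness
        for f in frames
    )
```

Under these assumptions, a caller would run the text-identification steps only on frame windows for which `satisfies_conditions` returns `True`.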
Publication number |
US9036083(B1) |
Publication date |
2015.05.19 |
Application number |
US201414289142 |
Filing date |
2014.05.28 |
Applicant |
Gracenote, Inc. |
Inventors |
Zhu Irene;Harron Wilson;Cremer Markus K. |
Classification numbers |
H04N7/00;H04N11/00;G06K9/72;H04N5/445 |
Main classification number |
H04N7/00 |
Agency |
Schwegman Lundberg & Woessner, P.A. |
Agent |
Schwegman Lundberg & Woessner, P.A. |
Principal claim |
1. A computer-implemented method comprising:
identifying, by a machine having a memory and at least one processor, a portion of video content as having text, the identifying comprising:
converting a frame of the video content to grayscale;
performing edge detection on the frame;
performing dilation on the frame to connect vertical edges within the frame;
binarizing the frame;
performing a connected component analysis on the frame to detect connected components within the frame;
merging the connected components into a plurality of text lines;
refining the plurality of text lines using horizontal and vertical projections;
filtering out at least one of the plurality of text lines based on a size of the at least one of the plurality of text lines to form a filtered set of text lines;
binarizing the filtered set of text lines; and
filtering out at least one of the text lines from the binarized filtered set of text lines based on at least one of a shape of components in the at least one of the text lines and a position of components in the at least one of the text lines to form the portion of the video content having text;
identifying text within the identified portion of the video content; and
determining a category for the identified text. |
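The first steps of the claimed identification (grayscale conversion, edge detection, dilation, binarization, connected component analysis, and a size filter) can be sketched in plain Python as follows. This is an illustration, not the patented implementation: the claim does not specify an edge operator, structuring element, or thresholds, so the horizontal-gradient edge measure, the horizontal max-filter dilation, 4-connectivity, and every function name and parameter value here are assumptions. Text-line merging, projection-based refinement, the second binarization, and the shape/position filter are omitted.

```python
from collections import deque


def to_gray(frame):
    """Convert rows of (r, g, b) tuples to luma using BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in frame]


def vertical_edges(gray):
    """Absolute horizontal gradient, which responds to vertical strokes."""
    w = len(gray[0])
    return [[abs(row[min(x + 1, w - 1)] - row[max(x - 1, 0)]) for x in range(w)]
            for row in gray]


def dilate_horizontal(img, radius):
    """Max filter along each row, joining nearby vertical edges."""
    w = len(img[0])
    return [[max(row[max(0, x - radius):x + radius + 1]) for x in range(w)]
            for row in img]


def binarize(img, threshold):
    """Map each value to 1 if it exceeds the threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in img]


def connected_boxes(mask):
    """Bounding boxes (x0, y0, x1, y1), inclusive, of 4-connected components."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def candidate_text_boxes(frame, edge_threshold=60, radius=2, min_width=8, min_height=3):
    """Run the sketched steps in claim order and drop undersized components.
    All default values are illustrative assumptions."""
    edges = vertical_edges(to_gray(frame))
    mask = binarize(dilate_horizontal(edges, radius), edge_threshold)
    return [b for b in connected_boxes(mask)
            if b[2] - b[0] + 1 >= min_width and b[3] - b[1] + 1 >= min_height]
```

Dilating only along rows reflects the claim's stated purpose of connecting vertical edges, so that the strokes of adjacent characters fuse into one wide component while unrelated rows stay separate.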
Address |
Emeryville CA US |