Title of invention: Content-based document image classification
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for classifying one or more document images based on their content by, for each document image: determining a block layout of the document image; recognizing the document image to obtain digital content data representing text content or potential graphical content of the image; calculating feature values of the document image for features based on the digital content data and the block layout; and classifying the document image as belonging to one of a set of document classes based on the calculated feature values.
Publication number: US9626555(B2)  Publication date: 2017.04.18
Application number: US201414571766  Filing date: 2014.12.16
Applicant: ABBYY DEVELOPMENT LLC  Inventors: Smirnov Anatoly; Panferov Vasily; Isaev Andrey
Classification: G06K9/00  Main classification: G06K9/00
Agency: Lowenstein Sandler LLP  Agent: Lowenstein Sandler LLP
Main claim: 1. A method for classifying a document image based on its content using a processor device, comprising: accessing a set of features stored in memory; analyzing the document image to determine blocks layout; recognizing the document image to obtain digital content data representing text content or potential graphical content; calculating, based on one or more features from the set of features accessed in the memory, feature values of the document image for the one or more features from the set of features, wherein the feature values are based on the digital content data and the blocks layout; and classifying the document image as belonging to a document class from a set of document classes based on the calculated feature values.
Address: Moscow RU
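
Illustrative sketch (not part of the patent record): the steps of claim 1 can be pictured as a small pipeline that takes an already-analyzed block layout with recognized text, computes feature values from a stored feature set, and assigns a document class. Everything below is an assumption made for illustration: the `Block` type, the four features, the class names, the centroid values, and the nearest-centroid rule are hypothetical and are not taken from the patent, which does not specify a particular feature set or classifier in this record. Layout analysis and recognition themselves are not implemented here; their output is supplied by hand.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Block:
    """One layout block; 'kind' labels are hypothetical ("text", "picture", "table")."""
    kind: str
    x: int
    y: int
    width: int
    height: int
    text: str = ""  # recognized text; empty for blocks without text

    @property
    def area(self) -> int:
        return self.width * self.height


def area_ratio(blocks, kind):
    """Share of the total block area covered by blocks of the given kind."""
    total = sum(b.area for b in blocks)
    if total == 0:
        return 0.0
    return sum(b.area for b in blocks if b.kind == kind) / total


def word_count(blocks):
    """Number of whitespace-separated words across all recognized text."""
    return sum(len(b.text.split()) for b in blocks)


def has_keyword(blocks, keyword):
    """True if the keyword occurs in any block's recognized text (case-insensitive)."""
    return any(keyword in b.text.lower() for b in blocks)


# Hypothetical "set of features stored in memory": name -> function of the blocks.
# Each feature uses the layout (areas, kinds), the recognized content, or both.
FEATURES = {
    "text_area_ratio": lambda blocks: area_ratio(blocks, "text"),
    "picture_area_ratio": lambda blocks: area_ratio(blocks, "picture"),
    "word_count_scaled": lambda blocks: min(word_count(blocks) / 100.0, 1.0),
    "has_invoice_keyword": lambda blocks: float(has_keyword(blocks, "invoice")),
}


def feature_values(blocks, feature_names):
    """Calculate one value per requested feature."""
    return [FEATURES[name](blocks) for name in feature_names]


def classify(blocks, centroids, feature_names):
    """Assumed classifier: pick the class whose centroid is nearest (Euclidean)."""
    vector = feature_values(blocks, feature_names)
    return min(centroids, key=lambda name: dist(vector, centroids[name]))


FEATURE_NAMES = list(FEATURES)

# Made-up class centroids, in the same order as FEATURE_NAMES.
CENTROIDS = {
    "invoice": [0.9, 0.0, 0.3, 1.0],
    "photo_page": [0.1, 0.9, 0.05, 0.0],
}

# Hand-written stand-in for the output of layout analysis and recognition.
invoice_page = [
    Block("text", 0, 0, 100, 20, "Invoice No. 42"),
    Block("table", 0, 30, 100, 60, "Item Qty Price"),
]
photo_page = [Block("picture", 0, 0, 100, 100)]

invoice_label = classify(invoice_page, CENTROIDS, FEATURE_NAMES)
photo_label = classify(photo_page, CENTROIDS, FEATURE_NAMES)
```

In this sketch the feature set is a plain dictionary so that the classification step can select "one or more features" by name; a real system would more likely learn both the features' weights and the class boundaries from labeled training documents.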