OBJECT EXTRACTION IN COLOUR COMPOUND DOCUMENTS, Application No. US20090637446 - 传众 Patent Search

Title of invention	OBJECT EXTRACTION IN COLOUR COMPOUND DOCUMENTS
Abstract	Disclosed is a computer implemented method of text extraction in colour compound documents. The method connects similarly coloured pixels of an image of a colour compound document into connected components (CCs); classifies each CC as either text or non-text; refines the text CC classification for each text CC using global colour context statistics; groups text CCs into text blocks; recovers misclassified non-text CCs into a nearby text block; and removes extraneous CCs from each text block using local colour context statistics to thereby provide the extracted text in the text blocks. Also disclosed is a computer implemented method of locating graphics objects in a colour compound document image. The method connects similarly coloured pixels of said image into connected components (CCs) and places the CCs in an enclosure tree; classifies (330, 730) each CC into one of a plurality of classes wherein at least one class (862) represents salient graphics components; identifies (1140) a graphics container (441) to perform semantic analysis for each CC of said class representing salient graphics components; profiles (1170) descendants of said graphics container in said tree to obtain semantic context statistics; and decides (1710) whether the graphics container contains a whole or part of a graphics object based on said semantic context statistics.
Publication No.	US2010157340(A1)	Publication date	2010.06.24
Application No.	US20090637446	Filing date	2009.12.14
Applicant	CANON KABUSHIKI KAISHA	Inventors	CHEN YU-LING;LIU PING;MCDONELL TREVOR LEE
Classification	G06F15/00;G06K9/00	Main classification	G06F15/00
Agency		Agent
Main claim
Address
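
The first step named in the abstract, connecting similarly coloured pixels into connected components (CCs), can be illustrated with a minimal sketch. This is not the patented method: the function names, the 4-connectivity, the per-channel distance measure and the `threshold` value are all assumptions made for illustration, since the record gives no such details.

```python
from collections import deque


def colour_distance(a, b):
    """Largest per-channel difference between two RGB tuples (assumed metric)."""
    return max(abs(x - y) for x, y in zip(a, b))


def connected_components(image, threshold=16):
    """Label 4-connected regions of similarly coloured pixels.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Two neighbouring pixels join the same CC when their colour distance
    is at most `threshold` (a hypothetical value, not from the source).
    Returns (labels, count), where labels[y][x] is the CC index.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    labels = [[-1] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if labels[y][x] != -1:
                continue  # pixel already belongs to a CC
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill from the seed pixel
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and labels[ny][nx] == -1
                            and colour_distance(image[cy][cx], image[ny][nx]) <= threshold):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return labels, count


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
page = [[WHITE, BLACK, WHITE],
        [WHITE, BLACK, WHITE]]
labels, count = connected_components(page)
```

Because each pixel is compared with its immediate neighbour rather than with the seed colour, a smooth gradient can end up as a single CC. A real system would probably quantise colours first or compare against a running region mean, but the record does not say which approach the application takes.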