Independent claim |
1. A method for content item classification, comprising:
receiving an image for classification;
generating, using at least one processor, a compact representation for the image by downsampling the received image, the compact representation having a reduced set of pixel values indicative of pixel values within the received image;
identifying, using the at least one processor, a plurality of angle measurements for possible page edges of at least one potential document within the received image, wherein identifying the plurality of angle measurements for possible page edges comprises:
calculating a plurality of gradient values from the reduced set of pixel values;
identifying, based on the plurality of gradient values, one or more edge candidates of the at least one potential document; and
calculating the plurality of angle measurements based on a vector extending from a selected origin to a point on each of the one or more edge candidates;
determining, based on the identified plurality of angle measurements, that the image contains a document; and
in response to determining that the image contains a document, classifying the image as a document containing image based on the identified plurality of angle measurements for possible page edges. |
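
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading of the claim language, not the claimed implementation. The following are assumptions introduced only for illustration: the function names (`downsample`, `edge_angles`, `looks_like_document`, `classify`), block averaging as the downsampling method, central differences for the gradient values, the image centre as the selected origin, the foot of the perpendicular from that origin to the local edge line as the "point on" each edge candidate, and the decision rule (most edge candidates fall into two angle bins 90 degrees apart) with all of its thresholds.

```python
import math

def downsample(img, factor):
    """Compact representation: block-average a 2D grayscale list of lists."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)] for y in range(h)]

def edge_angles(small, grad_threshold=50.0):
    """Return one angle in [0, 180) degrees per edge-candidate pixel."""
    h, w = len(small), len(small[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0  # selected origin: image centre (assumption)
    angles = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Gradient values from the reduced pixel set (central differences).
            gx = (small[y][x + 1] - small[y][x - 1]) / 2.0
            gy = (small[y + 1][x] - small[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag < grad_threshold:
                continue  # not an edge candidate (threshold is an assumption)
            nx, ny = gx / mag, gy / mag  # unit normal of the local edge line
            # Signed distance from the origin to the edge line through (x, y).
            rho = (x - cx) * nx + (y - cy) * ny
            # Vector from the origin to the closest point on that edge line.
            fx, fy = (rho * nx, rho * ny) if abs(rho) > 1e-9 else (nx, ny)
            angles.append(math.degrees(math.atan2(fy, fx)) % 180.0)
    return angles

def looks_like_document(angles, bin_width=10, min_edges=20, min_share=0.8):
    """Hypothetical rule: edges concentrate in two perpendicular directions."""
    if len(angles) < min_edges:
        return False
    n_bins = 180 // bin_width
    hist = [0] * n_bins
    for a in angles:
        hist[int(a // bin_width) % n_bins] += 1
    first = max(range(n_bins), key=hist.__getitem__)
    second = (first + n_bins // 2) % n_bins  # the bin 90 degrees away
    share = sum(hist[(b + d) % n_bins]
                for b in (first, second) for d in (-1, 0, 1)) / len(angles)
    return hist[second] > 0 and share >= min_share

def classify(img, factor=2):
    """Classify an image as 'document' or 'other' from its edge angles."""
    angles = edge_angles(downsample(img, factor))
    return "document" if looks_like_document(angles) else "other"
```

In this sketch a bright rectangle on a dark background yields edge angles clustered near 0 and 90 degrees and is classified as a document, whereas a single straight edge or a blank image is not. A production system would likely need a more tolerant rule for perspective-distorted pages, which the claim text does not specify.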