Title of invention |
Privacy-preserving text to image matching |
Abstract |
A method for text-to-image matching includes generating representations of text images, such as license plate images, by embedding each text image into a first vectorial space with a first embedding function. With a second embedding function, a character string, such as a license plate number to be matched, is embedded into a second vectorial space to generate a character string representation. A compatibility is computed between the character string representation and one or more of the text image representations to identify a matching one. The compatibility is computed with a function that uses a transformation which is learned on a training set of labeled images. The learning uses a loss function that aggregates a text-to-image loss and an image-to-text loss over the training set. The image-to-text loss penalizes the transformation when it incorrectly ranks a pair of character string representations, given an image representation corresponding to one of them. |
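The two-part loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form, so the bilinear compatibility `x^T W y`, the hinge penalty, the `margin` value, and every function name below are assumptions made for illustration only, not the patented formulation.

```python
import numpy as np

def compatibility(W, x, y):
    """Assumed bilinear compatibility between an image embedding x
    and a character string embedding y: x^T W y."""
    return float(x @ W @ y)

def image_to_text_loss(W, x, y_pos, y_neg, margin=1.0):
    """Given an image embedding x, penalize W when the string that matches
    the image (y_pos) does not outscore a non-matching string (y_neg)
    by at least `margin` (assumed hinge form)."""
    return max(0.0, margin - compatibility(W, x, y_pos) + compatibility(W, x, y_neg))

def text_to_image_loss(W, y, x_pos, x_neg, margin=1.0):
    """Given a string embedding y, penalize W when the matching image (x_pos)
    does not outscore a non-matching image (x_neg) by at least `margin`."""
    return max(0.0, margin - compatibility(W, x_pos, y) + compatibility(W, x_neg, y))

def aggregate_loss(W, t2i_triplets, i2t_triplets, margin=1.0):
    """Sum both losses over a training set.
    t2i_triplets: iterable of (y, x_pos, x_neg)
    i2t_triplets: iterable of (x, y_pos, y_neg)"""
    total = 0.0
    for y, x_pos, x_neg in t2i_triplets:
        total += text_to_image_loss(W, y, x_pos, x_neg, margin)
    for x, y_pos, y_neg in i2t_triplets:
        total += image_to_text_loss(W, x, y_pos, y_neg, margin)
    return total

# Toy 2-dimensional embeddings (illustrative values only).
W = np.eye(2)
x = np.array([1.0, 0.0])
y_match = np.array([1.0, 0.0])
y_other = np.array([0.0, 1.0])

loss_correct_ranking = image_to_text_loss(W, x, y_match, y_other)
loss_wrong_ranking = image_to_text_loss(W, x, y_other, y_match)
total = aggregate_loss(W, [(y_match, x, np.array([0.0, 1.0]))], [(x, y_match, y_other)])
```

In this toy setup the correctly ranked pair incurs no penalty, while swapping the matching and non-matching strings produces a positive loss, which is the behavior the abstract attributes to the image-to-text term.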
Publication number |
US9367763(B1) |
Publication date |
2016.06.14 |
Application number |
US201514594321 |
Filing date |
2015.01.12 |
Applicant |
XEROX CORPORATION |
Inventors |
Gordo Soldevila Albert;Perronnin Florent C. |
Classification codes |
G06K9/00;G06K9/62;G06K9/18;G06K9/52;G06K9/32;G06F17/30;G06F21/60;G06F17/11 |
Main classification code |
G06K9/00 |
Agency |
Fay Sharpe LLP |
Agent |
Fay Sharpe LLP |
Principal claim |
1. A method for text-to-image matching comprising:
storing a set of text image representations, each text image representation having been generated by embedding a respective text image into a first vectorial space with a first embedding function;
with a second embedding function, embedding a character string into a second vectorial space to generate a character string representation;
for each of at least some of the text image representations, computing a compatibility between the character string representation and the text image representation, comprising computing a function of the text image representation, character string representation, and a transformation, the transformation having been derived by minimizing a loss function on a set of labeled training images, the loss function including a text-to-image loss and an image-to-text loss; and
identifying a matching text image based on the computed compatibilities, wherein at least one of the embedding and the computing of the compatibility is performed with a processor. |
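The matching steps of the claim can be sketched as follows. This is an illustration under stated assumptions: the compatibility is taken to be the bilinear form `x^T W y`, the match is taken to be the highest-scoring stored representation, and the function name, array shapes, and toy values are hypothetical. The two embedding functions and the learning of `W` are not shown; their outputs are passed in as plain arrays.

```python
import numpy as np

def match_string_to_images(image_reps, string_rep, W):
    """Score every stored text image representation against one
    character string representation and return the best match.

    image_reps: (n, D) array, one stored image representation per row
    string_rep: (E,) array, the embedded character string
    W:          (D, E) array, the learned transformation (assumed bilinear use)
    """
    # Compatibility of each stored representation with the string: x_i^T W y
    scores = image_reps @ W @ string_rep
    best_index = int(np.argmax(scores))
    return best_index, scores

# Toy stored representations (illustrative values only).
stored = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
])
W = np.eye(2)                  # stand-in for a learned transformation
query = np.array([0.0, 1.0])   # stand-in for an embedded plate number

best_index, scores = match_string_to_images(stored, query, W)
```

Note that the function operates only on the stored representations and the query embedding; the original text images are not needed at matching time.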
Address |
Norwalk CT US |