Title of invention: MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE LEARNING DEVICE, MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE DETERMINATION DEVICE, MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE LEARNING METHOD, MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE DETERMINATION METHOD, AND STORAGE MEDIUM
Abstract: This invention provides a technology for searching for similar documents in a multilingual document group at lower cost and with higher precision, even if three or more languages are present. This multilingual document-similarity-degree learning device (1) comprises the following: a multilingual matrix storage unit (11) that holds a matrix for each target language; a word-vector acquisition unit (12) that acquires a word vector corresponding to a document; a meaning-vector creation unit (13) that creates a meaning vector for said document on the basis of the word vector for said document and the matrix corresponding to the language in which said document is written; a similarity-degree calculation unit (14) that calculates similarity degrees on the basis of meaning vectors for documents in a document group; and a multilingual matrix learning unit (15) that implements learning by adjusting values in the matrices corresponding to the respective target languages such that, within a set of documents each written in one of the target languages, the similarity degrees for groups of documents that exhibit source-translation relationships are higher than the similarity degrees for groups of documents that do not exhibit source-translation relationships.
Publication number: WO2015145981(A1) Publication date: 2015.10.01
Application number: WO2015JP01028 Filing date: 2015.02.27
Applicant: NEC CORPORATION Inventor: SADAMASA, KUNIHIKO
Classification: G06F17/30;G06F17/27 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address:
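
The learning scheme described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the method claimed in the patent. It assumes a dot-product similarity, a hinge (margin) loss, and plain gradient updates, none of which the abstract specifies. All function names, the toy documents, the starting matrix values, and the hyperparameters are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]   # one matrix per language: meaning_dim x vocabulary_size
Doc = Tuple[str, Vector]     # (language code, word vector of the document)


def meaning_vector(matrix: Matrix, word_vector: Sequence[float]) -> Vector:
    """Meaning vector of a document: its language's matrix times its word vector."""
    return [sum(m * w for m, w in zip(row, word_vector)) for row in matrix]


def similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot-product similarity between two meaning vectors (an assumption)."""
    return sum(a * b for a, b in zip(u, v))


def doc_similarity(matrices: Dict[str, Matrix], doc_a: Doc, doc_b: Doc) -> float:
    """Similarity degree of two documents, possibly written in different languages."""
    return similarity(meaning_vector(matrices[doc_a[0]], doc_a[1]),
                      meaning_vector(matrices[doc_b[0]], doc_b[1]))


def add_outer(matrix: Matrix, col: Sequence[float], row: Sequence[float],
              scale: float) -> None:
    """In place: matrix += scale * outer(col, row)."""
    for i, c in enumerate(col):
        for j, r in enumerate(row):
            matrix[i][j] += scale * c * r


def train_step(matrices: Dict[str, Matrix], anchor: Doc, translation: Doc,
               other: Doc, margin: float = 1.0, lr: float = 0.1) -> float:
    """One hinge-loss update for (document, its translation, an unrelated document).

    Returns the loss measured before the update.
    """
    a = meaning_vector(matrices[anchor[0]], anchor[1])
    p = meaning_vector(matrices[translation[0]], translation[1])
    n = meaning_vector(matrices[other[0]], other[1])
    loss = max(0.0, margin - similarity(a, p) + similarity(a, n))
    if loss > 0.0:
        diff = [pi - ni for pi, ni in zip(p, n)]
        add_outer(matrices[anchor[0]], diff, anchor[1], lr)         # raise a.(p - n)
        add_outer(matrices[translation[0]], a, translation[1], lr)  # pull translation closer
        add_outer(matrices[other[0]], a, other[1], -lr)             # push unrelated doc away
    return loss


def train(matrices: Dict[str, Matrix], triplets: Sequence[Tuple[Doc, Doc, Doc]],
          margin: float = 1.0, lr: float = 0.1, epochs: int = 50) -> float:
    """Repeat updates over all triplets; returns the total loss of the last epoch run."""
    total = 0.0
    for _ in range(epochs):
        total = sum(train_step(matrices, a, p, n, margin, lr) for a, p, n in triplets)
        if total == 0.0:
            break
    return total


# Toy data (hypothetical): three languages, two documents each, two-word
# vocabularies, and a one-dimensional meaning space.
matrices = {lang: [[0.5, -0.5]] for lang in ("en", "ja")}
matrices["de"] = [[0.25, -0.25]]
en1, en2 = ("en", [1.0, 0.0]), ("en", [0.0, 1.0])
ja1, ja2 = ("ja", [1.0, 0.0]), ("ja", [0.0, 1.0])
de1, de2 = ("de", [1.0, 0.0]), ("de", [0.0, 1.0])
triplets = [(en1, ja1, ja2), (en2, ja2, ja1), (ja1, de1, de2), (ja2, de2, de1)]
train(matrices, triplets, margin=1.0, lr=0.5)
print(doc_similarity(matrices, en1, de1), doc_similarity(matrices, en1, de2))
```

In this sketch every language has its own matrix but all matrices map into one shared meaning space, so no pairwise translation model is needed for each language pair. In the toy data, English and German documents are never paired directly during training; they are linked only through Japanese.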