Abstract |
A computer-implemented technique can include receiving, at a server including one or more processors, a source word in a source language. The technique can include determining, at the server, one or more potential translations for the source word in a target language different than the source language. The technique can include determining, at the server, one or more synonyms for each of the one or more potential translations to obtain a plurality of potential translations. The technique can include determining, at the server, one or more translation clusters using the plurality of potential translations and a clustering algorithm. Each translation cluster can contain all of the plurality of potential translations that have a similar denotation and each of the plurality of translations that have a similar denotation can be included in a specific translation cluster. The technique can also include outputting, at the server, the one or more translation clusters. |
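The first two determining steps described above (look up potential translations, then widen the set with their synonyms) can be illustrated with a minimal sketch. The dictionaries `TRANSLATIONS` and `SYNONYMS`, their toy French entries, and the function name `expand_translations` are hypothetical stand-ins for whatever translation and synonym datastores a real server would query; they are not taken from the source.

```python
from typing import Dict, List, Set

# Hypothetical toy data standing in for real translation and synonym datastores.
TRANSLATIONS: Dict[str, List[str]] = {"bank": ["banque", "rive"]}
SYNONYMS: Dict[str, List[str]] = {"banque": [], "rive": ["berge", "bord"]}


def expand_translations(source_word: str) -> Set[str]:
    """Return the potential translations of source_word plus their synonyms."""
    expanded: Set[str] = set()
    for translation in TRANSLATIONS.get(source_word, []):
        expanded.add(translation)                        # the translation itself
        expanded.update(SYNONYMS.get(translation, []))   # its target-language synonyms
    return expanded
```

The returned set plays the role of the plurality of potential translations that is later grouped into translation clusters.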
Independent claim |
1. A computer-implemented method, comprising:
receiving, at a server from a computing device via a network, the server including one or more processors, a single source word in a source language, wherein the single source word is input by a user at the computing device;
determining, at the server, one or more potential translations for the single source word in a target language different than the source language;
determining, at the server, one or more synonyms for each of the one or more potential translations to obtain a plurality of potential translations, wherein the synonyms are stored in a datastore, and wherein the datastore can be accessed via a network;
generating, at the server, one or more translation clusters using the plurality of potential translations and a first clustering algorithm and without using a context of the single source word, each translation cluster containing all of the plurality of potential translations that have a similar denotation, and each of the plurality of potential translations that have a similar denotation being included in a specific translation cluster, each translation cluster including at least one distinct potential translation of the plurality of potential translations, the one or more translation clusters collectively including all of the plurality of potential translations; and
outputting, from the server to the computing device via the network, information based on the one or more translation clusters,
wherein the first clustering algorithm is defined as:
$\mathcal{B} \leftarrow \{ C \cap T_S : C \in \bigcup_{t \in T_S} \mathcal{C}_t \}$
$\mathcal{S} \leftarrow \emptyset$
for $B \in \mathcal{B}$ do
    if $\nexists\, B' \in \mathcal{B}$ such that $B \subset B'$ then
        add $B$ to $\mathcal{S}$
return $\mathcal{S}$
where $T_S$ represents the plurality of potential translations, $C$ represents a synonym set including a set of target-language words, $\mathcal{C}_t$ represents a set of synonym sets in which a specific potential translation $t$ appears, $B$ represents a source-specific synonym set, which is a subset of $T_S$, $\mathcal{B}$ represents a set of source-specific synonym sets, and $\mathcal{S}$ represents the one or more translation clusters for $T_S$. |
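The clustering algorithm recited in the claim can be read as: intersect every synonym set that contains a potential translation with the full set of potential translations, then keep only the resulting source-specific synonym sets that are not strictly contained in another one. The sketch below follows that reading. The function name `cluster_translations`, the `synsets_by_word` mapping, and the toy French data are hypothetical, and the sketch assumes every potential translation appears in at least one synonym set (otherwise it would not land in any cluster).

```python
from typing import Dict, FrozenSet, Iterable, Set


def cluster_translations(
    translations: Set[str],
    synsets_by_word: Dict[str, Iterable[FrozenSet[str]]],
) -> Set[FrozenSet[str]]:
    """Return the maximal source-specific synonym sets as translation clusters."""
    # Restrict each synonym set that contains a translation to the translations.
    restricted = {
        frozenset(synset & translations)
        for t in translations
        for synset in synsets_by_word.get(t, ())
    }
    # Keep a set only if no other restricted set strictly contains it.
    return {b for b in restricted if not any(b < other for other in restricted)}


# Hypothetical example: translations of "bank" with three toy synonym sets.
s_finance = frozenset({"banque", "caisse"})
s_shore = frozenset({"rive", "berge", "bord"})
s_shore_small = frozenset({"rive", "berge"})

translations = {"banque", "rive", "berge", "bord"}
synsets_by_word = {
    "banque": [s_finance],
    "rive": [s_shore, s_shore_small],
    "berge": [s_shore, s_shore_small],
    "bord": [s_shore],
}
clusters = cluster_translations(translations, synsets_by_word)
```

In this toy run, the word "caisse" is dropped because it is not a potential translation, and the smaller shore set is discarded because it is a strict subset of the larger one, leaving one cluster per sense.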