Main claim |
1. A transliteration device, comprising:
a generator generating, from a training set, K rewriting tables corresponding to K different implicit languages and K transliteration tables corresponding to said K implicit languages, the training set including multiple transliteration pairs each consisting of an original spelling string spelled in any original language and a target spelling string transliterated from the original spelling string and spelled in a given target language, and at least including original spelling strings of J original languages, wherein J is a natural number greater than or equal to 2 and K is a natural number less than or equal to J, each rewriting table including multiple sets of an original segment constituting said original spelling string, a transliterated segment constituting said target spelling string, and a rewriting probability that the original segment is rewritten as the transliterated segment for transliteration, and each transliteration table including multiple transliteration pairs included in said training set; and
an updater calculating, for each of the multiple transliteration pairs included in said training set, a transliteration probability that the original spelling string of the transliteration pair is transliterated to the target spelling string of the transliteration pair when the original spelling string originates from the implicit language corresponding to the rewriting table, using the rewriting probabilities included in said K rewriting tables, saving the transliteration probability in the transliteration table corresponding to the implicit language in association with the transliteration pair, updating the rewriting probabilities included in said K rewriting tables so as to maximize an expected value, calculated using the transliteration probability, of a likelihood function calculating a likelihood representing how likely said K transliteration tables are when said training set is obtained, and repeating said calculation of the transliteration probabilities and said update of the rewriting probabilities. |
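
The alternation recited in the claim (calculate transliteration probabilities, then update rewriting probabilities, then repeat) has the shape of an expectation-maximization loop over K latent languages. The following Python sketch illustrates that loop under several simplifying assumptions that are not part of the claim: each transliteration pair is supplied already split into aligned (original segment, transliterated segment) tuples, the K implicit languages are weighted equally, the tables are initialised at random, and a small smoothing constant keeps probabilities non-zero. All names (`init_tables`, `normalise`, `pair_probability`, `em`, `toy_pairs`) are hypothetical.

```python
import random
from collections import defaultdict


def normalise(weights):
    """Scale weights so that, for each original segment, they sum to 1."""
    totals = defaultdict(float)
    for (src, _tgt), w in weights.items():
        totals[src] += w
    return {(src, tgt): w / totals[src] for (src, tgt), w in weights.items()}


def init_tables(pairs, K, seed=0):
    """Generator role: build K rewriting tables (random start is an assumption)."""
    rng = random.Random(seed)
    seg_pairs = sorted({sp for pair in pairs for sp in pair})
    return [normalise({sp: rng.random() + 0.1 for sp in seg_pairs})
            for _ in range(K)]


def pair_probability(pair, table):
    """Transliteration probability of one pre-segmented pair under one table."""
    p = 1.0
    for seg_pair in pair:
        p *= table[seg_pair]
    return p


def em(pairs, K, iterations=20, seed=0):
    """Updater role: alternate probability calculation and table update."""
    rewriting = init_tables(pairs, K, seed)
    transliteration = [dict() for _ in range(K)]
    for _ in range(iterations):
        # Calculation step: save each pair's probability in the table
        # of the corresponding implicit language.
        for k in range(K):
            for pair in pairs:
                transliteration[k][pair] = pair_probability(pair, rewriting[k])
        # Update step: re-estimate rewriting probabilities from expected counts
        # (equal language weights are an assumption of this sketch).
        counts = [defaultdict(float) for _ in range(K)]
        for pair in pairs:
            total = sum(transliteration[k][pair] for k in range(K))
            for k in range(K):
                gamma = transliteration[k][pair] / total
                for seg_pair in pair:
                    counts[k][seg_pair] += gamma
        rewriting = [
            normalise({sp: counts[k].get(sp, 0.0) + 1e-9 for sp in rewriting[k]})
            for k in range(K)
        ]
    return rewriting, transliteration


# Toy, made-up data: the original segment "j" is rewritten as "h" in some
# pairs and as "j" in others, as if the strings came from two source languages.
toy_pairs = [
    (("j", "h"), ("o", "o")),
    (("j", "h"), ("a", "a")),
    (("j", "j"), ("o", "o")),
    (("j", "j"), ("a", "a")),
]
rewriting, transliteration = em(toy_pairs, K=2)
```

In this sketch `rewriting[k]` maps an (original segment, transliterated segment) tuple to its rewriting probability under implicit language k, and `transliteration[k]` maps each training pair to its transliteration probability under that language. A fuller implementation would also sum over alternative segmentations of each string, which the sketch sidesteps by assuming the segmentation is given.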