Abstract |
A method of identifying units of translation in a block of source content, so as to segment the block of content into the units of translation, includes selecting one or more delineating characteristics of the source content in addition to lexical characteristics. The method further includes determining instances of the delineating characteristics in the block of source content, and identifying pairs of the instances within the block of source content. The method also includes, for each pair of instances of the delineating characteristics, associating a first instance of the pair with a first boundary of a unit of translation, and associating a second instance of the pair with a second boundary of the unit of translation. One embodiment further includes identifying target units of translation in a block of target content, and assigning associations among the source units of translation and the target units of translation.
|
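The segmentation steps described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that the delineating characteristics are a pair of distinct, non-nested markup delimiters (here `<p>` and `</p>`), and that source and target units are associated by their order of appearance. The function names `segment_units` and `align_units` are hypothetical and do not come from the patent; the claimed method may pair instances and assign associations differently.

```python
import re
from typing import List, Tuple


def segment_units(content: str, open_mark: str, close_mark: str) -> List[str]:
    """Split content into units bounded by paired delimiter instances.

    Assumption: open_mark and close_mark are distinct and never nested.
    """
    # Determine every instance of either delineating characteristic, in order.
    pattern = re.compile(f"({re.escape(open_mark)})|({re.escape(close_mark)})")
    units: List[str] = []
    start = None  # position just after the pending opening instance, if any
    for match in pattern.finditer(content):
        if match.group(1) is not None and start is None:
            # First instance of a pair: marks the first boundary of a unit.
            start = match.end()
        elif match.group(2) is not None and start is not None:
            # Second instance of the pair: marks the second boundary.
            units.append(content[start:match.start()].strip())
            start = None
    return units


def align_units(source_units: List[str], target_units: List[str]) -> List[Tuple[str, str]]:
    """Associate source and target units by position (an illustrative assumption)."""
    return list(zip(source_units, target_units))


source = "<p>Hello world.</p><p>Good night.</p>"
target = "<p>Bonjour le monde.</p><p>Bonne nuit.</p>"

source_units = segment_units(source, "<p>", "</p>")
target_units = segment_units(target, "<p>", "</p>")
pairs = align_units(source_units, target_units)
```

Positional pairing is the simplest possible association rule and only works when both blocks contain the same number of units in the same order; a practical system would likely need a more robust alignment strategy.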