Abstract |
Disclosed is a machine translation method for a PDF file. A machine translation device extracts source language text and non-text elements from an input source language PDF file through image transformation, and corrects the extracted source language text by using source language text extracted from the file's text information. The device restores any portion of the extracted source language text that the non-text elements have contextually separated, and generates a source language XML/HTML file by rearranging the extracted text and non-text elements so as to follow the contextual flow of the source language PDF file. It then separates the source language text from the tags of the source language XML/HTML file, generates target language text by using translation knowledge and a transformation engine specified for the technical field corresponding to the source language PDF file, inserts the translated target language text in place of the source language text to generate a target language XML/HTML file, and transforms the generated target language XML/HTML file into a target language PDF file for output. |
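
The step of separating source language text from the tags, translating it, and inserting the target language text back into the markup can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `translate_markup`, `translate_text`, and `TOY_GLOSSARY` are hypothetical, the regular-expression tag splitter is a simplification that assumes well-formed tags with no `>` inside attribute values, and the word lookup table merely stands in for the field-specific translation knowledge and engine.

```python
import re

# Hypothetical stand-in for field-specific translation knowledge (assumption).
TOY_GLOSSARY = {"Hello": "Bonjour", "world": "monde"}

# Matches a single XML/HTML tag; the capture group makes re.split keep the tags.
TAG_PATTERN = re.compile(r"(<[^>]+>)")


def translate_text(text: str) -> str:
    """Toy word-by-word lookup; a real system would call a translation engine."""
    return re.sub(
        r"\w+",
        lambda match: TOY_GLOSSARY.get(match.group(0), match.group(0)),
        text,
    )


def translate_markup(markup: str) -> str:
    """Translate only the text between tags and leave every tag untouched."""
    translated_parts = []
    for part in TAG_PATTERN.split(markup):
        if TAG_PATTERN.fullmatch(part) or not part.strip():
            # Tags and whitespace-only segments are copied through unchanged.
            translated_parts.append(part)
        else:
            translated_parts.append(translate_text(part))
    return "".join(translated_parts)


if __name__ == "__main__":
    print(translate_markup("<p>Hello <b>world</b></p>"))
```

Because the tags pass through unchanged, the layout carried by the markup survives translation, which is what allows the target language XML/HTML file to be converted back into a PDF with the same structure.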