Abstract |
<p>The present invention provides a model that identifies the unknown CJK language of a passage of combined Unicode text data from its UTF-16 values. First, the model scans the text data, reading the numeric value of each character. Based on a statistical analysis of each character, the model then estimates the likelihood of each CJK language for the text data. Finally, the model identifies the language by selecting the one with the highest likelihood. Figure 1</p> |
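The scan, score, and select steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the statistical model of the invention: the `RANGES` table, its weights, the language tags `zh`/`ja`/`ko`, and the function names are all hypothetical. The code-point ranges themselves are standard Unicode blocks.

```python
# Hypothetical per-range weights (an assumption, not taken from the source).
# Kana counts only toward Japanese and Hangul only toward Korean, while the
# shared CJK Unified Ideographs contribute a share to each language.
RANGES = [
    (0x3040, 0x309F, {"ja": 1.0}),                        # Hiragana
    (0x30A0, 0x30FF, {"ja": 1.0}),                        # Katakana
    (0x1100, 0x11FF, {"ko": 1.0}),                        # Hangul Jamo
    (0xAC00, 0xD7A3, {"ko": 1.0}),                        # Hangul Syllables
    (0x4E00, 0x9FFF, {"zh": 0.5, "ja": 0.3, "ko": 0.2}),  # CJK Unified Ideographs
]


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def score_languages(text):
    """Accumulate a weight per language and normalise the totals to sum to 1."""
    scores = {"zh": 0.0, "ja": 0.0, "ko": 0.0}
    for unit in utf16_units(text):
        for low, high, weights in RANGES:
            if low <= unit <= high:
                for lang, weight in weights.items():
                    scores[lang] += weight
                break  # each code unit belongs to at most one range
    total = sum(scores.values())
    if total == 0:
        return scores  # no CJK evidence found in the text
    return {lang: value / total for lang, value in scores.items()}


def identify(text):
    """Return the highest-scoring language tag, or None without CJK evidence."""
    scores = score_languages(text)
    if not any(scores.values()):
        return None
    return max(scores, key=scores.get)
```

Calling `identify` on a string returns one of the three hypothetical tags, or `None` when the text contains no code units in the listed ranges. A real model would presumably replace the fixed weights with per-character statistics gathered from sample text in each language.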