Abstract |
The described implementations relate to automated data cleanup. One system includes a language model generated from language model seed text and a dictionary of possible data substitutions. This system also includes a transducer configured to cleanse a corpus utilizing the language model and the dictionary.
|
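The arrangement described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the described implementation: the function names (`build_language_model`, `score`, `cleanse`), the interpolated bigram model, the 0.7/0.3 weights, and the sample seed text and dictionary are all hypothetical. A simple greedy left-to-right substitution pass stands in for the transducer.

```python
from collections import Counter

def build_language_model(seed_text):
    """Count unigrams and bigrams in the seed text (hypothetical toy model)."""
    tokens = seed_text.lower().split()
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def score(prev, word, unigrams, bigrams):
    """Interpolate bigram and unigram relative frequencies (weights are assumed)."""
    total = sum(unigrams.values())
    p_uni = unigrams[word] / total if total else 0.0
    p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    return 0.7 * p_bi + 0.3 * p_uni

def cleanse(corpus, substitutions, unigrams, bigrams):
    """Greedy stand-in for the transducer: keep or replace each token in turn."""
    prev = "<s>"  # start marker; never appears in the seed text
    cleaned = []
    for token in corpus.lower().split():
        # The original token is listed first, so it wins any tie.
        candidates = [token] + substitutions.get(token, [])
        best = max(candidates, key=lambda w: score(prev, w, unigrams, bigrams))
        cleaned.append(best)
        prev = best
    return " ".join(cleaned)

# Assumed sample data, for illustration only.
seed = "see you at the store . i will see you at noon . thank you for the help"
substitutions = {"u": ["you"], "c": ["see"], "4": ["for", "four"], "thx": ["thank"]}
unigrams, bigrams = build_language_model(seed)
print(cleanse("c u at the store", substitutions, unigrams, bigrams))
```

In this sketch a token with no dictionary entry, or whose candidates all score zero under the model, is left unchanged, so the pass only rewrites text where the seed-derived model prefers a substitution.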