Title of invention: System and method for text cleaning by classifying sentences using numerically represented features
Abstract: A method and system for cleaning an electronic document are provided. The method comprises: identifying at least one sentence in the electronic document; numerically representing features of the sentence to obtain a numeric feature representation associated with the sentence; inputting the numeric feature representation into a machine learning classifier, the machine learning classifier being configured to determine, based on each numeric feature representation, whether the sentence associated with that numeric feature representation is a bad sentence; and removing sentences determined to be bad sentences from the electronic document to create a cleaned document.
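The four steps named in the abstract (identify sentences, represent each one numerically, classify, remove bad sentences) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not taken from the patent: the function names, the three features, the naive sentence splitter, and above all the classifier, which is a linear score with hand-set weights standing in for a trained machine learning classifier.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after . ! ? plus whitespace, or at newlines."""
    parts = re.split(r"(?<=[.!?])\s+|\n+", text.strip())
    return [p for p in parts if p]


def featurize(sentence: str) -> List[float]:
    """Hypothetical numeric features: word count, alphabetic-character ratio,
    and whether the sentence ends with terminal punctuation."""
    words = sentence.split()
    chars = [c for c in sentence if not c.isspace()]
    alpha_ratio = sum(c.isalpha() for c in chars) / len(chars) if chars else 0.0
    ends_ok = 1.0 if sentence.rstrip().endswith((".", "!", "?")) else 0.0
    return [float(len(words)), alpha_ratio, ends_ok]


def is_bad(features: List[float]) -> bool:
    """Stand-in classifier: linear score with hand-set (not learned) weights.
    A real system would use a classifier trained on labelled sentences."""
    weights = [0.1, 2.0, 1.0]
    bias = -2.8
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return score < 0.0


def clean_document(text: str,
                   classifier: Callable[[List[float]], bool] = is_bad) -> str:
    """Drop every sentence the classifier flags as bad and rejoin the rest."""
    kept = [s for s in split_sentences(text) if not classifier(featurize(s))]
    return " ".join(kept)


if __name__ == "__main__":
    sample = (
        "The committee approved the new budget on Tuesday.\n"
        "Home | Login | Share >>\n"
        "The vote was unanimous."
    )
    print(clean_document(sample))
```

In this sketch the navigation residue line scores below zero and is dropped, while the two ordinary sentences are kept. Passing the classifier as a parameter keeps the feature step separate from the decision step, so a trained model could replace `is_bad` without touching the rest.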
Publication number: US8380492(B2)  Publication date: 2013.02.19
Application number: US20100775580  Filing date: 2010.05.07
Applicant: ROGERS COMMUNICATIONS INC.; XU LIQIN; LEE HYUN CHUL  Inventor: XU LIQIN; LEE HYUN CHUL
Classification: G06F17/27; G06F17/20; G06F17/21  Main classification: G06F17/27
Agency:  Agent:
Main claim:
Address: