Data cleansing system and method, Application No. US20070702811


Title of invention	Data cleansing system and method
Abstract	An automated system and method is provided for debugging training data used to train an automated language identifier. The system and method collects texts written in a particular language, generates an occurrence count for words in each text by counting the number of times each of the words is found within the text, and generates an occurrence ratio (OR) of each of the words by dividing the occurrence count by the total number of words in each text. Words are then filtered from the texts in which their occurrence ratios are substantially higher than their occurrence ratios in at least one of the other texts, to generate a clean text.
Publication number	US7729899(B2)	Publication date	2010.06.01
Application number	US20070702811	Filing date	2007.02.06
Applicant	BASIS TECHNOLOGY CORPORATION	Inventor	OTSUKA NOBUO
Classification	G06F17/28;G06F17/27	Main classification	G06F17/28
Agency		Agent
Principal claim
Address
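
The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `occurrence_ratios` and `clean_texts`, the whitespace tokenization, the lowercasing, and the `factor` threshold standing in for "substantially higher" are all assumptions made for illustration. The sketch also assumes that a word is compared only against other texts that actually contain it, which the abstract does not specify.

```python
from collections import Counter


def occurrence_ratios(text):
    """Return {word: occurrence count / total words} for one text."""
    words = text.lower().split()  # assumed tokenization: lowercase, split on whitespace
    if not words:
        return {}
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def clean_texts(texts, factor=5.0):
    """Filter words whose ratio in a text exceeds `factor` times their ratio
    in at least one other text that also contains the word.

    `factor` is a hypothetical threshold; the abstract says only
    "substantially higher".
    """
    ratios = [occurrence_ratios(text) for text in texts]
    cleaned = []
    for i, text in enumerate(texts):
        kept = []
        for word in text.split():
            key = word.lower()
            own = ratios[i][key]
            # Ratios of the same word in the other texts (assumption: skip texts lacking it).
            others = [r[key] for j, r in enumerate(ratios) if j != i and key in r]
            if any(own > factor * other for other in others):
                continue  # over-represented here relative to another text: drop it
            kept.append(word)
        cleaned.append(" ".join(kept))
    return cleaned


sample = [
    "the cat sat on the mat",
    "click click click click the dog click click",
    "the dog ran click home",
]
print(clean_texts(sample, factor=3.0))
```

In the sample, "click" makes up 6 of 8 words in the second text but only 1 of 5 in the third, so it is removed from the second text while the other two texts are left unchanged.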
