Title of invention: METHOD AND SYSTEM FOR LARGE SCALE DATA CURATION
Abstract: An end-to-end data curation system, together with the methods it uses to link, match, and clean large-scale data sources. The system aims to provide scalable and efficient record deduplication. It is trained by a crowd of experts, and the system operator can optionally provide a set of hints to reduce the number of questions sent to those experts. The system addresses schema mapping and record deduplication holistically by treating both as a single linkage problem.
Publication number: EP3123362(A1) Publication date: 2017.02.01
Application number: EP20150768105 Filing date: 2015.03.20
Applicant: Tamr, Inc. Inventors: BATES-HAUS, Nikolaus;BESKALES, George;BRUCKNER, Daniel Meir;ILYAS, Ihad;PAGAN, Alexander Richter;STONEBRAKER, Michael Ralph
Classification: G06F17/30 Main classification: G06F17/30
Patent agency: Agent:
Principal claim:
Address: