Title: Efficient development of a rule-based system using crowd-sourcing
Abstract: Described herein are methods, systems, apparatuses and products for efficient development of a rule-based system. An aspect provides a method including accessing data records; converting said data records to an intermediate form; utilizing intermediate forms to compute similarity scores for said data records; and selecting as an example to be provided for rule making at least one record of said data records having a maximum dissimilarity score indicative of dissimilarity to already considered examples.
Publication number: US8949204(B2) Publication date: 2015.02.03
Application number: US201213597589 Filing date: 2012.08.29
Applicant: International Business Machines Corporation Inventors: Chaturvedi Snigdha; Faruquie Tanveer Afzal; Subramaniam L. Venkata
Classification: G06F17/00; G06F17/30 Main classification: G06F17/00
Agency: Ference & Associates LLC Agent: Ference & Associates LLC
Main claim: 1. A method of data cleansing, said method comprising: utilizing at least one processor to execute computer code configured to perform the steps of: accessing data records; converting said data records to an intermediate form; utilizing intermediate forms of said data records to compute similarity scores of individual ones of said data records with respect to one another; from among said data records, providing at least one example record for rule making; and thereafter selecting from among said data records at least one additional example record for rule making; the additional example record comprising at least one record presenting at least one similarity score which indicates a least similarity with respect to the at least one example record already provided; the at least one example record and the at least one additional example record comprising a rule set; and employing a difficulty method to select from among said data records at least one training instance for updating the rule set; the selected at least one training instance comprising at least one example record presenting at least one similarity score which indicates a least similarity with respect to at least one example record in the rule set.
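Illustrative sketch (not part of the patent text): the selection step in the claim, choosing each additional example as the record least similar to those already provided, can be approximated with a greedy farthest-first loop. Everything concrete here is an assumption made for illustration: the `to_intermediate` pattern form (digit runs become `9`, letter runs become `A`), the use of `difflib.SequenceMatcher` as the similarity score, and seeding with the first record. The record does not specify any of these, and the claimed "difficulty method" for training instances is not modeled.

```python
import re
from difflib import SequenceMatcher


def to_intermediate(record: str) -> str:
    """Hypothetical intermediate form: collapse digit runs to '9', letter runs to 'A'."""
    form = re.sub(r"\d+", "9", record)
    return re.sub(r"[A-Za-z]+", "A", form)


def similarity(a: str, b: str) -> float:
    """Assumed similarity score in [0, 1]; 1.0 means identical intermediate forms."""
    return SequenceMatcher(None, a, b).ratio()


def select_examples(records, k):
    """Greedily pick up to k records, each least similar to those already picked."""
    if not records or k <= 0:
        return []
    forms = [to_intermediate(r) for r in records]
    selected = [0]  # assumption: seed with the first record
    while len(selected) < min(k, len(records)):
        best_idx, best_score = None, None
        for i in range(len(records)):
            if i in selected:
                continue
            # How close is this candidate to its nearest already-selected example?
            closest = max(similarity(forms[i], forms[j]) for j in selected)
            # Keep the candidate whose nearest selected example is farthest away.
            if best_score is None or closest < best_score:
                best_idx, best_score = i, closest
        selected.append(best_idx)
    return [records[i] for i in selected]


if __name__ == "__main__":
    addresses = ["12 Main St", "47 Oak St", "PO Box 221", "9 Elm St"]
    print(select_examples(addresses, 2))
```

In this toy input, three of the four addresses share the intermediate form `9 A A`, so the second example picked is the one with a different pattern. The point of the intermediate form is that records differing only in surface values do not consume additional rule-making effort.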
Address: Armonk NY US