Title of invention: Decision tree refinement
Abstract: A model refinement system refines initial split rules, which define an initial decision tree, to generate final split rules. The system first generates initial trimmed rules by removing clauses that are satisfied by match scores less than a threshold match score. Using the initial trimmed rules, the system classifies an initial training set and filters it to remove negative training pairs that are classified as duplicate pairs, resulting in a filtered training set. An intermediate decision tree, defined by intermediate split rules, is generated based on the filtered training set. Final split rules are generated based on the intermediate split rules, and input pairs of data records are classified as duplicate pairs based on attribute values of the input pairs and the final split rules.
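The trimming and filtering steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Clause` type, the representation of a rule as a conjunction of clauses (with a pair treated as a duplicate if any rule matches), the helper names `trim_rules`, `is_duplicate`, and `filter_training_set`, and all scores and thresholds. Training the intermediate decision tree and deriving the final split rules are not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Clause:
    """Hypothetical clause: compares one attribute's match score to a threshold."""
    attribute: str
    op: str          # ">=" (satisfied by high scores) or "<" (satisfied by low scores)
    threshold: float

    def satisfied_by(self, scores: Dict[str, float]) -> bool:
        score = scores[self.attribute]
        return score >= self.threshold if self.op == ">=" else score < self.threshold


# Assumed representation: a rule is a conjunction of clauses.
Rule = List[Clause]


def trim_rules(rules: List[Rule]) -> List[Rule]:
    """Drop clauses satisfied by scores below a threshold (the "<" clauses)."""
    trimmed = []
    for rule in rules:
        kept = [c for c in rule if c.op == ">="]
        if kept:  # skip rules left with no clauses
            trimmed.append(kept)
    return trimmed


def is_duplicate(scores: Dict[str, float], rules: List[Rule]) -> bool:
    """A pair is classified as a duplicate if every clause of some rule holds."""
    return any(all(c.satisfied_by(scores) for c in rule) for rule in rules)


def filter_training_set(
    training_set: List[Tuple[Dict[str, float], bool]],
    trimmed_rules: List[Rule],
) -> List[Tuple[Dict[str, float], bool]]:
    """Remove negative pairs that the trimmed rules classify as duplicates."""
    return [
        (scores, label)
        for scores, label in training_set
        if label or not is_duplicate(scores, trimmed_rules)
    ]


# Illustrative data only; attribute names and scores are made up.
initial_rules = [
    [Clause("name", ">=", 0.8), Clause("phone", "<", 0.3)],
    [Clause("address", ">=", 0.9)],
]
training_set = [
    ({"name": 0.90, "phone": 0.1, "address": 0.2}, True),   # positive pair
    ({"name": 0.85, "phone": 0.7, "address": 0.1}, False),  # suspicious negative
    ({"name": 0.20, "phone": 0.1, "address": 0.3}, False),  # clear negative
]

trimmed_rules = trim_rules(initial_rules)
filtered_set = filter_training_set(training_set, trimmed_rules)
```

In this sketch the second training pair is labeled negative but matches the trimmed rule on `name`, so it is dropped before any intermediate tree would be trained on `filtered_set`.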
Publication number: US8417654(B1) Publication date: 2013.04.09
Application number: US201213551779 Filing date: 2012.07.18
Applicant: CAO ZHEN; VERMA NAVAL; GOOGLE INC. Inventor: CAO ZHEN; VERMA NAVAL
Classification: G06F15/18; G06E1/00 Main classification: G06F15/18
Agency: Agent:
Principal claim:
Address: