Title: Data mining technique with diversity promotion
Abstract: Roughly described, a computer-implemented evolutionary data mining system includes a memory storing a candidate gene database in which each candidate individual has a respective fitness estimate; a gene pool processor which tests individuals from the candidate gene pool on training data and updates the fitness estimate associated with the individuals in dependence upon the tests; and a gene harvesting module for deploying selected individuals from the gene pool, wherein the gene pool processor includes a competition module which selects individuals for discarding in dependence upon both their testing experience level and a diversity measure of individuals in the gene pool.
Publication No.: US8977581(B1) Publication date: 2015.03.10
Application No.: US201213540507 Filing date: 2012.07.02
Applicant: Sentient Technologies (Barbados) Limited Inventors: Hodjat Babak; Shahrzad Hormoz
Classification: G06N3/12 Main classification: G06N3/12
Agency: Haynes Beffel & Wolfeld LLP Agent: Haynes Beffel & Wolfeld LLP; Wolfeld Warren S.
Principal claim: 1. A computer-implemented data mining system, for use with a data mining training database containing training data, comprising: a memory storing a candidate gene database having a pool of candidate individuals, each candidate individual identifying a plurality of conditions and at least one corresponding proposed output in dependence upon the conditions, each candidate individual further having associated therewith a respective testing experience level and an indication of a respective fitness estimate, wherein the memory further identifies layer parameters for each of a plurality of gene pool experience layers L1-LT in an elitist pool, T>1, the layer parameters for each i'th one of the layers L1-LT-1 identifying a range of testing experience [ExpMin(Li) . . . ExpMax(Li)], and wherein each ExpMin(Li)>ExpMax(Li−1) for i>1; a gene pool processor which: tests individuals from the candidate gene pool on the training data, each individual being tested undergoing a respective battery of at least one trial, each trial applying the conditions of the respective individual to the training data to propose an output, and updates the fitness estimate associated with each of the individuals being tested in dependence upon both the training data and the outputs proposed by the respective individual in the battery of trials; and a gene harvesting module providing for deployment selected ones of the individuals from the gene pool, wherein the gene pool processor includes a competition module which selects individuals for discarding from the gene pool in dependence upon both their testing experience level and a diversity measure of individuals in the gene pool, and wherein the diversity measure of individuals in the gene pool comprises a first value being a diversity measure of only those individuals having an experience level within a first one of the experience layers and a second value being a diversity measure of only those individuals having an experience level within a second one of the experience layers.
Address: Belleville BB
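
The claim above describes experience layers with non-overlapping ranges, a fitness update after each battery of trials, and a competition step that uses one diversity value per layer. The following is a minimal Python sketch of those pieces, not the patented implementation. Everything concrete in it is an assumption made for illustration: the names (Individual, Layer, update_fitness, layer_diversity, select_for_discard), the running-average fitness update, the use of the population standard deviation of fitness as the diversity measure, the per-layer quota, and the rule that a low-diversity layer discards its most typical member instead of its least fit one. The record itself does not specify any of these.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass
class Individual:
    conditions: tuple        # hypothetical encoding of the individual's conditions
    output: str              # proposed output when the conditions hold
    experience: int = 0      # testing experience level (number of trials so far)
    fitness: float = 0.0     # current fitness estimate


@dataclass(frozen=True)
class Layer:
    exp_min: int             # ExpMin(Li)
    exp_max: int             # ExpMax(Li)


def validate_layers(layers):
    """Check the claim's constraint ExpMin(Li) > ExpMax(Li-1) for i > 1."""
    return all(layers[i].exp_min > layers[i - 1].exp_max
               for i in range(1, len(layers)))


def layer_index(ind, layers):
    """Return the index of the layer containing ind's experience, or None."""
    for i, layer in enumerate(layers):
        if layer.exp_min <= ind.experience <= layer.exp_max:
            return i
    return None


def update_fitness(ind, trial_scores):
    """Fold a battery of trial scores into the estimate (assumed running average)."""
    total = ind.fitness * ind.experience + sum(trial_scores)
    ind.experience += len(trial_scores)
    ind.fitness = total / ind.experience


def layer_diversity(pool, layers):
    """One diversity value per layer, using only that layer's individuals.

    Assumption: diversity is the population standard deviation of fitness.
    """
    values = []
    for i in range(len(layers)):
        fits = [ind.fitness for ind in pool if layer_index(ind, layers) == i]
        values.append(pstdev(fits) if len(fits) > 1 else 0.0)
    return values


def select_for_discard(pool, layers, quota, min_diversity):
    """Pick individuals to discard from each layer that exceeds its quota.

    Assumed policy: a layer whose diversity is below min_diversity drops the
    members closest to its mean fitness; otherwise it drops the least fit.
    """
    discards = []
    diversity = layer_diversity(pool, layers)
    for i in range(len(layers)):
        members = [ind for ind in pool if layer_index(ind, layers) == i]
        excess = len(members) - quota
        if excess <= 0:
            continue
        if diversity[i] < min_diversity:
            mean = sum(m.fitness for m in members) / len(members)
            ranked = sorted(members, key=lambda m: abs(m.fitness - mean))
        else:
            ranked = sorted(members, key=lambda m: m.fitness)
        discards.extend(ranked[:excess])
    return discards
```

With two layers such as Layer(1, 10) and Layer(11, 100), layer_diversity returns two values, each computed only from the individuals whose experience falls in the corresponding range, which mirrors the "first value" and "second value" wording of the claim. The discard policy is only one possible way to make the selection depend on both experience level and diversity.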