Title of invention: Computer Implemented Method for Discovery of Markov Boundaries from Datasets with Hidden Variables
Abstract: Methods for Markov boundary discovery are important recent developments in pattern recognition and applied statistics, primarily because they offer a principled solution to the variable/feature selection problem and give insight into local causal structure. Two major families of local methods currently exist for identifying Markov boundaries from data: methods that directly implement the definition of the Markov boundary, and newer compositional Markov boundary methods that are more sample efficient and thus often more accurate in practical applications. However, in datasets with hidden (i.e., unmeasured or unobserved) variables, compositional Markov boundary methods may miss some Markov boundary members. The present invention circumvents this limitation of the compositional methods and proposes a new method that can discover Markov boundaries from datasets with hidden variables, and can do so in a much more sample efficient manner than methods that directly implement the definition of the Markov boundary. In general, the inventive method transforms a dataset with many variables into a minimal reduced dataset in which all variables are needed for optimal prediction of some response variable. The power of the invention was empirically demonstrated with data generated by Bayesian networks and with 13 real datasets from diverse application domains.
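As a point of reference for the term used in the abstract, the following minimal sketch shows what a Markov boundary looks like when the causal graph is already known. It is not the patented method: it assumes a faithful Bayesian network with no hidden variables, in which case the Markov boundary of a target is its parents, its children, and the other parents of its children. The function name `markov_boundary` and the toy network are hypothetical, introduced only for illustration.

```python
from typing import Dict, Set


def markov_boundary(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return parents, children, and spouses of `target` in a DAG.

    `parents` maps each node to the set of its direct parents.
    Assumption: the graph is fully known and has no hidden variables,
    so this is the textbook graphical definition, not a learning method.
    """
    # Children are the nodes that list the target among their parents.
    children = {node for node, pa in parents.items() if target in pa}
    # Spouses are the other parents of those children.
    spouses = {p for child in children for p in parents[child]}
    return (set(parents[target]) | children | spouses) - {target}


# Hypothetical toy network: A -> T -> C <- S, C -> D, and E unconnected.
toy_dag = {
    "A": set(),
    "S": set(),
    "E": set(),
    "T": {"A"},
    "C": {"T", "S"},
    "D": {"C"},
}

boundary = markov_boundary(toy_dag, "T")
print(sorted(boundary))  # ['A', 'C', 'S']
```

In this toy network the grandchild D and the unconnected variable E are excluded, which is the sense in which a reduced dataset keeps only the variables needed for optimal prediction of the response. Discovering the same set from data, especially when some variables are hidden, is the harder problem the abstract addresses.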
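Publication number: US2011202322(A1)  Publication date: 2011.08.18
Application number: US20100689944  Filing date: 2010.01.19
Applicant:  Inventors: STATNIKOV ALEXANDER; ALIFERIS KONSTANTINOS (CONSTANTIN) F.
Classification: G06F17/10  Main classification: G06F17/10
Agency:  Agent:
Main claim:
Address: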