Title of Invention Data Analysis Computer System and Method For Causal Discovery with Experimentation Optimization
Abstract Discovery of causal models via experimentation is essential in numerous application fields. One of the primary objectives of the invention is to minimize the use of costly experimental resources while achieving high discovery accuracy. The invention provides new methods and processes to enable accurate discovery of local causal pathways by integrating high-throughput observational data with efficient experimentation strategies. At the core of these methods are computational causal discovery techniques that account for multiplicity (i.e., indistinguishability) of causal pathways consistent with observational data. The invention, when applied for discovery of local causal pathways from a combination of observational and experimental data, achieves higher discovery accuracy than existing observational approaches and uses fewer experimental resources than existing experimental approaches. Repeated application of the invention for each variable in the modeled system produces the full causal model.
Publication No. US2014289174(A1) Publication Date 2014.09.25
Application No. US201414215877 Filing Date 2014.03.17
Applicant Statnikov Alexander;Aliferis Konstantinos (Constantin) F. Inventor Statnikov Alexander;Aliferis Konstantinos (Constantin) F.
Classification G06N99/00 Main Classification G06N99/00
Agency Agent
Principal Claim 1. A computer-implemented method and system for optimizing experimental manipulations for discovery of local causal pathways comprising the following steps: 1) applying Generalized Local Learning or another sound method to a dataset to create, from the analysis dataset, a list of variables V that are members of the local causal pathway of the response variable T; 2) if the response variable T can be experimentally manipulated, a. experimentally manipulating T and obtaining experimental data, in other words providing post-manipulation measurements of all variables in V; b. marking all variables in the set V that change in the experimental data due to manipulation of T as “direct effects” and marking remaining variables in V as “direct causes”; 3) if the response variable T cannot be experimentally manipulated, repeating the following for all variables X in the set V: a. experimentally manipulating X and obtaining experimental data; b. if T changes in the experimental data due to manipulation of X, marking X as a “direct cause” and if T does not change marking X as a “direct effect”; and 4) outputting the local causal pathway of T by identifying the causal role of each variable as either a direct effect or a direct cause in the pathway.
Address New York NY US
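
The branching logic in steps 2 to 4 of the principal claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes step 1 (Generalized Local Learning or another sound method) has already produced the variable list V, and it stands in for real experiments with two hypothetical callbacks: `vars_changed_by_target`, which reports the variables in V that change when T is manipulated, and `target_changed_by`, which reports whether T changes when a given X is manipulated. The function name, the callback names, and the toy variables A and B are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Set

DIRECT_CAUSE = "direct cause"
DIRECT_EFFECT = "direct effect"


def label_local_pathway(
    V: Iterable[str],
    target_is_manipulable: bool,
    vars_changed_by_target: Callable[[], Set[str]],
    target_changed_by: Callable[[str], bool],
) -> Dict[str, str]:
    """Assign a causal role to every variable in V (claim steps 2-4).

    V is assumed to be the output of step 1. Both callbacks are
    hypothetical stand-ins for running an experiment and testing
    whether a variable changed.
    """
    variables = list(V)
    if target_is_manipulable:
        # Step 2: a single manipulation of T labels every variable in V.
        changed = vars_changed_by_target()
        return {
            x: DIRECT_EFFECT if x in changed else DIRECT_CAUSE
            for x in variables
        }
    # Step 3: T cannot be manipulated, so manipulate each X in turn.
    return {
        x: DIRECT_CAUSE if target_changed_by(x) else DIRECT_EFFECT
        for x in variables
    }


# Toy ground truth (assumed for illustration): A -> T -> B.
roles_if_manipulable = label_local_pathway(
    ["A", "B"], True, lambda: {"B"}, lambda x: x == "A"
)
roles_if_not = label_local_pathway(
    ["A", "B"], False, lambda: {"B"}, lambda x: x == "A"
)
print(roles_if_manipulable)
print(roles_if_not)
```

In this toy setting both branches assign the same roles, but the first branch needs one experiment (manipulating T) while the second needs one experiment per variable in V, which is why the claim prefers manipulating T when that is possible.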