Title: METHODS AND SYSTEMS FOR ANALYZING HEALTHCARE DATA
Abstract: Disclosed are embodiments for creating a model capable of identifying one or more clusters in a healthcare dataset. An input is received pertaining to a range of numbers. Each number in the range of numbers is representative of a number of clusters in the healthcare dataset. For a cluster, one or more first parameters of a distribution associated with the cluster are estimated. Thereafter, a threshold value is determined based on the one or more first parameters. An inverse cumulative distribution of each of one or more n-dimensional variables in the healthcare dataset is estimated. The one or more first parameters are updated to generate one or more second parameters based on the estimated inverse cumulative distribution. A model is created for each number in the range of numbers based on the one or more second parameters.
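Publication number: US2015227691(A1)  Publication date: 2015.08.13
Application number: US201414179752  Filing date: 2014.02.13
Applicant: Xerox Corporation  Inventors: Bhattacharya Sakyajit; Rajan Vaibhav
Classification: G06F19/00; G06N99/00; G06N7/00  Main classification: G06F19/00
Agency:  Agent: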
Principal claim: 1. A method for creating a model capable of identifying one or more clusters in a healthcare dataset, the method comprising: receiving, by one or more processors, an input pertaining to a range of numbers, wherein each number in the range of numbers is representative of a number of clusters in the healthcare dataset; for a cluster in the number of clusters: estimating, by the one or more processors, one or more first parameters of a distribution associated with the cluster; estimating, by the one or more processors, an inverse cumulative distribution of each of one or more n-dimensional variables in the healthcare dataset based on a threshold value and a cumulative distribution of each of the one or more n-dimensional variables; updating, by the one or more processors, the one or more first parameters to generate one or more second parameters based on the estimated inverse cumulative distribution, wherein the updating is performed using an expectation-maximization algorithm; and creating, by the one or more processors, the model for each number in the range of numbers based on the one or more second parameters associated with each cluster in the number of clusters.
Address: Norwalk CT US
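
Illustrative sketch (not part of the patent record): the steps recited in the principal claim can be pictured roughly as follows. This is a minimal Python sketch under stated assumptions, not the applicant's implementation. It assumes that each cluster follows a diagonal-covariance Gaussian, that the inverse cumulative distribution is the standard normal quantile of each variable's empirical CDF, and that the threshold value merely clips that CDF away from 0 and 1. None of these choices is confirmed by the record. The names `normal_scores`, `fit_mixture`, and `build_models` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def normal_scores(column, threshold=1e-3):
    """Inverse-normal scores of one variable's empirical CDF.

    Assumption: the threshold only clips the CDF to [threshold, 1 - threshold].
    """
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u = min(max(rank / (n + 1), threshold), 1 - threshold)
        scores[i] = std_normal.inv_cdf(u)
    return scores


def log_gauss(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def fit_mixture(rows, k, iters=50, seed=0):
    """Fit a k-cluster diagonal Gaussian mixture with plain EM."""
    rng = random.Random(seed)
    dim = len(rows[0])
    # "First parameters": crude initial estimates (assumed initialisation).
    means = [list(r) for r in rng.sample(rows, k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each cluster for each row.
        resp = []
        for x in rows:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            probs = [math.exp(v - top) for v in logs]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: update weights, means, and variances ("second parameters").
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / len(rows)
            means[j] = [sum(r[j] * x[d] for r, x in zip(resp, rows)) / nj
                        for d in range(dim)]
            variances[j] = [
                max(sum(r[j] * (x[d] - means[j][d]) ** 2
                        for r, x in zip(resp, rows)) / nj, 1e-6)
                for d in range(dim)
            ]
    return {"k": k, "weights": weights, "means": means, "variances": variances}


def build_models(data, cluster_range, threshold=1e-3):
    """Create one fitted model per candidate number of clusters."""
    columns = [list(c) for c in zip(*data)]
    scored = [normal_scores(c, threshold) for c in columns]
    rows = [list(r) for r in zip(*scored)]
    return {k: fit_mixture(rows, k) for k in cluster_range}


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic two-group data, standing in for a healthcare dataset.
    data = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
            + [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(40)])
    models = build_models(data, range(1, 4))
    for k, model in models.items():
        print(k, [round(w, 3) for w in model["weights"]])
```

The sketch returns one model per candidate cluster count and does not choose among them, since the claim as recorded here does not state a selection criterion.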