Abstract |
<p>A method for creating a model capable of identifying one or more clusters in a multivariate healthcare dataset 202 comprises receiving an input pertaining to a range of numbers 204 (e.g. categories pertaining to medical conditions). Each number in the range of numbers is representative of a number of clusters 206 in the healthcare dataset. For a cluster, one or more first parameters of a distribution associated with the cluster are estimated. Thereafter, a threshold value is determined based on the one or more first parameters. An inverse cumulative distribution of each of one or more n-dimensional variables in the healthcare dataset is determined. The one or more first parameters are updated, based on the determined inverse cumulative distribution, to generate one or more second parameters. A model is created for each number in the range of numbers based on the one or more second parameters. Embodiments of the invention provide a method for stratifying one or more patients into one or more categories based on medical record data, such as one or more physiological markers of each of the one or more patients. A Gaussian copula mixture model (GCMM) may be utilised for identifying one or more clusters in the multivariate dataset.</p> |