Title: Predicting data unavailability and data loss events in large database systems
Abstract: Data unavailability and data loss (DU/DL) events in a large distributed database system are predicted by proactively and substantially continuously collecting information about appliance states and operations in the database system, forming feature vectors of prescribed key information features, and classifying said feature vectors as indicative of possible DU/DL events based upon their similarity and closeness to stored historical feature vectors known to be relevant to DU/DL events.
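The classification step described in the abstract can be illustrated with a minimal sketch. The record does not say which similarity measure or decision rule is used, so cosine similarity, the 0.9 threshold, and the names `cosine_similarity` and `classify_feature_vector` are all assumptions made here for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector carries no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def classify_feature_vector(current, historical, threshold=0.9):
    """Flag `current` when it is close to any stored historical vector.

    `historical` holds feature vectors assumed to be known precursors of
    DU/DL events. The threshold is a hypothetical value, not one taken
    from the patent. Returns (flagged, best_similarity).
    """
    best = max(
        (cosine_similarity(current, h) for h in historical),
        default=0.0,
    )
    return best >= threshold, best


# Hypothetical feature vectors, e.g. counts of selected appliance states.
historical_vectors = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
flagged, similarity = classify_feature_vector([2.0, 0.0, 2.0], historical_vectors)
```

Cosine similarity ignores vector magnitude, so a current vector that is a scaled copy of a historical one is still flagged; a distance-based measure would behave differently and may be closer to what "closeness" means in the abstract.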
Publication number: US9489379(B1)
Publication date: 2016.11.08
Application number: US201213723134
Filing date: 2012.12.20
Applicant: EMC Corporation
Inventors: Wu Ben; Lin Derek; Chaudhary Deepesh; Petrov Lubomir P.; Volkov Sagy
Classification: G06F17/30; G06F17/28
Main classification: G06F17/30
Agency:
Agent: Young Barry N.
Main claim: 1. A method of predicting data unavailability and data loss (DU/DL) events in a database system, comprising: collecting current state data about database appliances and the database system, said state data comprising unstructured textual records comprising natural language text relating to the operating states of database appliances and the database system; forming a first data set containing the collected current state data, said first data set characterizing the database system at the time the state data was collected; processing said unstructured textual records of said first data set directly in unstructured form using machine learning to produce a first numerical score for said first data set; analyzing directly said unstructured textual records of said first data set in said unstructured form with respect to a historical second data set that is relevant to a previous DU/DL event using said first numerical score and a second numerical score for unstructured textual records in unstructured form of the second data set to identify related conditions, wherein said unstructured textual records of said first and second data sets comprise textual information in service requests about problems; and predicting using machine learning a numerical probability of a DU/DL event occurring in said database system based upon said related conditions.
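The flow in the claim (score unstructured service-request text, compare it with the score of historical text tied to a past DU/DL event, then output a probability) can be sketched roughly as follows. This is not the claimed method: the claim calls for machine learning, whereas this sketch substitutes a hand-set term-weight table and a logistic mapping as placeholders. The names `text_score`, `du_dl_probability`, `TERM_WEIGHTS`, and the `slope` value are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical weights; a real system would learn these from labeled history.
TERM_WEIGHTS = {"failure": 1.0, "corrupt": 1.0, "timeout": 0.5}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def text_score(text, term_weights=TERM_WEIGHTS):
    """Weighted frequency of risk-related terms, normalized by token count."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    weighted = sum(term_weights.get(t, 0.0) * c for t, c in counts.items())
    return weighted / len(tokens)


def du_dl_probability(current_score, historical_score, slope=10.0):
    """Map the gap between the two scores to a value in (0, 1).

    Equal scores give 0.5; a current score above the historical one pushes
    the result toward 1. The logistic form and slope are assumptions.
    """
    gap = current_score - historical_score
    return 1.0 / (1.0 + math.exp(-slope * gap))


# Hypothetical service-request texts.
first_score = text_score("Disk failure on segment host")
second_score = text_score("Segment host reported disk failure before outage")
probability = du_dl_probability(first_score, second_score)
```

In this toy setup the historical score acts as a reference level, so the output only says how the current text compares with one past incident; a trained classifier over many historical records would be needed to produce a calibrated probability.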
Address: Hopkinton, MA, US