Title of invention Framework for calculating grouped optimization algorithms within a distributed data store
Abstract A framework for executing iterative grouped optimization algorithms such as machine learning and other analytic algorithms directly on unsorted data within a SQL data store without first redistributing the data comprises an architecture that provides C++ abstraction layers that include the algorithms over a SQL data store, and a higher Python abstraction layer that includes grouping and iteration controllers and call functionality to the C++ layer for invocation of the algorithms.
Publication number US9324036(B1) Publication date 2016.04.26
Application number US201313931876 Filing date 2013.06.29
Applicant EMC Corporation Inventors Iyer Rahul; Qian Hai; Yang Shengwen; Welton Caleb E.
Classification G06N99/00 Main classification G06N99/00
Agency Van Pelt, Yi & James LLP Agent Van Pelt, Yi & James LLP
Principal claim 1. A method of analyzing data within a distributed database having a plurality of database segments, comprising: grouping, using a grouping process running within the database, instances of data into one or more groups such that each group comprises data instances having one or more common attribute values that characterize the group, wherein the instances of data are grouped into the one or more groups without the data instances being redistributed; running a first iteration of an analytic algorithm within the database on each of the one or more groups to generate a predictive model for each group; running subsequent iterations of said analytic algorithm on each group using as an input model for each subsequent iteration for each group the predictive model for such group generated by a preceding iteration; and updating in said database said predictive model generated by the preceding iteration with results of said analytic algorithm generated by said subsequent iteration.
Address Hopkinton MA US
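
The control flow described in the principal claim (group the rows by a common attribute without reordering them, fit a model per group, then feed each group's model back in as the input to the next iteration) can be illustrated with a small in-memory sketch. This is not the patented framework: it does not run inside a database and has no C++ layer. The function name `grouped_iterative_fit`, the row layout `(group_key, x, y)`, and the choice of gradient-descent linear regression as the analytic algorithm are all assumptions made purely for illustration.

```python
from collections import defaultdict


def grouped_iterative_fit(rows, n_iter=200, lr=0.1):
    """Fit y ~ w*x + b separately for each group by batch gradient descent.

    rows is a re-iterable sequence of (group_key, x, y) tuples in any order.
    Hypothetical helper: an in-memory stand-in for the grouped, iterative
    flow in the claim, not the framework's actual API.
    """
    # Per-group model state (w, b), keyed by the grouping attribute value.
    models = defaultdict(lambda: (0.0, 0.0))

    for _ in range(n_iter):
        # Single pass over the unsorted rows: each row updates only the
        # accumulator of its own group, so no sort or reshuffle is needed.
        grads = defaultdict(lambda: [0.0, 0.0, 0])
        for key, x, y in rows:
            w, b = models[key]  # model produced by the preceding iteration
            err = (w * x + b) - y
            acc = grads[key]
            acc[0] += err * x   # gradient contribution for w
            acc[1] += err       # gradient contribution for b
            acc[2] += 1         # row count for this group

        # Update each group's stored model with this iteration's result.
        for key, (gw, gb, n) in grads.items():
            w, b = models[key]
            models[key] = (w - lr * gw / n, b - lr * gb / n)

    return dict(models)


if __name__ == "__main__":
    # Two groups interleaved in arbitrary order.
    data = [("a", 0.0, 1.0), ("b", 1.0, 2.0), ("a", 1.0, 3.0), ("b", 0.0, 3.0)]
    for key, (w, b) in sorted(grouped_iterative_fit(data, 2000, 0.5).items()):
        print(key, round(w, 3), round(b, 3))
```

Keeping one accumulator per group key is what lets a single scan serve every group at once. In a distributed store, one would expect per-segment accumulators of this kind to be merged before the update step, but that merge is outside what this sketch shows.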