Title of invention: DISTRIBUTED, MULTI-MODEL, SELF-LEARNING PLATFORM FOR MACHINE LEARNING
Abstract: A system is provided for multi-methodology, multi-user, self-optimizing Machine Learning as a Service that automates and optimizes the model training process. The system uses a large-scale distributed architecture and is compatible with cloud services. The system uses a hybrid optimization technique to select between multiple machine learning approaches for a given dataset. The system can also use datasets to transfer knowledge of how a modeling methodology has previously performed over to a new problem.
Publication number: US2016132787(A1) | Publication date: 2016.05.12
Application number: US201514598628 | Filing date: 2015.01.16
Applicant: Drevo Will D.; Veeramachaneni Kalyan K.; O'Reilly Una-May | Inventor: Drevo Will D.; Veeramachaneni Kalyan K.; O'Reilly Una-May
Classification: G06N99/00 | Main classification: G06N99/00
Agency: | Agent:
Principal claim: 1. A system to automate selection and training of machine learning models across multiple modeling methodologies, the system comprising: a model methodology repository configured to store one or more model methodology implementations, each of the model methodology implementations associated with a modeling methodology; a dataset repository configured to store datasets; a data hub configured to store data run records and performance records; a dataset upload interface (UI) configured to receive a dataset, to store the received dataset within the dataset repository, to generate a data run record comprising the location of the received dataset within the dataset repository, and to store the generated data run record to the data hub; and a processing cluster comprising a plurality of worker nodes, each of the worker nodes configured to select a data run record from the data hub, to select a dataset from the dataset repository, to select a modeling methodology from the model methodology repository, to generate a parameterization within the modeling methodology, to generate a model having the selected modeling methodology and generated parameterization, to train the generated model on the selected dataset, to evaluate the performance of the trained model on the selected dataset, to generate a performance record, and to store the generated performance record to the data hub.
Address: Cambridge MA US
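
The per-worker sequence recited in the principal claim (select a data run record, fetch the dataset, pick a methodology, generate a parameterization, train, evaluate, store a performance record) can be illustrated with a minimal single-process sketch. Everything here is an assumption made for illustration: the in-memory dictionaries standing in for the repositories and data hub, the two toy methodologies, the record field names, and the function names `upload_dataset` and `worker_step` are all hypothetical. Selection is uniform random as a placeholder; it is not the hybrid optimization technique mentioned in the abstract, and a real deployment would run many workers against shared storage.

```python
import random
from collections import Counter

# In-memory stand-ins for the claimed components (all names are hypothetical).
dataset_repository = {}  # location -> list of (x, label) rows
data_hub = {"data_runs": [], "performance_records": []}


def majority_label(labels):
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


class MajorityClass:
    """Toy methodology with no hyperparameters: always predict the majority label."""
    name = "majority_class"

    @staticmethod
    def sample_params(rng):
        return {}

    @staticmethod
    def train(params, rows):
        label = majority_label([y for _, y in rows])
        return lambda x: label


class NearestNeighbors:
    """Toy 1-D k-nearest-neighbors methodology; k is the sampled hyperparameter."""
    name = "k_nearest_neighbors"

    @staticmethod
    def sample_params(rng):
        return {"k": rng.choice([1, 3, 5])}

    @staticmethod
    def train(params, rows):
        k = params["k"]

        def predict(x):
            nearest = sorted(rows, key=lambda row: abs(row[0] - x))[:k]
            return majority_label([y for _, y in nearest])

        return predict


model_methodology_repository = {
    m.name: m for m in (MajorityClass, NearestNeighbors)
}


def upload_dataset(location, rows):
    """Store the dataset, then record a data run that points at its location."""
    dataset_repository[location] = rows
    data_run = {"id": len(data_hub["data_runs"]), "dataset_location": location}
    data_hub["data_runs"].append(data_run)
    return data_run


def worker_step(rng):
    """One pass of the worker-node sequence; selection is random (an assumption)."""
    data_run = rng.choice(data_hub["data_runs"])
    rows = dataset_repository[data_run["dataset_location"]]
    name = rng.choice(sorted(model_methodology_repository))
    methodology = model_methodology_repository[name]
    params = methodology.sample_params(rng)
    predict = methodology.train(params, rows)
    # Evaluated on the same dataset the model was trained on, as the claim words it.
    accuracy = sum(predict(x) == y for x, y in rows) / len(rows)
    record = {
        "data_run_id": data_run["id"],
        "methodology": name,
        "params": params,
        "accuracy": accuracy,
    }
    data_hub["performance_records"].append(record)
    return record
```

A caller would first invoke `upload_dataset("datasets/toy.csv", rows)` with a list of `(x, label)` pairs and then call `worker_step(random.Random(0))` repeatedly; each call appends one performance record to the data hub, which a later selection strategy could read to decide what to try next.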