Independent claim |
1. A method, comprising:
embedding in each of a plurality of distributed processing segments a library or other shared object comprising one or more data analytical functions;
receiving, by a master node, a data analysis request;
creating, by the master node, a plan to generate a response to the request;
assigning to each of the plurality of distributed processing segments a corresponding portion of the plan to be performed by that segment, including by invoking, as indicated in the assignment, one or more data analytical functions embedded in the processing segment;
obtaining, by the master node, metadata associated with one or more portions of the plan to be performed by one or more corresponding segments, wherein the master node obtains the metadata from a central metadata store, and wherein the metadata identifies a location of data corresponding to the one or more portions of the plan and at least a part of one or more data analytic processing operations to be performed in connection with processing the corresponding one or more portions of the plan;
sending, by the master node to each of the plurality of distributed processing segments for which a portion of the plan is assigned, the corresponding portion of the plan to be performed by that segment and the metadata, wherein the metadata is used to locate or access a subset of data on which the segment is to perform an indicated processing;
receiving, from each of the plurality of distributed processing segments for which a portion of the plan is assigned, a corresponding result of processing the portion of the plan; and
generating a master response to the data analysis request based at least in part on the corresponding result of processing the portion of the plan received from each of the plurality of distributed processing segments for which a portion of the plan is assigned. |
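The flow recited in the claim can be illustrated with a minimal, single-process Python sketch. It is not the patented implementation: every name here (`ANALYTIC_LIBRARY`, `Metadata`, `PlanPortion`, `Segment`, `Master`, the `sales/part-N` locations) is a hypothetical stand-in, the "central metadata store" and segment storage are plain dictionaries, and the only request supported is a mean computed from per-segment partial sums and counts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical analytical functions; in the claim these live in a library or
# shared object embedded in every processing segment.
ANALYTIC_LIBRARY: Dict[str, Callable[[List[float]], float]] = {
    "sum": lambda values: float(sum(values)),
    "count": lambda values: float(len(values)),
}


@dataclass
class Metadata:
    location: str         # where this segment's subset of the data lives
    functions: List[str]  # part of the analytic processing to perform


@dataclass
class PlanPortion:
    segment_id: int
    table: str


class Segment:
    def __init__(self, segment_id: int, storage: Dict[str, List[float]]):
        self.segment_id = segment_id
        self.library = dict(ANALYTIC_LIBRARY)  # embedded copy of the library
        self.storage = storage                 # stand-in for distributed storage

    def execute(self, portion: PlanPortion, metadata: Metadata) -> Dict[str, float]:
        # The metadata tells the segment which subset of data to read and
        # which embedded functions to invoke on it.
        data = self.storage[metadata.location]
        return {name: self.library[name](data) for name in metadata.functions}


class Master:
    def __init__(self, segments: List[Segment],
                 metadata_store: Dict[Tuple[str, int], Metadata]):
        self.segments = segments
        self.metadata_store = metadata_store  # stand-in for the central store

    def mean(self, table: str) -> float:
        # Create the plan: one portion per segment.
        plan = [PlanPortion(s.segment_id, table) for s in self.segments]
        partials = []
        for segment, portion in zip(self.segments, plan):
            # Obtain metadata for the portion, then send both to the segment.
            metadata = self.metadata_store[(table, portion.segment_id)]
            partials.append(segment.execute(portion, metadata))
        # Generate the master response from the per-segment results.
        total = sum(p["sum"] for p in partials)
        count = sum(p["count"] for p in partials)
        return total / count


# Usage with made-up data split across two segments.
storage = {"sales/part-0": [1.0, 2.0, 3.0], "sales/part-1": [4.0, 5.0]}
metadata_store = {
    ("sales", 0): Metadata("sales/part-0", ["sum", "count"]),
    ("sales", 1): Metadata("sales/part-1", ["sum", "count"]),
}
segments = [Segment(0, storage), Segment(1, storage)]
master = Master(segments, metadata_store)
print(master.mean("sales"))  # 3.0
```

In this sketch the master never touches the underlying rows: it only plans, looks up metadata, dispatches, and combines partial results, which mirrors the division of labor the claim describes. A real system would dispatch to segments over a network and in parallel rather than in a local loop.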