Independent claim |
1. A distributed database system comprising:
a plurality of segment hosts each comprising one or more processors; and
a master host comprising one or more processors, wherein:
the master host is programmed to perform operations comprising:
submitting a map-reduce document as an input to a map-reduce program executing on the master host, the map-reduce program configured to cause operations specified in the map-reduce document to be executed in the distributed database system in parallel, wherein the map-reduce document comprises an input source and a map-reduce function definition, wherein:
the input source includes a query in Structured Query Language (SQL), and
the map-reduce function definition defines, in a scripting language that is different from SQL, a map function to be performed on the input source and a reduce function to be performed on results of the map function; and
distributing, using the map-reduce program, the map function and reduce function to the segment hosts as tasks; and
each of the segment hosts is programmed to perform the tasks, including executing, as SQL queries, both the map function and reduce function defined in the map-reduce function definition and the query of the input source. |
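
The arrangement recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes Python as the scripting language and uses in-memory SQLite connections as stand-ins for segment hosts. The names `INPUT_QUERY`, `map_fn`, `ReduceSum`, `make_segment`, `run_on_segment`, and `run_job` are hypothetical, and the segments are visited serially rather than in parallel. The master-side merge of per-segment results is a simplification added for the example and is not recited in the claim.

```python
import sqlite3
from collections import Counter

# Input source of the hypothetical map-reduce document: a query in SQL.
INPUT_QUERY = "SELECT word FROM words"


def map_fn(word):
    """Map function, written in Python rather than SQL: emit a normalized key."""
    return word.strip().lower()


class ReduceSum:
    """Reduce function, written in Python: sum the values seen for one key."""

    def __init__(self):
        self.total = 0

    def step(self, value):
        self.total += value

    def finalize(self):
        return self.total


def make_segment(rows):
    """Stand-in for a segment host: an in-memory database holding a data slice."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word TEXT)")
    conn.executemany("INSERT INTO words VALUES (?)", [(r,) for r in rows])
    return conn


def run_on_segment(conn):
    """Register the distributed functions, then run map, reduce, and input as SQL."""
    conn.create_function("map_fn", 1, map_fn)
    conn.create_aggregate("reduce_fn", 1, ReduceSum)
    sql = (
        "SELECT map_fn(word) AS key, reduce_fn(1) AS value "
        f"FROM ({INPUT_QUERY}) GROUP BY map_fn(word)"
    )
    return conn.execute(sql).fetchall()


def run_job(segments):
    """Stand-in for the master host: send the task to each segment, merge results."""
    totals = Counter()
    for conn in segments:  # serial here; the claim recites parallel execution
        for key, value in run_on_segment(conn):
            totals[key] += value
    return dict(totals)


segments = [
    make_segment(["Apple", "pear ", "apple"]),
    make_segment(["PEAR", "fig"]),
]
result = run_job(segments)
print(result)
```

In this sketch each stand-in segment evaluates a single SQL statement that wraps the input query, calls the Python map function as a scalar SQL function, and calls the Python reduce function as an SQL aggregate, which mirrors the claim language "executing, as SQL queries, both the map function and reduce function ... and the query of the input source".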