Title: Federated query engine for federation of data queries across structure and unstructured data
Abstract: The subject technology provides querying structured and unstructured data across disparate incompatible systems with a single language and connection point. Cost based optimizations are provided for executing the query. In some configurations, logical plans for executing a query are generated. For each of the logical plans, the subject technology generates a set of physical plans for executing the query on data systems, determines an execution cost for each physical plan from the physical plans, and selects a respective physical plan with a lowest determined execution cost among the determined execution cost for each physical plan. A physical plan is then selected for execution with a lowest execution cost among the selected respective physical plans of each of the logical plans. Data from an operation from the query may then be persisted and then used for generating a new set of logical and physical plans for executing a remaining set of operations from the query.
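The two-level selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `PhysicalPlan` class, the `cheapest_plan` helper, the plan names, and all cost figures are hypothetical, and the costs are supplied directly rather than estimated from any real data system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalPlan:
    name: str    # hypothetical label for a concrete execution strategy
    cost: float  # hypothetical estimated execution cost


def cheapest_plan(plans_by_logical):
    """Pick the cheapest physical plan across all logical plans.

    plans_by_logical maps a logical plan name to its list of candidate
    physical plans. Raises ValueError if no physical plan exists at all.
    """
    # Level 1: within each logical plan, keep its lowest-cost physical plan.
    best_per_logical = {
        logical: min(physical, key=lambda p: p.cost)
        for logical, physical in plans_by_logical.items()
        if physical
    }
    # Level 2: among those per-logical winners, keep the overall cheapest.
    return min(best_per_logical.values(), key=lambda p: p.cost)


# Made-up candidates for illustration only.
candidates = {
    "join_then_filter": [
        PhysicalPlan("hadoop_join", 120.0),
        PhysicalPlan("rdbms_join", 80.0),
    ],
    "filter_then_join": [
        PhysicalPlan("pushdown_filter", 45.0),
        PhysicalPlan("local_filter", 60.0),
    ],
}
chosen = cheapest_plan(candidates)
print(chosen.name, chosen.cost)
```

Keeping the per-logical-plan winners as an explicit intermediate step mirrors the wording of the abstract; a single flat minimum over all physical plans would return the same plan.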
Publication No.: US9330141(B2) Publication Date: 2016.05.03
Application No.: US201213631707 Filing Date: 2012.09.28
Applicant: CIRRO, INC. Inventors: Salch David Herbert; Jew Brian Christopher; Theissen Mark Robert
Classification: G06F7/00; G06F17/30; G06F21/62 Main Classification: G06F7/00
Agency: McDermott Will & Emery LLP Agent: McDermott Will & Emery LLP
Main Claim: 1. A machine-implemented method, the method comprising: receiving a query for data stored across a plurality of data systems; generating a plurality of first logical plans for executing the query; for each of the plurality of first logical plans: generating a first set of physical plans for executing the query on the plurality of data systems based on the respective first logical plan; determining an execution cost for each physical plan from the first set of physical plans; and selecting the physical plan with a lowest determined execution cost from the first set of physical plans; selecting a physical plan with a lowest execution cost from among the respective physical plans selected for each of the plurality of first logical plans; executing an operation from the query according to the physical plan selected from among the respective physical plans selected for each of the plurality of first logical plans; updating the query based on results from the executed operation; generating a plurality of second logical plans for executing the updated query and a respective second set of physical plans for each of the plurality of second logical plans; determining an execution cost for each physical plan from the respective second sets of physical plans for the plurality of second logical plans; selecting a physical plan with a lowest determined execution cost among the respective second sets of physical plans for the plurality of second logical plans; executing an operation from the updated query according to the physical plan selected from among the respective second sets of physical plans; updating the updated query based on results from the executed operation from the updated query; and repeating the steps for generating a plurality of second logical plans and respective second sets of physical plans, determining an execution cost for each physical plan from the second sets of physical plans, selecting a physical plan with a lowest determined execution cost among the second sets of physical plans, executing an operation from the updated query, and updating the updated query until the updated query is complete.
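The iterative plan, execute, update, re-plan loop recited in the claim can be sketched as below. This is a simplified sketch under loud assumptions, not the claimed method itself: a "logical plan" is modeled as an ordering of the remaining operations, a "physical plan" as an assignment of each operation to a data system, and the `RATE` and `SELECTIVITY` tables, the system names, and the row-count cost model are all hypothetical. Executing an operation is simulated by shrinking the row count.

```python
from itertools import permutations, product

SYSTEMS = ("hadoop", "rdbms")  # hypothetical data systems

# Hypothetical cost per input row for running an operation on a system.
RATE = {
    ("filter", "hadoop"): 2.0, ("filter", "rdbms"): 1.0,
    ("aggregate", "hadoop"): 1.0, ("aggregate", "rdbms"): 3.0,
}
# Hypothetical fraction of input rows that each operation outputs.
SELECTIVITY = {"filter": 0.1, "aggregate": 0.5}


def plan_cost(steps, rows):
    """Estimated cost of running (operation, system) steps in order."""
    cost = 0.0
    for op, system in steps:
        cost += rows * RATE[(op, system)]
        rows *= SELECTIVITY[op]
    return cost


def select_plan(remaining_ops, rows):
    """Cheapest physical plan over all logical plans for the remaining ops."""
    winners = []
    for logical in permutations(remaining_ops):  # one logical plan per ordering
        physical = [
            tuple(zip(logical, systems))
            for systems in product(SYSTEMS, repeat=len(logical))
        ]
        # Keep the cheapest physical plan for this logical plan.
        winners.append(min(physical, key=lambda steps: plan_cost(steps, rows)))
    # Keep the cheapest plan among the per-logical-plan winners.
    return min(winners, key=lambda steps: plan_cost(steps, rows))


def run_query(ops, rows):
    """Execute one operation per iteration, re-planning after each one."""
    remaining, trace = list(ops), []
    while remaining:  # repeat until the updated query is complete
        op, system = select_plan(remaining, rows)[0]
        trace.append((op, system))
        rows *= SELECTIVITY[op]  # stand-in for the persisted result size
        remaining.remove(op)     # update the query with the executed step
    return trace


trace = run_query(["aggregate", "filter"], rows=1000)
print(trace)
```

In this toy model only the first step of the selected plan is executed in each iteration; the rest of that plan is discarded and recomputed from the updated row count, which is the part of the claim the sketch is meant to illustrate.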
Address: Aliso Viejo CA US