Title of invention: MULTISOURCE SEMANTIC PARTITIONING
Abstract: Methods, systems, and computer program products for processing a query to determine query results. The query may be analyzed to determine a column constant pair corresponding to the query. That column constant pair may be compared with a column constant pair associated with a partitioned data set in order to route the query to a subset of the data set. A data set may be partitioned into subsets by analyzing historical queries to determine a partitioning column constant pair for the data set, which is then used to divide the data into subsets. The query processing may include both query routing and data set partitioning.
Publication number: US2016147837(A1) Publication date: 2016.05.26
Application number: US201414550166 Filing date: 2014.11.21
Applicant: Red Hat, Inc. Inventors: Nguyen Filip; Elias Filip
Classification number: G06F17/30 Main classification number: G06F17/30
Agency: Agent:
Principal claim: 1. A computer-implemented method for federated query processing comprising: receiving one or more source queries associated with a data set; storing the one or more source queries as one or more historical queries; determining one or more column constant pairs associated with the one or more historical queries; based on the one or more column constant pairs, determining a partitioning column constant pair; determining a first subset of the one or more column constant pairs that has a first pre-defined relation to the partitioning column constant pair; determining a second subset of the one or more column constant pairs that has a second pre-defined relation to the partitioning column constant pair; based on the partitioning column constant pair, partitioning the data set into a first subset of the data set and a second subset of the data set; receiving a source query; determining a source column constant pair associated with the source query; comparing the source column constant pair to the partitioning column constant pair; and based on the comparing, generating a result of the source query from at least one of the following: a view, the first subset of the data set, and the second subset of the data set.
Address: Raleigh, NC, US
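
Illustrative sketch (not part of the patent record): the partition-then-route flow recited in the principal claim can be illustrated with a small Python example. Everything concrete here is an assumption made for illustration and is not taken from the patent: the predicate grammar (`column op integer`), the heuristic of picking the most frequently queried column and the low median of its constants as the partitioning column constant pair, the "less than / greater than or equal" split as the two pre-defined relations, and all function names (`column_constant_pairs`, `choose_partitioning_pair`, `partition`, `route`, `run_query`). Only the first predicate on the partitioning column is used for routing, so the sketch assumes single-predicate or purely conjunctive queries.

```python
import operator
import re
from collections import Counter
from statistics import median_low

# Hypothetical predicate grammar: <column> <op> <integer constant>.
PREDICATE = re.compile(r"(\w+)\s*(<=|>=|<|>|=)\s*(\d+)")
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "=": operator.eq}


def column_constant_pairs(query):
    """Extract (column, operator, constant) triples from a query string."""
    return [(col, op, int(const)) for col, op, const in PREDICATE.findall(query)]


def choose_partitioning_pair(historical_queries):
    """Assumed heuristic: most-queried column, low median of its constants."""
    pairs = [p for q in historical_queries for p in column_constant_pairs(q)]
    column = Counter(col for col, _, _ in pairs).most_common(1)[0][0]
    constant = median_low([const for col, _, const in pairs if col == column])
    return column, constant


def partition(rows, pair):
    """Split rows into (value < constant) and (value >= constant)."""
    column, constant = pair
    first = [r for r in rows if r[column] < constant]
    second = [r for r in rows if r[column] >= constant]
    return first, second


def route(query, pair):
    """Return 'first', 'second', or 'view' (the union of both subsets)."""
    column, constant = pair
    for col, op, const in column_constant_pairs(query):
        if col != column:
            continue
        if (op == "<" and const <= constant) or (op in ("<=", "=") and const < constant):
            return "first"
        if op in (">", ">=", "=") and const >= constant:
            return "second"
        break  # predicate straddles the boundary: fall back to the view
    return "view"


def run_query(query, first, second, pair):
    """Evaluate the query's predicates against the routed target only."""
    target = {"first": first, "second": second, "view": first + second}[route(query, pair)]
    preds = column_constant_pairs(query)
    return [r for r in target if all(OPS[op](r[col], const) for col, op, const in preds)]


history = [
    "SELECT * FROM people WHERE age < 30",
    "SELECT * FROM people WHERE age >= 40",
    "SELECT * FROM people WHERE age < 35",
    "SELECT * FROM people WHERE salary > 1000",
]
rows = [
    {"age": 25, "salary": 900},
    {"age": 35, "salary": 1500},
    {"age": 50, "salary": 2000},
]
pair = choose_partitioning_pair(history)
first, second = partition(rows, pair)
```

With this sample history the sketch selects `("age", 35)` as the partitioning pair. A query such as `age < 30` is then answered from the first subset alone, `age >= 40` from the second, and a query on `salary` or one that straddles the boundary (for example `age > 20`) falls back to the view over both subsets.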