Title: Gathering index statistics using sampling
Abstract: An approach is provided in which a sample point system allocates sample point identifiers to a root node included in an index tree that includes multiple leaf nodes. The sample point system distributes the sample point identifiers to the root node's child nodes, then recursively traverses the index tree's hierarchical index levels and distributes the sample point identifiers from the child nodes to a subset of the index tree's leaf nodes. In turn, the sample point system collects sample data from the subset of leaf nodes corresponding to the distributed sample point identifiers.
Publication number: US9189518(B2) Publication date: 2015.11.17
Application number: US201213656355 Filing date: 2012.10.19
Applicant: International Business Machines Corporation Inventors: Lashley Scott D.; Miao Bingjie
Classification: G06F17/30 Main classification: G06F17/30
Agency: VanLeeuwen & VanLeeuwen Agent: VanLeeuwen & VanLeeuwen; Kashef Mohammed Y.
Principal claim: 1. An information handling system comprising: one or more processors; a memory coupled to at least one of the one or more processors; one or more non-volatile storage areas coupled to at least one of the one or more processors; a set of computer program instructions stored in the memory and executed by at least one of the processors in order to perform actions of: allocating a plurality of sample point identifiers to a root node included in an index tree corresponding to a database stored in one of the non-volatile storage areas, the index tree including a plurality of leaf nodes; identifying, by at least one of the one or more processors, a number of first nodes included in a plurality of first nodes that are child nodes of the root node; computing, by at least one of the one or more processors, a distribution average based upon the number of first nodes and a number of sample point identifiers included in the plurality of sample point identifiers; distributing the plurality of sample point identifiers to the plurality of first nodes, wherein an amount of the plurality of sample point identifiers distributed to each of the plurality of first nodes does not exceed the distribution average; recursively traversing through a plurality of hierarchical index levels included in the index tree and distributing the plurality of sample point identifiers from the plurality of first nodes to a subset of the plurality of leaf nodes; collecting sample data from the subset of the plurality of leaf nodes corresponding to the distributed plurality of sample point identifiers; and generating a query plan corresponding to the database based upon the collected sample data.
Address: Armonk NY US
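Illustrative sketch (not part of the patent record): the distribution step recited in the principal claim can be pictured with a small Python example. It is a minimal sketch under stated assumptions, not the patented implementation. The `Node` class, the function names, and the demo tree are hypothetical. The record does not say how the distribution average is rounded or which child nodes receive identifiers when there are fewer identifiers than children, so the sketch assumes the average is rounded up and that identifiers are placed at evenly spaced child positions.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical index-tree node; a node without children is a leaf."""
    name: str
    children: list = field(default_factory=list)
    keys: list = field(default_factory=list)  # index entries held by a leaf


def distribute(node, sample_ids, assigned):
    """Recursively push sample point identifiers from `node` down to leaves."""
    if not sample_ids:
        return
    if not node.children:
        # Leaf reached: record which identifiers landed here.
        assigned.setdefault(node.name, []).extend(sample_ids)
        return
    n, c = len(sample_ids), len(node.children)
    cap = math.ceil(n / c)  # ASSUMPTION: distribution average, rounded up
    buckets = [[] for _ in node.children]
    for i, sid in enumerate(sample_ids):
        # ASSUMPTION: place identifier i at the midpoint of the i-th of n
        # equal slices of the child list, so identifiers spread evenly.
        buckets[(2 * i + 1) * c // (2 * n)].append(sid)
    # No child receives more than the (rounded-up) distribution average.
    assert all(len(b) <= cap for b in buckets)
    for child, bucket in zip(node.children, buckets):
        distribute(child, bucket, assigned)


def collect_samples(root, num_samples):
    """Allocate identifiers 0..num_samples-1 to the root and distribute them.

    Returns a dict mapping leaf name -> identifiers assigned to that leaf.
    """
    assigned = {}
    distribute(root, list(range(num_samples)), assigned)
    return assigned


def build_demo_tree():
    """Hypothetical three-level tree: root, 3 branch nodes, 9 leaves."""
    branches = []
    for b in range(3):
        leaves = [Node(f"leaf-{b}-{k}", keys=[b * 30 + k * 10 + j for j in range(10)])
                  for k in range(3)]
        branches.append(Node(f"branch-{b}", children=leaves))
    return Node("root", children=branches)


if __name__ == "__main__":
    root = build_demo_tree()
    assigned = collect_samples(root, 4)
    for leaf_name, ids in sorted(assigned.items()):
        print(leaf_name, ids)
```

Only the leaves that appear in the returned mapping would be read to collect sample data. A statistics collector could then scan the `keys` of those leaves alone, instead of every leaf, before handing the estimates to the query planner.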