Title of invention: HISTOGRAM CONSTRUCTION FOR STRING DATA
Abstract: Methods and systems for generating histograms for strings are described. In one implementation, a prefix tree is generated whose nodes represent prefixes of the strings. Deploy weights are assigned to the nodes of the prefix tree based on the lengths of the prefixes represented by the sub-tree nodes rooted at each node and on the frequencies of the strings whose prefixes those sub-tree nodes represent. Each deploy weight of a node indicates the maximum weight preserved when the buckets are filled with at least one prefix represented by the sub-tree nodes rooted at that node. A predefined number of Top-prefixes is determined for filling the predefined number of buckets. The Top-prefixes are determined by maximizing the total weight preserved by the prefixes in the buckets and over a maximum number of strings. A histogram is generated based on the deploy weights associated with the Top-prefixes.
Publication number: WO2014176754(A1) Publication date: 2014.11.06
Application number: WO2013CN75033 Filing date: 2013.04.30
Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.;LUO, GE;JIAO, LI-MEI;CAO, ZHAO;CHEN, SHIMIN;GUO, MENG Inventor: LUO, GE;JIAO, LI-MEI;CAO, ZHAO;CHEN, SHIMIN;GUO, MENG
Classification: G06F17/21;G06F9/45 Main classification: G06F17/21
Agency: Agent:
Main claim:
Address:
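Illustrative note: the idea in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the claimed method. It assumes that a string "preserves" a weight equal to the length of its longest chosen prefix, that ties in total weight are broken by the number of strings covered, and it finds the Top-prefixes by exhaustive search rather than by computing deploy weights on a prefix tree. The function names `prefix_counts`, `score`, `top_prefixes`, and `histogram` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def prefix_counts(strings):
    """Count, for every non-empty prefix, how many strings start with it.

    This flat map stands in for the prefix tree: each key is one node.
    """
    counts = Counter()
    for s in strings:
        for i in range(1, len(s) + 1):
            counts[s[:i]] += 1
    return counts


def score(strings, chosen):
    """Return (preserved weight, strings covered) for a set of prefixes.

    Assumption: each string preserves the length of its longest chosen
    prefix, or 0 if no chosen prefix matches it.
    """
    weight = covered = 0
    for s in strings:
        longest = max((len(p) for p in chosen if s.startswith(p)), default=0)
        weight += longest
        covered += longest > 0
    return weight, covered


def top_prefixes(strings, num_buckets):
    """Exhaustively pick num_buckets prefixes with the best score.

    Only practical for tiny inputs; shown for clarity, not efficiency.
    """
    candidates = sorted(prefix_counts(strings))
    k = min(num_buckets, len(candidates))
    best, best_score = (), (-1, -1)
    for combo in combinations(candidates, k):
        current = score(strings, combo)
        if current > best_score:  # compares weight first, then coverage
            best, best_score = combo, current
    return list(best), best_score


def histogram(strings, chosen):
    """Assign each string to the bucket of its longest chosen prefix."""
    hist = Counter()
    for s in strings:
        matches = [p for p in chosen if s.startswith(p)]
        if matches:
            hist[max(matches, key=len)] += 1
    return dict(hist)
```

For example, calling `top_prefixes(["apple", "apple", "apply", "banana", "band"], 2)` and passing the chosen prefixes to `histogram` yields one bucket per selected prefix with its string count. A dynamic program over the tree, as the abstract's deploy weights suggest, would avoid the exponential search used here.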