Title Streaming roll-ups on statistical data using a stack
Abstract A method for handling data includes receiving a data record. The method further includes adding the count value of the data record to a count value of the top row of a stack when the data record field values are included among the field values of the top row. The method also includes rolling up the top row when less than all of the data record field values are included among the top row field values and the count value of the top row is less than a threshold value. The method further includes outputting the top row and inserting the data record onto the stack as the new top row. The method may also include removing the top row and adding its count value to the count value of a new top row. A system for handling data includes a streaming data handler and a sorter.
Publication number US9384260(B1) Publication date 2016.07.05
Application number US201414510019 Filing date 2014.10.08
Applicant Google Inc. Inventors Gupta Ashish; Jiang Haifeng
Classification G06F15/16; G06F17/30 Main classification G06F15/16
Agency Fish & Richardson P.C. Agent Fish & Richardson P.C.
Principal claim 1. A computer-implemented method for handling data comprising: (a) receiving an electronic data record having one or more fields and a count value; (b) adding, by one or more computer systems, the count value of the data record to a count value of the top row of a stack when all values of the one or more fields of the data record are included among one or more field values of the top row; (c) rolling up, by one or more computer systems, the top row when less than all of the one or more field values of the data record are included among the one or more field values of the top row and the count value of the top row is less than a threshold value; and removing the top row and adding the count value of the removed top row to the count value of the new top row when all of the one or more field values of the top row are included among the one or more field values of the row below the top row after roll-up; (d) outputting, by one or more computer systems, the top row when a top row of the stack is a super row of a last output row, wherein the top row is a super row of the last output row if non-wildcard field values of the top row remain the same as the field values of the last output row after one or more rollups of the last output row; and outputting the top row and inserting the data record onto the stack as the new top row when less than all of the one or more field values of the data record are included among the one or more field values of the top row and the count value of the top row is equal to or greater than the threshold value; and (e) reporting the output top row to a user as an indication of data record grouping.
Address Mountain View CA US
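
Illustrative sketch (not part of the patent record). The stack handling in steps (b) through (d) of claim 1 can be approximated in Python as shown here. This is a simplified reading under stated assumptions, not the patented implementation. The names `Row`, `RollupStack`, `roll_up`, `covers`, `WILDCARD` and `flush` are hypothetical. The sketch assumes that records arrive sorted by field values, that a roll-up replaces the rightmost non-wildcard field with `*`, that a record is pushed on top of a rolled-up row that generalizes it, and that remaining rows are drained at end of stream. The "super row of a last output row" test in step (d) is not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple

WILDCARD = "*"  # assumed marker for a rolled-up field


@dataclass
class Row:
    fields: Tuple[str, ...]
    count: int


def roll_up(fields):
    """Replace the rightmost non-wildcard field with a wildcard (assumed rule)."""
    out = list(fields)
    for i in range(len(out) - 1, -1, -1):
        if out[i] != WILDCARD:
            out[i] = WILDCARD
            break
    return tuple(out)


def covers(row_fields, record_fields):
    """True if every non-wildcard field of the row equals the record's field."""
    return all(r == WILDCARD or r == f for r, f in zip(row_fields, record_fields))


class RollupStack:
    def __init__(self, threshold):
        self.threshold = threshold
        self.stack: List[Row] = []
        self.output: List[Row] = []

    def _retire_top(self):
        """Output the top row if it is large enough, otherwise roll it up."""
        top = self.stack[-1]
        if top.count >= self.threshold or all(f == WILDCARD for f in top.fields):
            # Step (d): output the row. Emitting a fully wildcarded row
            # regardless of its count is an assumption of this sketch.
            self.output.append(self.stack.pop())
            return
        # Step (c): roll up, then merge into the row below if they now match.
        top.fields = roll_up(top.fields)
        if len(self.stack) >= 2 and self.stack[-2].fields == top.fields:
            self.stack.pop()
            self.stack[-1].count += top.count

    def add(self, fields, count):
        fields = tuple(fields)
        while self.stack:
            top = self.stack[-1]
            if top.fields == fields:
                top.count += count  # step (b): same group, accumulate
                return
            if covers(top.fields, fields):
                break  # assumption: push the record above its generalization
            self._retire_top()
        self.stack.append(Row(fields, count))

    def flush(self):
        """Drain the stack at end of stream (assumed; not stated in the claim)."""
        while self.stack:
            self._retire_top()
        return self.output


# Usage: sorted (country, state, city) records with a threshold of 5.
handler = RollupStack(threshold=5)
for rec_fields, rec_count in [
    (("US", "CA", "SF"), 2),
    (("US", "CA", "LA"), 3),
    (("US", "CA", "SD"), 10),
    (("US", "NY", "NYC"), 8),
]:
    handler.add(rec_fields, rec_count)
result = [(row.fields, row.count) for row in handler.flush()]
print(result)  # step (e): report the grouped rows
```

With these inputs the two small California rows (counts 2 and 3) are rolled up and merged into a single `("US", "CA", "*")` row with count 5, while the rows for SD (10) and NYC (8) meet the threshold and are output unchanged. The total count of 23 is preserved across the output rows.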