Title of invention: Systems and methods for efficient data ingestion and query processing
Abstract: A query may be provided to aggregators at hierarchical levels in an in-memory data storage module. The query may be provided to leaf nodes of the in-memory data storage module. The leaf nodes may execute the query, returning results of the query to the aggregators. One or more aggregations may be performed based on the results. In an embodiment, log entries associated with a logged event may be serialized and divided into distributed chunks for storage in the leaf nodes. A leaf node, from the leaf nodes, having storage capacity for a distributed chunk may be identified. The distributed chunk may be stored in the leaf node.
Publication number: US9442967(B2) Publication date: 2016.09.13
Application number: US201313951431 Filing date: 2013.07.25
Applicant: Facebook, Inc. Inventors: Barykin Oleksandr; Metzler Josh
Classification: G06F17/30 Main classification: G06F17/30
Agency: Sheppard Mullin Richter & Hampton LLP Agent: Sheppard Mullin Richter & Hampton LLP
Principal claim: 1. A system comprising: at least one processor; and a memory storing instructions configured to instruct the at least one processor to perform: serializing log entries associated with at least one logged event; dividing the serialized log entries into one or more distributed chunks for storage in one or more leaf nodes of an in-memory data storage module, wherein storage of at least one of the distributed chunks is striped across at least two randomly selected leaf nodes, and wherein a corresponding space limit of each of the one or more leaf nodes is adjusted based at least in part on a type of data being stored at the leaf node; providing a query to aggregators at hierarchical levels in the in-memory data storage module, wherein the aggregators are configured to pre-aggregate at least some data stored in the one or more leaf nodes of the in-memory data storage module in anticipation of the query; providing the query to leaf nodes of the in-memory data storage module; executing the query on the leaf nodes; returning results of the query to the aggregators; performing one or more aggregations on the results of the query; and updating a query cache that corresponds to the query to include data describing the results.
Address: Menlo Park CA US
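
The ingestion and query steps recited in the principal claim can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`LeafNode`, `Aggregator`, `ingest`, `run_query`, `rows_per_chunk`) is hypothetical, as are the JSON-lines serialization, the two-way striping of each chunk, and the count-by-event query. Pre-aggregation in anticipation of a query and the data-type-dependent adjustment of space limits are not modeled here; each leaf simply has a fixed byte limit.

```python
import json
import random
from collections import Counter


class LeafNode:
    """In-memory leaf that stores serialized stripes up to a fixed byte limit."""

    def __init__(self, name, space_limit):
        self.name = name
        self.space_limit = space_limit  # assumption: fixed, not type-adjusted
        self.chunks = []
        self.used = 0

    def has_capacity(self, size):
        return self.used + size <= self.space_limit

    def store(self, stripe):
        self.chunks.append(stripe)
        self.used += len(stripe)

    def execute(self, predicate):
        # Execute the query locally: count matching entries per event name.
        counts = Counter()
        for stripe in self.chunks:
            for line in stripe.decode("utf-8").splitlines():
                entry = json.loads(line)
                if predicate(entry):
                    counts[entry["event"]] += 1
        return counts


class Aggregator:
    """Interior node: forwards the query to its children and merges results."""

    def __init__(self, children):
        self.children = children

    def execute(self, predicate):
        total = Counter()
        for child in self.children:
            total.update(child.execute(predicate))
        return total


def ingest(entries, leaves, rows_per_chunk=4, rng=random):
    # Serialize each log entry to one JSON line.
    rows = [json.dumps(e, sort_keys=True).encode("utf-8") for e in entries]
    # Divide the serialized rows into chunks.
    for i in range(0, len(rows), rows_per_chunk):
        chunk = rows[i:i + rows_per_chunk]
        # Stripe the chunk in two halves, split on a row boundary.
        mid = (len(chunk) + 1) // 2
        stripes = [b"\n".join(s) for s in (chunk[:mid], chunk[mid:]) if s]
        largest = max(len(s) for s in stripes)
        # Identify leaves with capacity, then pick distinct ones at random.
        candidates = [leaf for leaf in leaves if leaf.has_capacity(largest)]
        if len(candidates) < len(stripes):
            raise RuntimeError("not enough leaf nodes with free capacity")
        for leaf, stripe in zip(rng.sample(candidates, len(stripes)), stripes):
            leaf.store(stripe)


def run_query(root, key, predicate, cache):
    # Serve from the query cache when possible; otherwise fan out and update it.
    if key in cache:
        return cache[key]
    result = root.execute(predicate)
    cache[key] = result
    return result
```

A two-level hierarchy could be built as `Aggregator([Aggregator(leaves[:2]), Aggregator(leaves[2:])])` and queried with `run_query(root, "count_by_event", lambda e: True, cache)`. Splitting stripes on row boundaries is a simplification chosen so that each leaf can execute the query on its own data without reassembling the chunk.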