Title of invention: Configuring a system to collect and aggregate datasets
Abstract: Methods for configuring a system to collect and aggregate datasets are disclosed. One embodiment includes identifying a data source in the system from which a dataset is to be collected; configuring a machine in the system that generates the dataset to send the dataset to the data source; identifying an arrival location where the collected dataset is to be aggregated or written; and/or configuring an agent node by specifying the data source as the source for the agent node and the arrival location as the sink for the agent node.
Publication number: US9317572(B2)  Publication date: 2016.04.19
Application number: US201012877902  Filing date: 2010.09.08
Applicant: Cloudera, Inc.  Inventors: Hsieh Jonathan Ming-Cyn; Robinson Henry Noel
Classification: G06F17/30; G06F11/20; G06F11/34  Main classification: G06F17/30
Agency: Perkins Coie LLP  Agent: Perkins Coie LLP
Principal claim: 1. A method for configuring a system to collect and aggregate datasets, wherein the system comprises agent nodes to collect the datasets, collector nodes to receive the datasets from the agent nodes, and master nodes configured to dynamically change topology among the nodes in the system, the method being executed by a master node and comprising: identifying a data source in the system from which a dataset is to be collected; configuring a machine in the system that generates the dataset to send the dataset to the data source; identifying an arrival location where the dataset is to be aggregated or written; dynamically configuring, based on system changes, an agent node in an agent tier by: specifying a source for the agent node as the identified data source in the system; specifying a sink for the agent node as a collector source of a collector node in a collector tier; and dynamically configuring, based on the system changes, a collector node in a collector tier by: specifying the collector source of the collector node in the collector tier as the identified arrival location; and specifying a collector sink of the collector node in the collector tier as a distributed file system; wherein the distributed file system is in a storage tier; wherein the agent node and collector node function as peers in a peer-to-peer network.
Address: Palo Alto CA US
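
The source/sink wiring recited in the principal claim can be illustrated with a minimal Python sketch. It is not the patented implementation or any product's API: the `NodeConfig` and `MasterNode` names, the `configure_flow` method, and every node name, address, and path below are hypothetical placeholders chosen for illustration. It only shows a master assigning the identified data source and arrival location to an agent node and a collector node, and reassigning them when the system changes.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class NodeConfig:
    """Source/sink pair assigned to one node (hypothetical structure)."""
    source: str
    sink: str


@dataclass
class MasterNode:
    """Hypothetical master that records the current configuration of each node."""
    configs: Dict[str, NodeConfig] = field(default_factory=dict)

    def configure_flow(self, agent: str, collector: str, data_source: str,
                       arrival_location: str, dfs_path: str
                       ) -> Tuple[NodeConfig, NodeConfig]:
        # Agent tier: source is the identified data source,
        # sink is the collector's source (the arrival location).
        agent_cfg = NodeConfig(source=data_source, sink=arrival_location)
        # Collector tier: source is the arrival location,
        # sink is a location in the distributed file system (storage tier).
        collector_cfg = NodeConfig(source=arrival_location, sink=dfs_path)
        self.configs[agent] = agent_cfg
        self.configs[collector] = collector_cfg
        return agent_cfg, collector_cfg


# All names and addresses below are made-up placeholders.
master = MasterNode()
agent_cfg, collector_cfg = master.configure_flow(
    agent="agent-1",
    collector="collector-1",
    data_source="app-log-on-machine-A",
    arrival_location="collector-1:9000",
    dfs_path="dfs:/logs/app/",
)

# A system change (for example, collector-1 becoming unavailable) is handled
# by calling configure_flow again, which re-points the agent's sink.
agent_cfg_after, collector_cfg_after = master.configure_flow(
    agent="agent-1",
    collector="collector-2",
    data_source="app-log-on-machine-A",
    arrival_location="collector-2:9000",
    dfs_path="dfs:/logs/app/",
)
```

In this sketch the agent's sink and the collector's source are the same string, which mirrors the claim language in which the agent sink is specified as the collector source and the collector source is specified as the arrival location.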