Title of invention ENHANCED HADOOP FRAMEWORK FOR BIG-DATA APPLICATIONS
Abstract Described herein are a server and a method for processing Big Data. The server receives source data that is uploaded to processing nodes. The server maintains a data structure corresponding to a plurality of jobs previously submitted to the server, the data structure including at least one job identifier, at least one sequence of text associated with the at least one job identifier, and a list of processing nodes associated with the at least one sequence of text. The server receives a subsequent job including a job name from a client node and determines whether the job name matches the job identifier. The server allocates, based on the determination, only the list of processing nodes corresponding to the matched identifier to the subsequent job, and further updates the data structure.
Publication number US2016321310(A1) Publication date 2016.11.03
Application number US201615141101 Filing date 2016.04.28
Applicant ALSHAMMARI Hamoud Inventor ALSHAMMARI Hamoud
Classification G06F17/30 Main classification G06F17/30
Agency Agent
Principal claim 1. A server comprising: circuitry configured to receive source data that is to be uploaded to a plurality of processing nodes, partition the received source data into a plurality of data-blocks, each data block having a fixed size, and being replicated a predetermined number of times, upload the partitioned and replicated data-blocks to the processing nodes, each replicated data block being stored in a unique processing node, maintain a first data structure corresponding to a plurality of previously submitted jobs to the server, the first data structure including at least one job identifier, at least one sequence of text associated with the at least one job identifier, and a list of processing nodes associated with the at least one sequence of text, each sequence of text stored in the first data structure having a length that is based on a likelihood of occurrence of the sequence of text, receive a subsequent job including a job name from a client node; determine whether the job name matches the job identifier stored in the first data structure, allocate, based on the determination, only the list of processing nodes corresponding to the matched identifier to the subsequent job, and update the first data structure by computing for each sequence of text a lifespan parameter, the lifespan parameter for each sequence of text being computed based on a reuse factor of the sequence of text.
Address Milford CT US
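
Illustrative sketch (not part of the patent record): the steps recited in claim 1 can be pictured with a small Python example. It is a minimal sketch under stated assumptions, not the applicant's implementation. All names (`partition`, `place_replicas`, `JobEntry`, `JobTable`) are hypothetical, the round-robin replica placement is only one way to keep each replica of a block on a distinct node, and the rule `lifespan = 1 + reuse_count` is an invented placeholder, since the record says only that the lifespan is computed from a reuse factor. The record does not say what happens when no job name matches, so the fallback to all nodes is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


def partition(data: bytes, block_size: int) -> List[bytes]:
    """Split source data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: List[str],
                   replication: int) -> Dict[int, List[str]]:
    """Assign each block's replicas to distinct nodes (assumed round-robin)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replication)]
        for b in range(num_blocks)
    }


@dataclass
class JobEntry:
    job_id: str          # identifier of a previously submitted job
    sequence: str        # sequence of text associated with that job
    nodes: List[str]     # processing nodes associated with the sequence
    reuse_count: int = 0
    lifespan: int = 1    # hypothetical unit, e.g. retention rounds


class JobTable:
    """Stand-in for the 'first data structure' of claim 1."""

    def __init__(self, all_nodes: List[str]) -> None:
        self.all_nodes = list(all_nodes)
        self.entries: Dict[str, JobEntry] = {}

    def record(self, job_id: str, sequence: str, nodes: List[str]) -> None:
        self.entries[job_id] = JobEntry(job_id, sequence, list(nodes))

    def allocate(self, job_name: str) -> List[str]:
        """Return only the recorded nodes on a match, else every node (assumed)."""
        entry = self.entries.get(job_name)
        if entry is None:
            return list(self.all_nodes)
        entry.reuse_count += 1
        entry.lifespan = 1 + entry.reuse_count  # placeholder lifespan rule
        return list(entry.nodes)


if __name__ == "__main__":
    table = JobTable(["n1", "n2", "n3", "n4"])
    table.record("find-GGGCGGGG", "GGGCGGGG", ["n2", "n4"])
    print(table.allocate("find-GGGCGGGG"))  # matched job: recorded nodes only
    print(table.allocate("unseen-job"))     # no match: all nodes
```

In this sketch a repeated job name restricts the job to the nodes already known to hold the relevant sequence, which is the allocation behaviour the claim describes; everything about how entries age out is left open because the record gives no formula.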