Title of invention: Methods and apparatus for effective on-line backup selection for failure recovery in distributed stream processing systems
Abstract: A failure recovery framework for cooperative data stream processing is provided for use in large-scale stream data analysis environments. Failure recovery supports a plurality of independent distributed sites, each having its own local administration and goals. The distributed sites cooperate in an inter-site back-up mechanism that allows the system to recover from a variety of failures. Through this cooperation among sites, failure recovery is both automatic and timely. Back-up sites associated with a given primary site are identified. These back-up sites are used to identify failures within the primary site, including failures of applications running on the nodes of the primary site. The failed applications are reinstated on one or more nodes within the back-up sites, using job management instances local to the back-up sites in combination with previously stored state information and data values for the failed applications. In addition to the inter-site mechanism, each of the plurality of sites employs an intra-site back-up mechanism to handle failure recovery within the site.
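The reinstatement step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the names `Site`, `Checkpoint`, `store_checkpoint`, `start_job`, and `recover` are hypothetical, failure detection is assumed to have already produced a list of failed applications, and the back-up site is chosen simply as the first one holding a checkpoint, which stands in for whatever selection policy the patent actually claims.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Previously stored state and data values for one application (hypothetical shape)."""
    app: str
    state: dict


@dataclass
class Site:
    """One independent site with its own local job management (hypothetical model)."""
    name: str
    nodes: list
    running: dict = field(default_factory=dict)      # app name -> (node, state)
    checkpoints: dict = field(default_factory=dict)  # primary name -> {app name: Checkpoint}

    def store_checkpoint(self, primary: str, cp: Checkpoint) -> None:
        # Keep only the most recent checkpoint per application of the primary site.
        self.checkpoints.setdefault(primary, {})[cp.app] = cp

    def start_job(self, app: str, state: dict) -> str:
        # Local job management: place the job on the node running the fewest jobs.
        load = {n: 0 for n in self.nodes}
        for node, _ in self.running.values():
            load[node] += 1
        node = min(self.nodes, key=lambda n: load[n])
        self.running[app] = (node, dict(state))
        return node


def recover(primary: str, failed_apps: list, backups: list) -> dict:
    """Reinstate each failed app on the first back-up site that holds its checkpoint.

    Assumption: "first site with a checkpoint" is a placeholder selection rule.
    Apps with no stored checkpoint are left out of the returned placement.
    """
    placement = {}
    for app in failed_apps:
        for site in backups:
            cp = site.checkpoints.get(primary, {}).get(app)
            if cp is not None:
                placement[app] = (site.name, site.start_job(app, cp.state))
                break
    return placement
```

In this sketch each back-up site restarts the application through its own `start_job`, mirroring the abstract's point that reinstatement uses job management instances local to the back-up sites rather than a central scheduler.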
Publication number: US8225129(B2); Publication date: 2012.07.17
Application number: US20070733732; Filing date: 2007.04.10
Applicant: DOUGLIS FREDERICK; LIU ZHEN; XIA HONGHUI; RONG BIN; INTERNATIONAL BUSINESS MACHINES CORPORATION; Inventor: DOUGLIS FREDERICK; LIU ZHEN; XIA HONGHUI; RONG BIN
Classification: G06F11/00; Main classification: G06F11/00
Agency: Agent:
Principal claim:
Address: