Title of invention: Mechanism for recovery from site failure in a stream processing system
Abstract: A failure recovery framework for cooperative data stream processing is provided that can be used in a large-scale stream data analysis environment. Failure recovery supports a plurality of independent distributed sites, each having its own local administration and goals. The distributed sites cooperate in an inter-site back-up mechanism to provide for system recovery from a variety of failures within the system. Failure recovery is both automatic and timely through cooperation among sites. Back-up sites associated with a given primary site are identified. These sites are used to identify failures within the primary site, including failures of applications running on the nodes of the primary site. The failed applications are reinstated on one or more nodes within the back-up sites using job management instances local to the back-up sites in combination with previously stored state information and data values for the failed applications. In addition to the inter-site mechanisms, each one of the plurality of sites employs an intra-site back-up mechanism to handle failure recoveries within the site.
Publication number: US8219848(B2) Publication date: 2012.07.10
Application number: US20070733724 Filing date: 2007.04.10
Applicant: BRANSON MICHAEL JOHN;DOUGLIS FREDERICK;FAWCETT BRADLEY WILLIAM;LIU ZHEN;RONG BIN;YE FAN;INTERNATIONAL BUSINESS MACHINES CORPORATION Inventor: BRANSON MICHAEL JOHN;DOUGLIS FREDERICK;FAWCETT BRADLEY WILLIAM;LIU ZHEN;RONG BIN;YE FAN
Classification: G06F11/00 Main classification: G06F11/00
Agency: Agent:
Principal claim:
Address: