Title: METHOD AND SYSTEM FOR CHECKPOINTING A GLOBAL STATE OF A DISTRIBUTED SYSTEM
Abstract: A method for check pointing a global state of a distributed system with one or more distributed applications organized in a directed acyclic graph topology includes, upon receiving a marker in an active input channel of a first task application, putting the active input channel on hold, performing check pointing by saving an internal state of the first task application when all input channels have received a marker and are put on hold, forwarding the marker via all output channels of the first task application to at least one other task application of the one or more task applications, and reactivating all input channels of the first task application, wherein the global state is a union of all internal states of the task applications after each of the one or more task applications has been check pointed.
Publication number: US2016179627(A1)  Publication date: 2016.06.23
Application number: US201314908131  Filing date: 2013.07.30
Applicant: NEC EUROPE LTD.  Inventors: Dusi Maurizio; Fiori Luca; Gringoli Francesco
Classification: G06F11/14  Main classification: G06F11/14
Agency:  Agent:
Principal claim: 1. A method for check pointing a global state of a distributed system with one or more distributed applications organized in a directed acyclic graph topology, wherein one or more source applications provide data to one or more task applications each having one or more input channels and one or more output channels for exchanging processed data with others of the one or more task applications, wherein at least one of the one or more task applications processes data received on its input channels and sends processed data out on one or more of its output channels to at least one other of the one or more task applications, and wherein one or more destinations collect processed data, the method comprising: a) upon receiving a marker in an active input channel of a first task application, putting the input channel on hold, b) performing check pointing by saving an internal state of the first task application when all input channels have received a marker and are on hold, c) forwarding the marker via all output channels of the first task application to at least one other task application of the one or more task applications, and d) reactivating the input channels of the first task application, wherein the global state is a union of all internal states of the task applications after each of the one or more task applications has been check pointed.
Address: Heidelberg DE
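
Illustrative sketch (not part of the patent record): steps a) to d) of the claim can be pictured with a small single-process simulation in Python. Everything named here is an assumption made for illustration only: the `MARKER` sentinel, the `Task` class, the use of a running sum as the "internal state", the synchronous in-memory delivery between tasks, and the three-task topology (A and B both feeding C). It is a minimal sketch of the marker handling as the claim describes it, not the applicant's implementation.

```python
from collections import deque

MARKER = object()  # hypothetical sentinel standing in for the checkpoint marker


class Task:
    """Hypothetical task application; its internal state is a running sum."""

    def __init__(self, name, input_names):
        self.name = name
        self.state = 0
        # One FIFO buffer per input channel; data arriving on a held
        # channel waits here until the channel is reactivated.
        self.buffers = {ch: deque() for ch in input_names}
        self.on_hold = set()
        self.outputs = []     # downstream Task objects (output channels)
        self.snapshot = None  # internal state saved at checkpoint time

    def receive(self, channel, item):
        self.buffers[channel].append(item)
        self._drain()

    def _drain(self):
        # Process items from every active (not held) channel until none remain.
        progressed = True
        while progressed:
            progressed = False
            for ch, buf in self.buffers.items():
                if ch in self.on_hold or not buf:
                    continue
                item = buf.popleft()
                progressed = True
                if item is MARKER:
                    self.on_hold.add(ch)                   # step a)
                    if self.on_hold == set(self.buffers):
                        self._checkpoint()
                else:
                    self.state += item                     # process data
                    self._emit(item)                       # send it downstream

    def _checkpoint(self):
        self.snapshot = self.state  # step b) save the internal state
        self._emit(MARKER)          # step c) forward marker on all outputs
        self.on_hold.clear()        # step d) reactivate all input channels

    def _emit(self, item):
        for task in self.outputs:
            task.receive(self.name, item)


def demo():
    """Run one checkpoint over the assumed DAG: source -> A, B; A, B -> C."""
    a, b = Task("A", ["src"]), Task("B", ["src"])
    c = Task("C", ["A", "B"])
    a.outputs.append(c)
    b.outputs.append(c)

    a.receive("src", 1)
    b.receive("src", 10)
    a.receive("src", MARKER)  # A checkpoints; C puts channel "A" on hold
    a.receive("src", 2)       # post-marker data: buffered at C, not yet applied
    b.receive("src", 20)
    b.receive("src", MARKER)  # B checkpoints; C now checkpoints as well
    b.receive("src", 30)

    # Global state: the union of all saved internal states.
    return {t.name: t.snapshot for t in (a, b, c)}
```

With these inputs, `demo()` returns `{"A": 1, "B": 30, "C": 31}`. The value 2 that A sent after its marker is held back at C until C has checkpointed, so C's saved state (31) reflects exactly the data covered by the saved states of A (1) and B (30).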