Title of invention System and method for shared execution of mixed data flows
Abstract A method, computer program product, and computer system for shared execution of mixed data flows, performed by one or more computing devices, comprises identifying one or more resource sharing opportunities across a plurality of parallel tasks. The plurality of parallel tasks includes zero or more relational operations and at least one non-relational operation. The plurality of parallel tasks relative to the relational operations and the at least one non-relational operation are executed. In response to executing the plurality of parallel tasks, one or more resources of the identified resource sharing opportunities is shared across the relational operations and the at least one non-relational operation.
Publication number US8984515(B2) Publication date 2015.03.17
Application number US201213484959 Filing date 2012.05.31
Applicant International Business Machines Corporation Inventors Gupta Rajeev; Ravindra Padmashree; Roy Prasan
Classification G06F9/46; G06F7/00; G06F9/50 Main classification G06F9/46
Agency Holland & Knight LLP Agents Holland & Knight LLP; Colandreo, Esq. Brian J.; Placker, Esq. Jeffrey T.
Principal claim 1. A computer program product residing on a non-transitory computer readable medium having a plurality of instructions stored thereon which, when executed by a processor, cause the processor to perform operations comprising: identifying one or more resource sharing opportunities across a plurality of parallel tasks, wherein the plurality of parallel tasks includes one or more relational operations and at least one non-relational operation; executing the plurality of parallel tasks involving the one or more relational operations and the at least one non-relational operation; sharing, in response to executing the plurality of parallel tasks, one or more resources of the identified resource sharing opportunities across tasks involving both the one or more relational operations and at least one non-relational operation, and wherein at least one non-relational operation includes a clustering operation; and designating a task of the plurality of parallel tasks as a primary task, wherein a cluster-id of the primary task is a map output key of a merged task, and wherein in the merged task, cluster-ids of other tasks are part of map output values.
Address Armonk NY US
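Illustrative sketch (not part of the patent record). The principal claim describes a merged task in which the cluster-id of a designated primary task is the map output key, while the cluster-ids of the other tasks travel inside the map output values. The Python below is a minimal, single-process sketch of that key/value layout only. It assumes nearest-centroid assignment as the clustering step and a dictionary as a stand-in for the shuffle phase; the record specifies neither. The names `nearest`, `merged_map`, and `run_merged`, and the layout of `tasks`, are hypothetical.

```python
from math import dist


def nearest(point, centroids):
    """Return the index (used here as the cluster-id) of the closest centroid."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def merged_map(point, tasks, primary=0):
    """One map call that serves several clustering tasks over one input record.

    tasks   -- list of centroid lists, one per clustering task (assumed layout)
    primary -- index of the task designated as the primary task

    Returns (key, value): the key is the primary task's cluster-id, and the
    value carries the point plus the cluster-ids of the other tasks.
    """
    ids = [nearest(point, centroids) for centroids in tasks]
    others = {t: cid for t, cid in enumerate(ids) if t != primary}
    return ids[primary], (point, others)


def run_merged(points, tasks, primary=0):
    """Stand-in for the shuffle: group map outputs by the primary cluster-id."""
    groups = {}
    for p in points:
        key, value = merged_map(p, tasks, primary)
        groups.setdefault(key, []).append(value)
    return groups


points = [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0)]
tasks = [
    [(0.0, 0.0), (10.0, 10.0)],              # task 0 (primary): two centroids
    [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)],  # task 1: three centroids
]
groups = run_merged(points, tasks)
```

In this sketch the input is scanned once for both tasks, which is one way to read the resource sharing the claim refers to. A reducer for the primary task could consume each group directly, while the other tasks would need to regroup on the cluster-ids carried in the values.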