Title of invention: DISTRIBUTED ANALYSIS AND ATTRIBUTION OF SOURCE CODE
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for distributing analysis tasks and attribution tasks. One of the methods includes receiving data representing a plurality of snapshots of a code base, wherein each snapshot comprises source code files, wherein one or more snapshots have a parent snapshot in the code base according to a revision graph of snapshots in the code base. An attribution set is generated from the plurality of snapshots, the attribution set having a target set of attributable snapshots to be attributed and a support set of all parent snapshots of all snapshots in the target set. An attribution task is distributed for the attribution set to a particular worker node of a plurality of worker nodes.
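The target/support split described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `build_attribution_set`, the dictionary representation of the revision graph, and the choice to leave parents that are themselves targets out of the support set are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_attribution_set(
    parents: Dict[str, List[str]], targets: List[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (target, support) for one attribution set.

    `parents` maps each snapshot id to the ids of its parent snapshots
    (a hypothetical encoding of the revision graph).
    Assumption: a parent that is already in the target set is not
    repeated in the support set.
    """
    target = set(targets)
    support = {
        p for t in target for p in parents.get(t, []) if p not in target
    }
    return target, support


# Hypothetical history: a is the root, b and c branch from a, d merges b and c.
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
target, support = build_attribution_set(parents, ["b", "c", "d"])
```

Here the support set contains only `a`, the one parent needed to attribute `b` and `c` that is not itself being attributed.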
Application publication number: US2016140015(A1)  Publication date: 2016.05.19
Application number: US201514940882  Filing date: 2015.11.13
Applicant: SEMMLE LIMITED  Inventors: Baars Arthur; Henriksen Anders Starcke; Schaefer Max
Classification: G06F11/36  Main classification: G06F11/36
Agency:  Agent:
Main claim: 1. A system comprising: a manager node and a plurality of worker nodes, wherein: the manager node is configured to perform operations comprising: receiving a request to perform attribution tasks on a plurality of snapshots of a code base, wherein performing an attribution task on a snapshot comprises attributing characteristic segments of source code in the snapshot to respective responsible entities, receiving data representing a revision graph, the revision graph representing parent and child relationships between snapshots of the code base, wherein a child snapshot is a subsequent snapshot of a parent snapshot in the code base, generating an attribution set having at most N snapshots of the revision graph, wherein N is a constant, the attribution set having a target subset of attributable snapshots and a support subset of parent snapshots of snapshots in the target subset, the snapshots in the support subset being snapshots that have one or more parent snapshots that do not occur in the target subset, and submitting an attribution task for the attribution set to one worker node of the plurality of worker nodes; and the plurality of worker nodes are each configured to perform operations comprising, for each attribution set provided to the worker node: copying, to the worker node for each snapshot in the attribution set, analysis data that identifies characteristic segments of source code in the snapshot; and attributing the characteristic segments of source code in each snapshot to a responsible entity.
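The manager and worker roles in the claim can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the greedy packing strategy, the rule that a segment is attributed to a snapshot's author when no parent snapshot contains it, and every name used (`partition`, `attribute`, `Segment`, the sample data) are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical identity of a characteristic segment: (file path, rule id).
Segment = Tuple[str, str]


def partition(order: List[str], parents: Dict[str, List[str]], n: int) -> List[List[str]]:
    """Manager side: greedily pack snapshots, given in topological order,
    into target lists so that targets plus supporting parents number at most n.
    Caveat: a single snapshot whose parents alone exceed the bound is still
    emitted as its own set, so the bound is not enforced in that case."""
    sets: List[List[str]] = []
    target: List[str] = []
    for snap in order:
        candidate = target + [snap]
        support = {p for t in candidate for p in parents.get(t, []) if p not in candidate}
        if target and len(candidate) + len(support) > n:
            sets.append(target)      # close the current set
            target = [snap]          # start a new one with this snapshot
        else:
            target = candidate
    if target:
        sets.append(target)
    return sets


def attribute(targets: List[str], parents: Dict[str, List[str]],
              analysis: Dict[str, Set[Segment]],
              author: Dict[str, str]) -> Dict[Tuple[str, Segment], str]:
    """Worker side. Assumed rule: a segment found in a snapshot but in none
    of its parents is attributed to that snapshot's author."""
    result: Dict[Tuple[str, Segment], str] = {}
    for snap in targets:
        inherited: Set[Segment] = set()
        for p in parents.get(snap, []):
            inherited |= analysis.get(p, set())
        for seg in analysis.get(snap, set()) - inherited:
            result[(snap, seg)] = author[snap]
    return result


# Hypothetical data: d merges b and c, which both branch from a.
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
analysis = {
    "a": {("f.py", "R1")},
    "b": {("f.py", "R1"), ("g.py", "R2")},
    "c": {("f.py", "R1")},
    "d": {("f.py", "R1"), ("g.py", "R2"), ("h.py", "R3")},
}
author = {"a": "alice", "b": "bob", "c": "carol", "d": "dave"}

attribution_sets = partition(["a", "b", "c", "d"], parents, n=3)
workers = ["worker-0", "worker-1"]
# Round-robin submission of each attribution set to one worker (an assumption).
assignments = {workers[i % len(workers)]: s for i, s in enumerate(attribution_sets)}
results = {w: attribute(s, parents, analysis, author) for w, s in assignments.items()}
```

Because each set carries its support snapshots, a worker needs analysis data only for the snapshots in its own set, which is what allows the sets to be processed independently.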
Address: Oxford GB