摘要 |
A system and method is provided for performing code provenance review in a software due diligence system. In particular, performing code provenance review may include sub-dividing source code under review and third-party source into logical fragments using a language-independent text fracturing algorithm. For example, the fracturing algorithm may include a set of heuristic rules that account for variations in coding style to create logical fragments that are as large as possible without being independently copyrightable. Unique fingerprints may then be generated for the logical fragments using a fingerprint algorithm that features arithmetic computation. As such, potentially related source code may be identified if sub-dividing the source code under review and the third-party source code produces one or more logical fragments that have identical fingerprints.
|