Abstract |
According to an example, source code flow analysis may include receiving source code for an application, and identifying virtual flow documents for the application from the source code. The virtual flow documents may represent ordered sequences of method calls for the application. The source code flow analysis may further include extracting features of the virtual flow documents, determining similarity between the virtual flow documents by estimating similarities for the extracted features to determine a flow-to-flow similarity, and clustering the virtual flow documents based on the flow-to-flow similarity. The flow-to-flow similarity may be further used, for example, to generate highest priority virtual flow documents and methods for the source code. The source code flow analysis may also include determination of flow-to-maintenance activity description (MAD) similarity, for example, to identify relevant virtual flow documents from the virtual flow documents based on the flow-to-MAD similarity to generate ordered relevant virtual flow documents. |
Independent claim |
1. A source code flow analysis system comprising:
a memory storing machine readable instructions to:
receive source code for an application;
identify virtual flow documents for the application from the source code, wherein the virtual flow documents represent ordered sequences of method calls for the application;
extract textual features, points, and controls of the virtual flow documents by
extracting the textual features from method definitions in the virtual flow documents and arranging the extracted textual features as a co-occurrence vector,
extracting, for the points, concept words from method names in the virtual flow documents, argument types in the virtual flow documents, and corresponding class names, and
extracting, for the controls, concept words from annotation text of edges in the virtual flow documents;
determine similarity between the virtual flow documents by estimating similarities for the extracted textual features, points, and controls to determine a flow-to-flow similarity by determining
a textual similarity by determining a cosine similarity of word co-occurrence vectors of the virtual flow documents,
an intersection similarity by determining a number of intersection points divided by a length of union of the virtual flow documents,
a point similarity by determining a set similarity between the points in the virtual flow documents, and
a control similarity by determining a set similarity between the controls in the virtual flow documents; and
cluster the virtual flow documents based on the flow-to-flow similarity to facilitate identification of a cause of a defect related to the application; and
a processor to implement the machine readable instructions. |
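The four similarity measures and the clustering step recited in the claim can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several choices are assumptions made only for illustration: the `Flow` container and all function names are hypothetical; a plain word-count `Counter` stands in for the word co-occurrence vector; "set similarity" is taken to be Jaccard similarity; "intersection points divided by a length of union" is read as shared methods over the union of methods; the four scores are combined with equal weights; and clustering is a simple greedy threshold grouping. The claim itself fixes none of these details.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical container for one virtual flow document."""
    methods: list    # ordered method calls making up the flow
    words: Counter   # word counts from method definitions (stand-in for the co-occurrence vector)
    points: set      # concept words from method names, argument types, class names
    controls: set    # concept words from edge annotation text


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse word vectors (textual similarity)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def jaccard(a: set, b: set) -> float:
    """Assumed form of 'set similarity': |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def intersection_similarity(a: Flow, b: Flow) -> float:
    """Assumed reading: shared methods divided by the size of the union of methods."""
    return jaccard(set(a.methods), set(b.methods))


def flow_to_flow(a: Flow, b: Flow, weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """Combine the four similarities; equal weighting is an assumption."""
    parts = (
        cosine(a.words, b.words),
        intersection_similarity(a, b),
        jaccard(a.points, b.points),
        jaccard(a.controls, b.controls),
    )
    return sum(w * p for w, p in zip(weights, parts))


def cluster(flows, threshold=0.5):
    """Greedy grouping (an assumed strategy): a flow joins the first cluster
    containing any member at least `threshold` similar to it."""
    clusters = []  # each cluster is a list of indices into `flows`
    for i, flow in enumerate(flows):
        for members in clusters:
            if any(flow_to_flow(flow, flows[j]) >= threshold for j in members):
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

In this sketch, flows that share methods, vocabulary, points, and controls score close to 1 and fall into the same cluster, which is the grouping the claim relies on to narrow down the flows related to a defect. Any standard clustering algorithm could replace the greedy grouping.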