Title: Query-by-example in large-scale code repositories
Abstract: Systems and methods for performing query-by-example are described. A query module executing on the system may maintain a source code repository containing a plurality of source code files. Each of the plurality of source code files is associated with a corresponding source syntax structure generated from that source code file. The query module may receive a query snippet and generate a query syntax structure based on the query snippet. The query module may then identify a first source code file from the plurality of source code files as being relevant to the query snippet. Relevance to the query snippet is determined by a first relevance score, which is calculated based on the query syntax structure and the first source code file's corresponding source syntax structure.
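As an illustration of the "syntax structure" idea in the abstract, the following minimal sketch uses Python's standard `ast` module to turn source text into a flat sequence of node type names, and builds one such structure per file in a small in-memory repository. The record does not specify a language, parser, or representation; the function names `syntax_structure` and `index_repository` and the dictionary-based repository are assumptions made for this example only.

```python
import ast
from typing import Dict, Iterator, List


def _node_types(node: ast.AST) -> Iterator[str]:
    """Yield node type names in pre-order (parent before children)."""
    yield type(node).__name__
    for child in ast.iter_child_nodes(node):
        yield from _node_types(child)


def syntax_structure(source: str) -> List[str]:
    """Hypothetical helper: parse Python source and return its node type names.

    Assumption: a flat pre-order list stands in for the patent's
    "syntax structure"; the record itself does not define the form.
    """
    return list(_node_types(ast.parse(source)))


def index_repository(files: Dict[str, str]) -> Dict[str, List[str]]:
    """Associate every source file with its own syntax structure."""
    return {name: syntax_structure(text) for name, text in files.items()}


# Illustrative repository and query snippet (made-up content).
repository = index_repository({
    "loop.py": "for i in range(3):\n    print(i)\n",
    "assign.py": "x = 1\n",
})
query_structure = syntax_structure("for item in items:\n    print(item)\n")
```

The same function is applied to both repository files and the query snippet, so the two structures can later be compared on equal terms.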
Publication number: US9317260(B2) Publication date: 2016.04.19
Application number: US201313962980 Filing date: 2013.08.09
Applicant: VMware, Inc. Inventor: Balachandran Vipin
Classification: G06F7/00;G06F17/30;G06F9/44 Main classification: G06F7/00
Agency: Agent:
Main claim: 1. A system configured to perform query-by-example, the system comprising a processor and a memory coupled with the processor, wherein the memory is configured to provide the processor with instructions for: maintaining, by a query module executing on the system, a source code repository containing a plurality of source code files, wherein each of the plurality of source code files is associated with a corresponding source syntax structure generated based on said each of the plurality of source code files and representative of a syntactic structure of said each of the plurality of source code files; receiving, by the query module, a query snippet; generating, by the query module, a query syntax structure based on the query snippet, wherein the query syntax structure represents a syntactic structure of the query snippet; and identifying, by the query module, a first source code file from the plurality of source code files for being relevant to the query snippet by: extracting a source sub-structure from a first syntax structure associated with the first source code file for matching with a query sub-structure extracted from the query syntax structure, identifying a matching pattern contained in the source sub-structure and in the query sub-structure, calculating a matching ratio based on the matching pattern, the source sub-structure's size, and the query sub-structure's size, and assigning the matching ratio as a first similarity score upon a second determination that the matching ratio is above a predetermined matching threshold, and identifying the first source code file upon a first determination that the first similarity score is above a predetermined similarity threshold, wherein the being relevant to the query snippet is determined by a first relevance score which is calculated based on the query syntax structure and the first source code file's corresponding source syntax structure.
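The two-threshold test recited in the claim can be sketched as follows. The claim states only that the matching ratio depends on the matching pattern and on the sizes of the two sub-structures; it gives no formula. This sketch therefore assumes that sub-structures are sequences of node labels, that the matching pattern is their longest common subsequence, and that the ratio is twice the pattern length divided by the combined size. The threshold values 0.5 and 0.7 and all function names are likewise hypothetical.

```python
from typing import Optional, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence (assumed 'matching pattern')."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def matching_ratio(source_sub: Sequence[str], query_sub: Sequence[str]) -> float:
    """Assumed formula: 2 * |pattern| / (|source sub| + |query sub|)."""
    total = len(source_sub) + len(query_sub)
    if total == 0:
        return 0.0
    return 2 * lcs_length(source_sub, query_sub) / total


def similarity_score(source_sub: Sequence[str], query_sub: Sequence[str],
                     matching_threshold: float = 0.5) -> Optional[float]:
    """Assign the ratio as the score only if it is above the matching threshold."""
    ratio = matching_ratio(source_sub, query_sub)
    return ratio if ratio > matching_threshold else None


def is_relevant(source_sub: Sequence[str], query_sub: Sequence[str],
                matching_threshold: float = 0.5,
                similarity_threshold: float = 0.7) -> bool:
    """Identify the file only if a score exists and is above the similarity threshold."""
    score = similarity_score(source_sub, query_sub, matching_threshold)
    return score is not None and score > similarity_threshold


# Illustrative sub-structures (made-up node labels).
source_sub = ["For", "Assign", "Call", "Return"]
query_sub = ["For", "Call", "Return"]
relevant = is_relevant(source_sub, query_sub)  # ratio is 6/7 under the assumed formula
```

Normalizing by the combined size keeps the ratio between 0 and 1, so a large file that merely contains the query pattern somewhere does not automatically outrank a small, closely matching one.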
Address: Palo Alto, CA, US