Abstract |
A method for detecting plagiarism of multithreaded programs based on thread slice birthmarks includes the steps of: 1) monitoring the target programs during execution, identifying system calls in real time, and recording related information comprising thread IDs, system call numbers, and return values, then preprocessing the information to obtain a valid system call sequence Trace; 2) slicing the valid system call sequence Trace to generate a series of thread slices Slice identified by the thread IDs; 3) generating dynamic thread slice birthmarks Birth for all the thread slices of the two programs P1 and P2; 4) generating the corresponding software birthmarks PB1 and PB2 of P1 and P2, respectively; 5) calculating the maximum similarity between the software birthmarks PB1 and PB2 by maximum bipartite graph matching; and 6) determining whether the program is plagiarized according to the average value of the birthmark similarity and a given threshold ε. |
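Steps 2) to 4) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the trace is already preprocessed into (thread ID, system call number) pairs and that a thread slice birthmark is a frequency count of contiguous length-k sub-sequences (k-grams). The function names, the parameter `k`, and the sample trace are hypothetical.

```python
from collections import Counter, defaultdict


def slice_by_thread(trace):
    """Group a valid system call sequence into per-thread slices.

    `trace` is a list of (thread_id, syscall_number) pairs in recorded order.
    """
    slices = defaultdict(list)
    for thread_id, syscall_number in trace:
        slices[thread_id].append(syscall_number)
    return dict(slices)


def slice_birthmark(thread_slice, k=3):
    """Count every contiguous length-k sub-sequence in one thread slice.

    Assumption: the birthmark is a k-gram frequency table.
    """
    return Counter(
        tuple(thread_slice[i:i + k])
        for i in range(len(thread_slice) - k + 1)
    )


def software_birthmark(trace, k=3):
    """Map each thread ID to the birthmark of its slice (one program, one run)."""
    return {
        thread_id: slice_birthmark(thread_slice, k)
        for thread_id, thread_slice in slice_by_thread(trace).items()
    }


# Hypothetical trace in which the calls of threads 1 and 2 are interleaved.
trace = [(1, 3), (2, 5), (1, 4), (1, 3), (2, 6), (1, 4), (2, 5)]
pb = software_birthmark(trace, k=2)
```

Because slicing is done per thread ID, the interleaving chosen by the scheduler does not change the resulting birthmarks; only the order of calls inside each thread matters.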
Principal claim |
1. A method for detecting plagiarism of multithreaded programs based on thread slice birthmarks, comprising the steps of:
1) monitoring the target programs during execution using a dynamic instrumentation technology, identifying the system calls of the target programs in real time, and recording related information comprising thread IDs, calling addresses, system call numbers, function names, parameters, and return values; then preprocessing the information and removing invalid system calls to obtain a valid system call sequence Trace;
2) slicing the valid system call sequence Trace by thread ID to generate a series of thread slices Slice identified by the thread IDs;
3) based on the thread slices, extracting fixed-length sub-sequences of the thread slices and counting the number of occurrences of each, thereby generating dynamic thread slice birthmarks Birth for all the thread slices of a first target program P1 and of a second target program P2, wherein the first target program is the original program of the program owner, and the second target program is a suspicious program suspected of being plagiarized;
4) generating the corresponding software birthmarks PB1 and PB2 of P1 and P2, respectively, from all of their thread slice birthmarks;
5) calculating the maximum similarity between the software birthmarks PB1 and PB2 by maximum bipartite graph matching: first, calculating the similarity between each thread slice birthmark of PB1 and each thread slice birthmark of PB2; second, generating the maximum similarity matching scheme MaxMatch(PB1,PB2) of PB1 and PB2 using a weighted bipartite graph matching algorithm; and finally, calculating the birthmark similarity Sim(PB1,PB2) of PB1 and PB2 from the maximum similarity matching scheme; and
6) determining plagiarism according to the average birthmark similarity over several inputs and a given threshold. |
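Steps 5) and 6) can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation. The claim does not fix the pairwise similarity measure, the normalization of Sim(PB1,PB2), or the direction of the threshold test, so the sketch assumes cosine similarity, division by the larger thread count, and "average at or above the threshold means plagiarism". A brute-force search stands in for a weighted bipartite matching algorithm and is practical only for a handful of threads. All function names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two k-gram frequency tables (assumed measure)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def max_match_weight(pb1, pb2):
    """Total weight of the best one-to-one pairing of thread slice birthmarks.

    Brute force over all assignments; a stand-in for a weighted
    bipartite matching algorithm.
    """
    small, large = (pb1, pb2) if len(pb1) <= len(pb2) else (pb2, pb1)
    best = 0.0
    for chosen in permutations(range(len(large)), len(small)):
        total = sum(cosine(small[i], large[j]) for i, j in enumerate(chosen))
        best = max(best, total)
    return best


def birthmark_similarity(pb1, pb2):
    """Sim(PB1, PB2), normalized by the larger thread count (assumed)."""
    if not pb1 or not pb2:
        return 0.0
    return max_match_weight(pb1, pb2) / max(len(pb1), len(pb2))


def is_plagiarized(similarities, epsilon):
    """Compare the average similarity over several inputs with the threshold."""
    return sum(similarities) / len(similarities) >= epsilon


# Hypothetical software birthmarks: lists of per-thread k-gram counters.
pb1 = [Counter({(3, 4): 2, (4, 3): 1}), Counter({(5, 6): 1, (6, 5): 1})]
pb2 = [Counter({(5, 6): 1, (6, 5): 1}), Counter({(3, 4): 2, (4, 3): 1})]
sim = birthmark_similarity(pb1, pb2)
```

In the sample data the two programs contain the same thread slice birthmarks in a different order, so the matching pairs them up and the similarity is 1 up to floating-point rounding. For programs with many threads, a polynomial-time assignment solver would replace the permutation search.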