Principal claim |
1. A computer-implemented method of classifying a spoken response as being plagiarized or non-plagiarized, the method comprising:
processing a spoken response with a processing system to generate a first text that is representative of the spoken response;
processing the first text with the processing system to remove disfluencies in the first text;
processing the first text with the processing system to identify a plurality of n-grams in the first text;
processing the first text with the processing system to identify a plurality of sentences in the first text;
processing the plurality of n-grams and a source text with the processing system to determine a first numerical measure indicative of a number of words and phrases of the first text that are included verbatim in the source text, each of the n-grams being compared to n-grams of the source text to determine the first numerical measure, the source text having been designated as a source of plagiarized content;
processing the first text and the source text with the processing system to determine a second numerical measure indicative of (i) an amount of the first text that paraphrases portions of the source text, or (ii) an amount of the first text that is semantically-similar to portions of the source text, the second numerical measure being determined by comparing units of text of the first text with corresponding units of text of the source text;
processing the plurality of sentences and the source text with the processing system to determine a third numerical measure indicative of a similarity between sentences of the first text and sentences of the source text, each sentence of the plurality of sentences being compared to each sentence of the source text to determine the third numerical measure; and
applying a model to the first numerical measure, the second numerical measure, and the third numerical measure to classify the spoken response as being plagiarized or non-plagiarized, the model including
a first variable and an associated first weighting factor, the first variable receiving a value of the first numerical measure,
a second variable and an associated second weighting factor, the second variable receiving a value of the second numerical measure, and
a third variable and an associated third weighting factor, the third variable receiving a value of the third numerical measure. |
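The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and everything specific in it is an assumption made for illustration: the speech-recognition step is skipped and the transcript is taken as given with sentence punctuation already present, the filler list `DISFLUENCIES` is hypothetical, the n-gram order is fixed at 3, a word-set Jaccard overlap stands in for the paraphrase or semantic-similarity measure, bag-of-words cosine stands in for sentence similarity, and the logistic model's weights and bias are invented placeholders rather than trained values.

```python
import math
import re
from collections import Counter

# Hypothetical filler words; the claim does not list specific disfluencies.
DISFLUENCIES = {"um", "uh", "er", "hmm"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_disfluencies(tokens):
    """Drop filler words and immediate word repetitions."""
    cleaned = []
    for tok in tokens:
        if tok in DISFLUENCIES or (cleaned and cleaned[-1] == tok):
            continue
        cleaned.append(tok)
    return cleaned

def split_sentences(text):
    """Split on sentence-final punctuation (assumes the transcript has it)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def verbatim_overlap(resp_tokens, src_tokens, n=3):
    """First measure: share of response n-grams found verbatim in the source."""
    resp = ngrams(resp_tokens, n)
    if not resp:
        return 0.0
    src = set(ngrams(src_tokens, n))
    return sum(g in src for g in resp) / len(resp)

def lexical_jaccard(resp_tokens, src_tokens):
    """Second measure (crude stand-in): Jaccard overlap of the word sets."""
    a, b = set(resp_tokens), set(src_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(a, b):
    """Bag-of-words cosine similarity between two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def sentence_similarity(resp_sents, src_sents):
    """Third measure: compare every response sentence with every source
    sentence and average the best match per response sentence."""
    if not resp_sents or not src_sents:
        return 0.0
    best = [max(cosine(r, s) for s in src_sents) for r in resp_sents]
    return sum(best) / len(best)

def classify(transcript, source, weights=(4.0, 2.0, 3.0), bias=-3.0):
    """Weighted model over the three measures (placeholder weights)."""
    resp_sents = [remove_disfluencies(tokenize(s))
                  for s in split_sentences(transcript)]
    src_sents = [tokenize(s) for s in split_sentences(source)]
    resp_tokens = [t for s in resp_sents for t in s]
    src_tokens = [t for s in src_sents for t in s]
    measures = (
        verbatim_overlap(resp_tokens, src_tokens),
        lexical_jaccard(resp_tokens, src_tokens),
        sentence_similarity(resp_sents, src_sents),
    )
    z = bias + sum(w * m for w, m in zip(weights, measures))
    score = 1.0 / (1.0 + math.exp(-z))  # logistic link, an assumed model form
    label = "plagiarized" if score >= 0.5 else "non-plagiarized"
    return label, measures
```

Calling `classify(transcript, source)` returns the label together with the three measures, so each variable's contribution can be inspected. In practice the weighting factors would presumably be fitted on labelled responses rather than set by hand as they are here.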