Main claim |
1. A computer-implemented method for automatic cognate detection, the method comprising:
stemming, by a processor, a first word in a first language in a bilingual corpus to obtain a first stem and a second word in a second language in the bilingual corpus to obtain a second stem;
calculating, by the processor, a probability for aligning the first stem and the second stem;
normalizing, by the processor, the first stem and the second stem;
calculating, by the processor, a distance metric between the normalized first stem and the normalized second stem;
identifying, by the processor, the first word and the second word as a cognate pair when the probability and the distance metric meet a threshold criterion;
storing the cognate pair in a set of cognates;
retrieving, by the processor, a candidate sentence in the second language from a corpus;
filtering, by the processor, the candidate sentence by an active vocabulary of a user in the second language and the set of cognates;
calculating, by the processor, a sentence quality score for the candidate sentence;
ranking, by the processor, the candidate sentence based on the sentence quality score; and
presenting the ranked candidate sentence as a pure or combined audio, graphic, textual, or video stimulus to the user. |
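The claimed steps can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not name a stemmer, a distance metric, a threshold criterion, or a quality score, so everything here is an assumption made for illustration: the toy suffix stripper `stem`, diacritic stripping in `normalize`, a length-scaled Levenshtein `distance`, the thresholds `min_prob=0.5` and `max_dist=0.4`, the hand-written alignment probabilities in `align_prob`, the reading of "filtering" as keeping only sentences whose words are all known or cognate, and the cognate-share `quality_score`.

```python
import unicodedata

def stem(word, suffixes=("ation", "ación", "ing", "s")):
    """Toy suffix stripper (hypothetical stand-in for a real stemmer)."""
    for suf in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[: -len(suf)]
    return word

def normalize(text):
    """Lowercase and strip diacritics, e.g. 'nación' becomes 'nacion'."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def distance(a, b):
    """Edit distance scaled to [0, 1] by the length of the longer string."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0

def is_cognate(w1, w2, align_prob, min_prob=0.5, max_dist=0.4):
    """Assumed threshold criterion: high alignment probability AND small distance."""
    s1, s2 = stem(w1), stem(w2)
    p = align_prob.get((s1, s2), 0.0)  # alignment probability, assumed precomputed
    d = distance(normalize(s1), normalize(s2))
    return p >= min_prob and d <= max_dist

def rank_sentences(candidates, active_vocab, cognates_l2, score):
    """Keep sentences made only of known or cognate words, best score first."""
    known = set(active_vocab) | set(cognates_l2)
    kept = [s for s in candidates if all(w in known for w in s.lower().split())]
    return sorted(kept, key=score, reverse=True)

# Illustrative data only: probabilities are invented, keyed by (stem1, stem2).
align_prob = {("inform", "inform"): 0.9, ("nation", "nación"): 0.8, ("table", "mesa"): 0.9}
pairs = [("information", "información"), ("nation", "nación"), ("table", "mesa")]
cognates = {(a, b) for a, b in pairs if is_cognate(a, b, align_prob)}

cognates_l2 = {b for _, b in cognates}
active_vocab = {"la", "es", "grande"}

def quality_score(sentence):
    """Hypothetical score: share of the sentence's words that are cognates."""
    words = sentence.lower().split()
    return sum(w in cognates_l2 for w in words) / len(words)

candidates = ["la nación es grande", "la mesa es grande", "la información"]
ranked = rank_sentences(candidates, active_vocab, cognates_l2, quality_score)
```

In this sketch the pair ("table", "mesa") is rejected despite its high alignment probability because the two stems share no characters in order, which is why the criterion combines both signals. The sentence containing "mesa" is then filtered out, since that word is neither in the assumed active vocabulary nor in the cognate set.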