Abstract |
PROBLEM TO BE SOLVED: To provide a program and the like that references a set of same-intent sentences, including a seed sentence, together with a large group of general sentences, and automatically creates diverse similar sentences of the same intent. SOLUTION: A first seed word and a second seed word that are related to each other in the seed sentence are detected, and a synonym or quasi-synonym database is searched for synonyms or quasi-synonyms of the seed words. Then, the set of same-intent sentences is referenced and a seed-word co-occurrence vector is calculated for the seed words, with each context word as a vector element and its occurrence frequency as the value. Next, the large group of general sentences is referenced and, again with each context word as a vector element, a synonym or quasi-synonym co-occurrence vector is calculated from the occurrence frequencies of the context words associated with each synonym or quasi-synonym. Each of these vectors is compared with the seed-word co-occurrence vector, and the synonyms or quasi-synonyms whose co-occurrence vectors have a similarity at or above a prescribed threshold are selected. Finally, similar sentences in which the seed words and the selected synonyms or quasi-synonyms co-occur are created. |
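
The vector comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. Several points are assumptions made only for illustration: whitespace tokenization, treating every other word in a sentence as a context word, cosine similarity as the similarity measure (the abstract does not name one), a single seed word rather than a related pair, and the hypothetical function names and example sentences.

```python
import math
from collections import Counter


def cooccurrence_vector(target, sentences):
    """Count the context words found in sentences that contain `target`.

    Assumption: tokens are whitespace-separated, and every other token in
    the same sentence counts as a context word.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        if target in tokens:
            counts.update(t for t in tokens if t != target)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors.

    Assumption: the abstract only says "similarity"; cosine is one
    common choice and is used here for illustration.
    """
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_synonyms(seed, candidates, intent_sentences, general_sentences, threshold):
    """Keep the candidates whose context in the general sentences resembles
    the seed word's context in the same-intent sentences."""
    seed_vec = cooccurrence_vector(seed, intent_sentences)
    selected = []
    for cand in candidates:
        cand_vec = cooccurrence_vector(cand, general_sentences)
        if cosine(seed_vec, cand_vec) >= threshold:
            selected.append(cand)
    return selected


# Hypothetical data: the candidates stand in for hits from a synonym database.
intent_sentences = [
    "please cancel my contract",
    "i want to cancel the contract today",
]
general_sentences = [
    "you can terminate the contract online",
    "please terminate my contract",
    "we will stop for lunch today",
    "stop the car",
]

selected = select_synonyms(
    "cancel", ["terminate", "stop"], intent_sentences, general_sentences, threshold=0.5
)
print(selected)
```

In this toy data, "terminate" shares context words such as "contract" with the seed word and passes the threshold, while "stop" appears in unrelated contexts and is filtered out. The threshold of 0.5 is arbitrary and would need tuning on real data.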