Abstract |
<p><P>PROBLEM TO BE SOLVED: To extract unknown words with high precision and at low cost, including words composed of various character types. <P>SOLUTION: When a query log is input, unknown word candidates are extracted from retrieval keyword groups, each being the set of distinct retrieval keywords obtained by splitting one query of the query log at spaces. The unknown-word likeness of each candidate is determined, and the words judged to be unknown are extracted and registered in a reference dictionary. A word is extracted as an unknown word when it is used in a burst-like manner, that is, when it is not registered in the reference dictionary, has at most a prescribed number of characters, has been used by at least a prescribed number of users, and has a time division in which its appearance frequency is larger than its appearance frequency in the other time divisions. <P>COPYRIGHT: (C)2010,JPO&INPIT</p> |
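The extraction conditions in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `extract_unknown_words`, the `(user_id, time_slot, query)` log format, the default thresholds, and the burst test (peak time division compared against a multiple of the mean of the other divisions) are all assumptions made for illustration; the abstract only says that one time division has a larger appearance frequency than the others.

```python
from collections import defaultdict


def extract_unknown_words(query_log, dictionary, max_chars=10,
                          min_users=2, burst_ratio=2.0):
    """Return keywords that look like unknown words (illustrative sketch).

    query_log  : iterable of (user_id, time_slot, query) tuples (assumed format)
    dictionary : set of words already in the reference dictionary
    max_chars, min_users, burst_ratio : hypothetical thresholds
    """
    users = defaultdict(set)                          # keyword -> users who used it
    counts = defaultdict(lambda: defaultdict(int))    # keyword -> time slot -> count
    slots = set()                                     # all time slots seen in the log

    for user_id, slot, query in query_log:
        slots.add(slot)
        # One query split at whitespace gives its set of distinct keywords.
        for keyword in set(query.split()):
            users[keyword].add(user_id)
            counts[keyword][slot] += 1

    other_slots = len(slots) - 1
    if other_slots < 1:
        return []  # a burst cannot be judged from a single time division

    unknown = []
    for keyword, per_slot in counts.items():
        if keyword in dictionary:              # already registered
            continue
        if len(keyword) > max_chars:           # too long to be a single word
            continue
        if len(users[keyword]) < min_users:    # used by too few users
            continue
        peak = max(per_slot.values())
        mean_other = (sum(per_slot.values()) - peak) / other_slots
        # Assumed burst test: the peak slot clearly exceeds the other slots.
        if peak > burst_ratio * mean_other:
            unknown.append(keyword)
    return sorted(unknown)
```

Under these assumptions, the returned words would then be registered in the reference dictionary, for example with `dictionary.update(extract_unknown_words(log, dictionary))`, so that later runs treat them as known.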