Abstract |
<p>A seed string is input to an input unit (101). A search unit (102) acquires document snippets containing the seed string. A segment-acquisition unit (103) acquires segments by splitting the snippets at a segment-delimiter string. A segment-element acquisition unit (104) acquires segment elements by splitting each segment at a segment-element delimiter string. A segment-score calculation unit (105) calculates a score for each segment from the standard deviation of the lengths of its segment elements. A segment-element-score calculation unit (106) calculates a score for each segment element from the segment scores and the distance between the position of the seed string and the position of that segment element. On the basis of the segment-element scores, a selection unit (107) selects one of the segment elements as a candidate instance for the expanded set of the seed string.</p> |
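The pipeline of units (102) through (107) can be sketched in a few lines of Python. This is only an illustrative sketch, not the claimed implementation: the abstract does not give the scoring formulas, so the segment score `1 / (1 + stdev)`, the element score `segment score / distance`, the default delimiters, the use of element index as "position", and the names `expand_seed` and `select_candidate` are all assumptions made for the example. As a further simplification, only segments that contain the seed as an element are scored.

```python
import re
from statistics import pstdev


def expand_seed(seed, snippets, segment_delim=r"[.\n]", element_delim=r","):
    """Score candidate elements for a seed string (hypothetical helper)."""
    scores = {}
    for snippet in snippets:
        # Search step: keep only snippets that contain the seed string.
        if seed not in snippet:
            continue
        # Segment step: split the snippet at the segment delimiter.
        for segment in re.split(segment_delim, snippet):
            # Element step: split the segment at the element delimiter.
            elements = [e.strip() for e in re.split(element_delim, segment)]
            elements = [e for e in elements if e]
            # Simplification: skip segments that lack the seed as an element.
            if seed not in elements or len(elements) < 2:
                continue
            # Assumed segment score: elements of similar length (low
            # standard deviation) suggest a list, so they score higher.
            lengths = [len(e) for e in elements]
            segment_score = 1.0 / (1.0 + pstdev(lengths))
            seed_pos = elements.index(seed)
            for pos, element in enumerate(elements):
                if element == seed:
                    continue
                # Assumed element score: decays with distance from the seed.
                distance = abs(pos - seed_pos)
                scores[element] = scores.get(element, 0.0) + segment_score / distance
    return scores


def select_candidate(scores):
    """Selection step: return the highest-scoring element, or None."""
    return max(scores, key=scores.get) if scores else None


snippets = ["apple, banana, cherry. The weather was unusually cold today, honestly"]
scores = expand_seed("apple", snippets)
print(select_candidate(scores))
```

In this toy input, the second segment is skipped because it does not contain the seed, and within the first segment the element adjacent to the seed receives the larger score under the assumed distance weighting.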