Abstract |
A dictionary generation device comprises: a model generation unit that generates a word segmentation model from a prepared corpus and a prepared word group; an analysis unit that performs, on a collected set of texts, word segmentation incorporating the word segmentation model, and adds boundary information to each unit of text; a selection unit that selects words to be registered in a dictionary from the text to which the analysis unit has added boundary information; and a registration unit that registers the selected words in the dictionary. Each unit of text in the corpus is annotated in advance with boundary information indicating word boundaries. |
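
The four units described above can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the claimed implementation: the abstract does not specify the model, so a simple count over adjacent character pairs stands in for it. Boundary information is assumed to be one flag per gap between adjacent characters, unseen character pairs are assumed to be boundaries, and selection is assumed to use a frequency threshold. All function names and the `min_count` parameter are hypothetical.

```python
from collections import Counter


def train_model(corpus, word_group):
    """Model generation: count how often a boundary falls between each
    adjacent character pair. `corpus` is a list of texts already split
    into words (i.e. carrying boundary information); each entry of
    `word_group` is treated as one text with no internal boundary."""
    boundary, total = Counter(), Counter()
    for words in list(corpus) + [[w] for w in word_group]:
        text = "".join(words)
        cuts, pos = set(), 0
        for w in words[:-1]:
            pos += len(w)
            cuts.add(pos)
        for i in range(1, len(text)):
            pair = (text[i - 1], text[i])
            total[pair] += 1
            if i in cuts:
                boundary[pair] += 1
    return boundary, total


def segment(text, model, default_boundary=True):
    """Analysis: return one boundary flag per gap between characters.
    Unseen pairs fall back to `default_boundary` (an assumption)."""
    boundary, total = model
    flags = []
    for i in range(1, len(text)):
        pair = (text[i - 1], text[i])
        if total[pair] == 0:
            flags.append(default_boundary)
        else:
            flags.append(boundary[pair] * 2 >= total[pair])
    return flags


def words_from(text, flags):
    """Cut a text into words at the gaps flagged as boundaries."""
    if not text:
        return []
    words, start = [], 0
    for i, cut in enumerate(flags, start=1):
        if cut:
            words.append(text[start:i])
            start = i
    words.append(text[start:])
    return words


def select_words(analyzed, dictionary, min_count=2):
    """Selection: keep words seen at least `min_count` times that are
    not yet in the dictionary (the threshold rule is an assumption)."""
    counts = Counter(w for text, flags in analyzed
                     for w in words_from(text, flags))
    return {w for w, c in counts.items()
            if c >= min_count and w not in dictionary}


def build_dictionary(corpus, word_group, collected, dictionary, min_count=2):
    """Run all four steps and return the updated dictionary."""
    model = train_model(corpus, word_group)
    analyzed = [(t, segment(t, model)) for t in collected]
    return set(dictionary) | select_words(analyzed, dictionary, min_count)
```

For example, with the toy corpus `[["ab", "cd"], ["ab", "ef"]]`, word group `["cd", "ef"]`, and collected texts `["cdab", "efab", "abcd"]`, the sketch segments each collected text into two words and adds only the words that recur often enough and are not already registered.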