Abstract |
<p>PROBLEM TO BE SOLVED: To provide a document analysis device that analyzes a document containing a character string which is not a normal language expression, such as a pictograph string.</p><p>SOLUTION: The document analysis device 8, which analyzes an input document containing a character string that is not a normal language expression, includes: a document analysis control part 1 which controls the document analysis as a whole; a document shaping part 2 which includes a pictograph-string deleting part 21 that deletes the pictograph string or the like from the input document, and a part-of-speech setting part 22 that sets the part of speech of the pictograph string or the like in the input document to one of the independent-word classes, namely noun, verb, adjective, adjectival verb, or adverb; a morphological analysis part 3 which divides the input document shaped by the document shaping part 2 into words, adds grammatical information to the divided words, and assigns a likelihood to the morphological analysis result; an analysis success/failure determination part 4 which determines whether the analysis succeeded or failed; and an analysis result output part 5 which adds the position information or part-of-speech information of the pictograph string or the like to the analysis result and outputs the analysis result.</p><p>COPYRIGHT: (C)2011, JPO&INPIT</p> |
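The flow of parts 1 to 5 described above can be illustrated with a minimal Python sketch. It is not the patented implementation. The emoji ranges used to detect a pictograph string, the placeholder token, the toy lexicon, the pair-based likelihood, the threshold, and the order in which the control part tries deletion before part-of-speech assignment are all assumptions made for illustration, and every function name is hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: a "pictograph string" is a run of characters in these symbol/emoji ranges.
PICTOGRAPH_RUN = re.compile("[\u2600-\u27BF\U0001F300-\U0001FAFF]+")
PLACEHOLDER = "\uFFFC"  # hypothetical stand-in token for a pictograph run
INDEPENDENT_POS = ("noun", "verb", "adjective", "adjectival verb", "adverb")

# Toy lexicon and toy grammar, both invented for this sketch.
LEXICON = {"i": "pronoun", "you": "pronoun", "like": "verb", "cats": "noun"}
ALLOWED_PAIRS = {("pronoun", "verb"), ("verb", "noun"), ("verb", "pronoun")}


@dataclass
class Morpheme:
    surface: str
    pos: str


def find_pictographs(text):
    """Record (start, end, string) of each pictograph run for the output part."""
    return [(m.start(), m.end(), m.group()) for m in PICTOGRAPH_RUN.finditer(text)]


def delete_pictographs(text):
    """Deleting part 21: remove every pictograph run from the input."""
    return PICTOGRAPH_RUN.sub(" ", text)


def mark_pictographs(text):
    """Part-of-speech setting part 22, step 1: replace each run with a placeholder."""
    return PICTOGRAPH_RUN.sub(f" {PLACEHOLDER} ", text)


def analyze(text, placeholder_pos=None):
    """Morphological analysis part 3 (toy): split into words, tag them, score them.

    The likelihood is the fraction of adjacent part-of-speech pairs that the
    toy grammar allows; a real analyzer would use its own cost model.
    """
    morphemes = []
    for token in text.lower().split():
        if token == PLACEHOLDER and placeholder_pos is not None:
            pos = placeholder_pos  # part 22, step 2: assigned part of speech
        else:
            pos = LEXICON.get(token, "unknown")
        morphemes.append(Morpheme(token, pos))
    pairs = list(zip(morphemes, morphemes[1:]))
    if not pairs:
        # Zero or one word: succeed only if there is a word and it is known.
        known = bool(morphemes) and all(m.pos != "unknown" for m in morphemes)
        return morphemes, 1.0 if known else 0.0
    allowed = sum((a.pos, b.pos) in ALLOWED_PAIRS for a, b in pairs)
    return morphemes, allowed / len(pairs)


def analyze_document(text, threshold=0.8):
    """Control part 1: try deletion first, then each independent-word class."""
    pictographs = find_pictographs(text)
    candidates = [(delete_pictographs(text), None)]
    candidates += [(mark_pictographs(text), pos) for pos in INDEPENDENT_POS]
    for shaped, pos in candidates:
        morphemes, likelihood = analyze(shaped, pos)
        if likelihood >= threshold:  # success/failure determination part 4
            # Output part 5: attach pictograph positions and assigned part of speech.
            return {"morphemes": morphemes, "likelihood": likelihood,
                    "pictographs": pictographs, "assigned_pos": pos}
    return None  # every shaping strategy failed


if __name__ == "__main__":
    print(analyze_document("I \u2764 cats"))
```

In this sketch, deleting the heart symbol from "I ❤ cats" leaves "I cats", which the toy grammar rejects, so the control loop falls back to treating the symbol as an independent word and accepts the reading in which it is a verb. All pictograph runs in one document receive the same part of speech here, which is a simplification.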