Abstract |
PURPOSE: To improve the efficiency of header extraction by finding a header paragraph candidate in document data written as code information, using a header paragraph word dictionary, and then determining the range of the header using a header paragraph rule dictionary. CONSTITUTION: Document data written as code information is input from an input part. A header paragraph candidate extracting part 7 extracts a header paragraph candidate from the document data based on the contents of a header paragraph word dictionary 8. A header range deciding part 9 decides the header range based on the header paragraph candidate and the contents of a header paragraph rule dictionary 10. A header deciding part 11 decides whether the header range is a header word, based on rules for deciding header words. A document structure deciding part 14 decides the document structure, such as chapters and items, based on the decided header words and the contents of a document structure rule dictionary 15.
|
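The flow described in the abstract (candidate extraction, range decision, header decision, structure decision) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not give the dictionary contents or the rules, so `HEADER_WORDS`, `RANGE_RULE`, `MAX_HEADER_LEN`, `LEVELS`, and all function names are hypothetical stand-ins for parts 7 to 15, not the patented implementation.

```python
import re

# Hypothetical stand-ins for the dictionaries named in the abstract.
HEADER_WORDS = ("Chapter", "Section")   # header paragraph word dictionary 8 (assumed contents)
RANGE_RULE = re.compile(r"^(Chapter|Section)\s+\d+(\.\d+)*")  # rule dictionary 10 (assumed rule)
MAX_HEADER_LEN = 60                     # assumed header-word decision rule
LEVELS = {"Chapter": 1, "Section": 2}   # document structure rule dictionary 15 (assumed contents)


def extract_candidates(lines):
    """Part 7: indices of lines that start with a word from the header dictionary."""
    return [i for i, line in enumerate(lines)
            if line.lstrip().startswith(HEADER_WORDS)]


def decide_range(line):
    """Part 9: return the header span (e.g. 'Chapter 1'), or None if no rule applies."""
    match = RANGE_RULE.match(line.strip())
    return match.group(0) if match else None


def is_header(line):
    """Part 11 (assumed rule): a short line that does not end like a sentence."""
    text = line.strip()
    return len(text) <= MAX_HEADER_LEN and not text.endswith(".")


def decide_structure(lines):
    """Part 14: return (level, header range, full line) for each accepted header."""
    result = []
    for i in extract_candidates(lines):
        span = decide_range(lines[i])
        if span is not None and is_header(lines[i]):
            level = LEVELS[span.split()[0]]
            result.append((level, span, lines[i].strip()))
    return result
```

In this sketch the dictionary lookup is a cheap prefix test that narrows the document to a few candidate lines, and the more specific rules run only on those candidates, which is one plausible reading of the efficiency gain the abstract claims.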