Abstract |
PURPOSE: To provide a document processor that can easily and efficiently extract only the content character string of document data, without complicated operations and without regard to the document structure, so that the extracted string can then be processed, for example cut and pasted. CONSTITUTION: The document processor, which handles a structured document whose document elements are nested, lets a range be specified arbitrarily across the boundaries of the document elements (2, 11, 31, 33, and 34) and extracts the character string in the specified range (12 and 32). The range is specified by a start position 33 and an end position 34, each consisting of a value identifying a character string element, the minimum unit of the document elements, and a value giving the position of a character within that character string element. The characters lying between the start position and the end position are read out of the character string elements 31 and concatenated in order into a single character string 32. |
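
The extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Element`, `Position`, `string_elements`, and `extract_range` are hypothetical, character string elements are assumed to be identified by their index in document order, and the end position is assumed to be exclusive (the abstract does not say whether the end character is included).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    """A nested document element (hypothetical model of the structured document)."""
    tag: str
    children: List[Union["Element", str]] = field(default_factory=list)


@dataclass(frozen=True)
class Position:
    """A start or end position: which character string element, and which character in it."""
    leaf: int    # index of the character string element in document order (assumption)
    offset: int  # character offset inside that element


def string_elements(node: Union[Element, str]) -> List[str]:
    """Collect the character string elements (leaves) in document order."""
    if isinstance(node, str):
        return [node]
    leaves: List[str] = []
    for child in node.children:
        leaves.extend(string_elements(child))
    return leaves


def extract_range(root: Element, start: Position, end: Position) -> str:
    """Read the characters between start and end, ignoring element borders.

    The end offset is treated as exclusive (assumption).
    """
    leaves = string_elements(root)
    parts = []
    for i in range(start.leaf, end.leaf + 1):
        lo = start.offset if i == start.leaf else 0
        hi = end.offset if i == end.leaf else len(leaves[i])
        parts.append(leaves[i][lo:hi])
    # Connect the pieces successively into one character string.
    return "".join(parts)


doc = Element("doc", [
    Element("section", [
        Element("para", ["Hello, "]),
        Element("para", [Element("em", ["structured"]), " world"]),
    ]),
])

# The range starts inside the first paragraph and ends inside the second one.
selected = extract_range(doc, Position(0, 3), Position(2, 3))
print(selected)
```

Because each position names a leaf and an offset rather than a path through the nesting, the caller never has to reason about where one paragraph or emphasis element ends and the next begins.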