Abstract |
PURPOSE: A method for extracting document structure information is provided to extract a DTD (Document Type Definition) from an XML (eXtensible Markup Language) document that has no DTD, so that the DTD is generated quickly and efficiently. CONSTITUTION: A root tree node is generated (S1). A current path and an insert pointer are generated and initialized (S2). In the tree construction step, an XML tokenizer extracts elements: the tags in the XML document are analyzed, and start-tag, end-tag, and PCDATA tokens are extracted (S3). Whether an element with the same name already exists is checked (S4). If a tree node with the same name exists, the new node is added (S5-1). Steps S3 to S5 are repeated for all elements (S6).
|
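Steps S1 to S6 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every name in it (`Node`, `tokenize`, `build_structure_tree`, `to_dtd`) is hypothetical. The abstract does not spell out the branch logic at S5, so the sketch assumes that an existing same-named child is reused and a new node is added only when none is found. The DTD content models it emits are likewise an assumed, deliberately loose form. The regex tokenizer assumes well-formed input and ignores CDATA sections and `>` characters inside attribute values or comments.

```python
import re
from dataclasses import dataclass, field

# Simplified tokenizer pattern (assumption): skips <?...?> and <!...> constructs,
# then matches start/end/empty tags, then character data.
TOKEN_RE = re.compile(r"<[?!][^>]*>|<(/?)([A-Za-z_][\w.:-]*)[^>]*?(/?)>|([^<]+)")


@dataclass
class Node:
    name: str
    children: dict = field(default_factory=dict)  # child name -> Node, in first-seen order
    has_text: bool = False


def tokenize(xml):
    """S3: yield ("START", name), ("END", name) and ("PCDATA", text) tokens."""
    for match in TOKEN_RE.finditer(xml):
        closing, name, self_closing, text = match.groups()
        if text is not None:
            if text.strip():
                yield ("PCDATA", text.strip())
        elif name is None:
            continue  # declaration, comment or processing instruction
        elif closing:
            yield ("END", name)
        else:
            yield ("START", name)
            if self_closing:
                yield ("END", name)


def build_structure_tree(xml):
    root = Node("#root")  # S1: root tree node
    path = [root]         # S2: current path; path[-1] acts as the insert pointer
    for kind, value in tokenize(xml):  # S6: loop over all tokens
        current = path[-1]
        if kind == "START":
            child = current.children.get(value)  # S4: same-named node present?
            if child is None:                    # S5 (assumed): add only if missing
                child = Node(value)
                current.children[value] = child
            path.append(child)
        elif kind == "END":
            path.pop()
        else:
            current.has_text = True
    return root


def to_dtd(root):
    """Emit one loose <!ELEMENT ...> declaration per element name."""
    models = {}  # element name -> {"children": [...], "text": bool}

    def visit(node):
        entry = models.setdefault(node.name, {"children": [], "text": False})
        entry["text"] = entry["text"] or node.has_text
        for child in node.children.values():
            if child.name not in entry["children"]:
                entry["children"].append(child.name)
            visit(child)

    for top in root.children.values():
        visit(top)

    lines = []
    for name, entry in models.items():
        if entry["children"]:
            parts = (["#PCDATA"] if entry["text"] else []) + entry["children"]
            model = "(" + " | ".join(parts) + ")*"
        elif entry["text"]:
            model = "(#PCDATA)"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
    return "\n".join(lines)


sample = (
    "<library><book><title>A</title><author>X</author></book>"
    "<book><title>B</title></book></library>"
)
print(to_dtd(build_structure_tree(sample)))
```

For the sample document, the second `book` element reuses the node created for the first one, so the tree holds a single `book` node with `title` and `author` children, and the output contains one declaration per element name. A real extractor would also have to infer ordering and cardinality (`?`, `+`, sequences), which this sketch does not attempt.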