Title: Automated document revision markup and change control
Abstract: Automated comparison of Darwin Information Typing Architecture (DITA) documents for revision mark-up includes reading document data from first and second DITA documents into respective document object model trees of nodes, and identifying and collapsing emphasis subtree nodes in the trees into their parent nodes, the collapsing caching emphasis data from the identified subtree nodes. A traversal transforms the model trees into respective node lists and captures adjacent sibling emphasis subtree nodes as single text nodes. The node lists are merged into a merged node list by recognizing matches of node pairs whose primary sort key information and document structure metadata meet a match threshold, and differences between matching tokens of the node pairs are saved. A merged document object model built from the refined merged node list is transformed into a hypertext mark-up language document.
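The collapsing and traversal steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that the emphasis elements are the DITA highlighting tags `b`, `i` and `u`, that the cached emphasis data is a tuple of parent tag, emphasis tag, character offset and text, and that the primary sort key is the whitespace-normalized text of a node with the element path standing in for document structure metadata. The names `collapse_emphasis` and `to_node_list` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: these tags stand in for "emphasis subtree nodes".
EMPHASIS_TAGS = {"b", "i", "u"}


def collapse_emphasis(parent, cache):
    """Fold leaf emphasis children into the surrounding text of `parent`.

    Each folded child is recorded in `cache` as
    (parent tag, emphasis tag, offset within the receiving text, emphasis text).
    """
    # Depth-first, so nested emphasis is folded before its container is examined.
    for child in list(parent):
        collapse_emphasis(child, cache)

    last_kept = None  # most recent child that stays in the tree
    for child in list(parent):
        if child.tag in EMPHASIS_TAGS and len(child) == 0:
            inner = child.text or ""
            tail = child.tail or ""
            if last_kept is None:
                # No kept sibling before it: text flows into parent.text.
                start = len(parent.text or "")
                parent.text = (parent.text or "") + inner + tail
            else:
                # Otherwise it flows into the tail of the previous kept sibling.
                start = len(last_kept.tail or "")
                last_kept.tail = (last_kept.tail or "") + inner + tail
            cache.append((parent.tag, child.tag, start, inner))
            parent.remove(child)
        else:
            last_kept = child


def to_node_list(root):
    """Pre-order traversal into a flat list of {"key", "path"} records.

    Simplification: tail text after non-emphasis children is ignored.
    """
    nodes = []

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        key = " ".join((elem.text or "").split())
        if key:
            nodes.append({"key": key, "path": here})
        for child in elem:
            walk(child, here)

    walk(root, "")
    return nodes
```

Because adjacent emphasis siblings are folded into the same running text, a paragraph such as `<p>Press <b>Enter</b> to <i>save</i> now.</p>` ends up as a single text node, while the cache keeps enough information to restore the emphasis later.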
Publication number: US9208136(B2)   Publication date: 2015.12.08
Application number: US201313736137   Filing date: 2013.01.08
Applicant: International Business Machines Corporation   Inventor: Fischer Stephen E.
Classification: G06F17/00;G06F17/22;G06F17/21   Main classification: G06F17/00
Agency: Driggs, Hogg, Daugherty & Del Zoppo Co., LPA   Agent: Daugherty Patrick J.;Driggs, Hogg, Daugherty & Del Zoppo Co., LPA
Main claim: 1. A method for automated comparison of Darwin Information Typing Architecture (DITA) documents for revision mark-up, the method comprising: reading via a processing unit document data from a first DITA document into a first document object model tree comprising a plurality of nodes, and from a second DITA document into a second document object model tree comprising a plurality of nodes; identifying and collapsing via the processing unit emphasis subtree nodes in the first document object model tree into their parent nodes in the first document object model tree, and emphasis subtree nodes in the second document object model tree into their parent nodes in the second document object model tree, the collapsing comprising caching emphasis data from the identified subtree nodes; the processing unit transforming the first document object model tree into a first node list, and the second document object model tree into a second node list, the listed nodes each comprising primary sort key information and document structure metadata; merging via the processing unit the first and second node lists into a merged node list by recognizing matches of node pairs from each list that have primary sort key information and document structure metadata meeting a threshold percentage of match, and that saves differences between matching tokens of the node pairs; separating out table segments from the merged node list into a table node list and a non-table node list; recovering the cached emphasis data for the table segments in the table node list; building a merged document object model from the merged node list and the non-table node list and the recovered cached emphasis data for the table segments in the table node list; and transforming the built merged document object model into a hypertext mark-up language document that displays the saved differences between the matching tokens as word-level highlighting mark-ups within the refined tables.
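The merging and highlighting steps of the claim can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method itself: node records are dictionaries with a `key` (text) and a `path` (structure metadata), a pair matches only when the paths are equal and the word-level similarity from `difflib.SequenceMatcher` reaches a threshold of 0.6, and differences are rendered with `<del>` and `<ins>` tags. The threshold value, the greedy in-order matching strategy, and the names `token_diff`, `merge_node_lists` and `render_html` are all hypothetical. Table separation and emphasis recovery are omitted.

```python
from difflib import SequenceMatcher
from html import escape

MATCH_THRESHOLD = 0.6  # assumption: the source does not state a value


def _opcodes(old, new):
    """Word-level edit operations between two strings."""
    a, b = old.split(), new.split()
    return a, b, SequenceMatcher(None, a, b, autojunk=False).get_opcodes()


def token_diff(old, new):
    """Return only the non-equal operations as (op, old tokens, new tokens)."""
    a, b, ops = _opcodes(old, new)
    return [(tag, a[i1:i2], b[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def merge_node_lists(first, second, threshold=MATCH_THRESHOLD):
    """Greedy in-order merge; unmatched nodes become 'deleted' or 'added'."""
    merged, j = [], 0  # j is the next unconsumed index in `second`
    for node in first:
        match = None
        for k in range(j, len(second)):
            cand = second[k]
            if cand["path"] != node["path"]:
                continue  # structure metadata must agree (simplification)
            score = SequenceMatcher(
                None, node["key"].split(), cand["key"].split(), autojunk=False
            ).ratio()
            if score >= threshold:
                match = k
                break
        if match is None:
            merged.append({"status": "deleted", "old": node})
            continue
        # Nodes skipped in `second` have no counterpart in `first`.
        merged.extend({"status": "added", "new": n} for n in second[j:match])
        diffs = token_diff(node["key"], second[match]["key"])
        merged.append({
            "status": "changed" if diffs else "same",
            "old": node, "new": second[match], "diffs": diffs,
        })
        j = match + 1
    merged.extend({"status": "added", "new": n} for n in second[j:])
    return merged


def render_html(old, new):
    """Render word-level differences with <del>/<ins> highlighting."""
    a, b, ops = _opcodes(old, new)
    out = []
    for tag, i1, i2, j1, j2 in ops:
        if tag == "equal":
            out.append(escape(" ".join(a[i1:i2])))
            continue
        if i1 < i2:
            out.append("<del>" + escape(" ".join(a[i1:i2])) + "</del>")
        if j1 < j2:
            out.append("<ins>" + escape(" ".join(b[j1:j2])) + "</ins>")
    return " ".join(out)
```

Requiring exact path equality is stricter than the claim, which lets the structure metadata contribute to a percentage match; a closer approximation would fold a path-similarity score into `score` before comparing it with the threshold.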
Address: Armonk NY US