FINDING PARTITION BOUNDARIES FOR PARALLEL PROCESSING OF MARKUP LANGUAGE DOCUMENTS
Abstract
A method, a computer program product, and a system identify partition locations within an Extensible Markup Language (XML) document without parsing, so that portions of the document can be processed in parallel. The XML document includes sections that must remain continuous. The document is scanned for these continuous sections without parsing, and the boundaries of initial partitions are adjusted to lie outside the continuous sections, yielding the resulting partitions for the document. The resulting partitions may then be processed in parallel to provide the document information for storage.
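The boundary adjustment described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed method. It assumes that "continuous sections" means XML comments, CDATA blocks, and processing instructions, that initial partitions are equal-sized character ranges, and that a boundary falling inside a continuous section is moved to the end of that section. The names `CONTINUOUS`, `find_continuous_sections`, and `partition_document` are hypothetical.

```python
import re

# Assumption: comments, CDATA blocks, and processing instructions stand in
# for the "continuous sections"; the source does not enumerate them.
CONTINUOUS = re.compile(r"<!--.*?-->|<!\[CDATA\[.*?\]\]>|<\?.*?\?>", re.DOTALL)


def find_continuous_sections(doc):
    """Scan the raw text (no XML parsing) and return (start, end) spans."""
    return [(m.start(), m.end()) for m in CONTINUOUS.finditer(doc)]


def partition_document(doc, num_partitions):
    """Return (start, end) partitions whose boundaries avoid continuous sections."""
    n = len(doc)
    sections = find_continuous_sections(doc)
    # Initial boundaries: equal-sized character ranges (an assumption).
    initial = [i * n // num_partitions for i in range(1, num_partitions)]
    adjusted = set()
    for boundary in initial:
        for start, end in sections:
            if start < boundary < end:
                # Boundary lies inside a continuous section: move it to the end.
                boundary = end
                break
        if 0 < boundary < n:
            adjusted.add(boundary)
    bounds = [0] + sorted(adjusted) + [n]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


doc = "<a><![CDATA[xxxxxxxxxx]]><b/></a>"
parts = partition_document(doc, 2)
```

Here the midpoint of `doc` falls inside the CDATA block, so the boundary moves to just after `]]>`. The sketch does not prevent a boundary from landing inside an ordinary tag; a fuller treatment would need to handle that case as well.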
Application publication number
WO2012041672(A1)
Application publication date
2012.04.05
Application number
WO2011EP65482
Filing date
2011.09.07
Applicant
INTERNATIONAL BUSINESS MACHINES CORPORATION;BAR-OR, AMIR;PADMANABHAN, SRIRAM;ERTEL, SEBASTIAN;BHIDE, MANISH, ANAND;AGARWAL, MANOJ