Title of invention: Intermediate data format for database population
Abstract: An approach is provided that receives input from multiple data sources and transforms it into a common, intermediate format, so that only one generic parser is required to transform the data into RDF for subsequent input to a triplestore database. A triplestore management tool provides this capability. The triplestore management tool includes a formatting component configured to receive data sources from a plurality of data source parsers and transform each of the data sources into a single format. A parsing component parses each transformed data source at a common parser and loads each of the transformed data sources from the common parser into a triplestore database.
Publication number: US9471653(B2) Publication date: 2016.10.18
Application number: US201113282083 Filing date: 2011.10.26
Applicant: International Business Machines Corporation Inventors: Bostick James E.;Ganci, Jr. John M.;Kaemmerer John P.;Trim Craig M.
Classification: G06F7/00;G06F17/00;G06F17/30 Main classification: G06F7/00
Agency: Keohane & D'Alessandro PLLC Agents: Pivnichny John R.;Schiesser Madeline F.;Keohane & D'Alessandro PLLC
Main claim: 1. A method for triplestore database population, comprising: receiving a plurality of data sources parsed by a plurality of data source parsers, wherein each of the plurality of data sources corresponds to each of the plurality of data source parsers according to a data type of each of the plurality of data sources; transforming each of the plurality of parsed data sources into a single intermediary format, the intermediary format not being associated with a database into which the data sources are to be stored; determining a triplestore database into which the data sources are to be stored; selecting a first shared parser based on compatibility with the triplestore database; receiving the intermediary format at the first shared parser, wherein the shared parser is configured to parse from the intermediary format; transforming the intermediary format into a generic, text-based file format at the first shared parser; loading each of the plurality of data sources from the shared parser into the triplestore database; replacing the triplestore database with a second triplestore database; selecting a second shared parser to receive the intermediary format, the second shared parser being selected for compatibility with the second triplestore database; and replacing the first shared parser with the second shared parser, wherein the second shared parser is configured to parse from the intermediary format to a format of the second triplestore database, and wherein the plurality of data source parsers are not replaced.
Address: Armonk NY US
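Illustrative sketch (not part of the patent record): the pipeline described in the abstract and main claim can be approximated in a few lines of Python. Everything named here is an assumption made for illustration: the `Statement` record standing in for the intermediary format, the `parse_csv` and `parse_json` source parsers, the `http://example.org/` base IRI, the N-Triples and tab-separated output formats, and the `InMemoryStore` class standing in for a triplestore. The patent does not specify any of these. The sketch only shows the structural point of the claim: source parsers emit one database-independent format, and only the shared parser is swapped when the target store changes.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Statement:
    """Hypothetical intermediary record; not tied to any particular database."""
    subject: str
    predicate: str
    obj: str


def parse_csv(text: str) -> List[Statement]:
    """Source parser for CSV input; assumes an 'id' column names the subject."""
    out = []
    for row in csv.DictReader(io.StringIO(text)):
        for key, value in row.items():
            if key != "id":
                out.append(Statement(row["id"], key, value))
    return out


def parse_json(text: str) -> List[Statement]:
    """Source parser for a JSON array of objects; assumes each has an 'id' key."""
    out = []
    for record in json.loads(text):
        for key, value in record.items():
            if key != "id":
                out.append(Statement(record["id"], key, str(value)))
    return out


BASE = "http://example.org/"  # assumed base IRI, for illustration only


def to_ntriples(statements: List[Statement]) -> str:
    """First shared parser: intermediary format to N-Triples text."""
    def esc(s: str) -> str:
        return s.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join(
        f'<{BASE}{s.subject}> <{BASE}{s.predicate}> "{esc(s.obj)}" .'
        for s in statements
    )


def to_tsv(statements: List[Statement]) -> str:
    """Second shared parser: intermediary format to a made-up tab-separated format."""
    return "\n".join(f"{s.subject}\t{s.predicate}\t{s.obj}" for s in statements)


class InMemoryStore:
    """Stand-in for a triplestore; it simply keeps the loaded lines."""
    def __init__(self) -> None:
        self.lines: List[str] = []

    def load(self, payload: str) -> None:
        self.lines.extend(payload.splitlines())


def populate(
    sources: List[Tuple[Callable[[str], List[Statement]], str]],
    shared_parser: Callable[[List[Statement]], str],
    store: InMemoryStore,
) -> None:
    """Run every source parser, then pass the merged result to one shared parser."""
    intermediate: List[Statement] = []
    for source_parser, text in sources:
        intermediate.extend(source_parser(text))
    store.load(shared_parser(intermediate))


sources = [
    (parse_csv, "id,name\nalice,Alice\n"),
    (parse_json, '[{"id": "bob", "name": "Bob"}]'),
]

first_store = InMemoryStore()
populate(sources, to_ntriples, first_store)   # first store, first shared parser

second_store = InMemoryStore()
populate(sources, to_tsv, second_store)       # store replaced; only the shared parser changes
```

In this sketch, replacing the target store means passing a different `shared_parser` function to `populate`, while `parse_csv` and `parse_json` are reused unchanged, which mirrors the claim's requirement that the data source parsers are not replaced.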