Title of invention: Semantically aggregated index in an indexer-agnostic index building system
Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
Publication number: US9104749(B2)  Publication date: 2015.08.11
Application number: US201113005425  Filing date: 2011.01.12
Applicant: International Business Machines Corporation  Inventors: Alba Alfredo; DeLuca Chad E; Ercegovac Vuk; Griffin Thomas D; Rao Jun; Singh Asim V; Wang Kevin B
Classification: G06F17/30  Main classification: G06F17/30
Agency:   Agent: Holman Jeffrey T.
Main claim: 1. A computer program product for an indexer-agnostic index building system, comprising: a non-transitory computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index, the operations comprising: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system, wherein distributing the documents to a plurality of nodes comprises distributing the documents to a mapper at each processing node; for each of the plurality of processing nodes, creating a shard fragment by indexing the data objects for each document into fields using semantic rules for identifying related data objects between the documents; and grouping indexed data objects for related fields by: grouping the related documents into one of a plurality of logical groups based on the semantic rules; and creating a full searchable index by: consolidating shard fragments from the plurality of processing nodes into index shards for each of the plurality of logical groups; and combining the index shards into the full searchable index shard.
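The operations recited in the claim can be illustrated with a minimal, single-process sketch. It is not the patented implementation. Every name in it (`classify`, `map_node`, `consolidate`, `combine`, `build_index`) is hypothetical, the "semantic rule" is assumed to be a simple lookup of a `type` field, tokenization is assumed to be whitespace splitting, and the processing nodes are simulated by list partitions rather than real machines.

```python
from collections import defaultdict

def classify(doc):
    """Hypothetical semantic rule: the logical group is the document's 'type' field."""
    return doc.get("type", "other")

def map_node(docs):
    """Mapper on one node: build a shard fragment,
    shaped as group -> (field, token) -> [document ids]."""
    fragment = defaultdict(lambda: defaultdict(list))
    for doc in docs:
        group = classify(doc)
        for field, value in doc.items():
            if field == "id":
                continue  # the id is the posting payload, not an indexed field
            for token in value.lower().split():
                fragment[group][(field, token)].append(doc["id"])
    return fragment

def consolidate(fragments):
    """Merge the fragments from all nodes into one index shard per logical group."""
    shards = defaultdict(lambda: defaultdict(list))
    for fragment in fragments:
        for group, postings in fragment.items():
            for key, ids in postings.items():
                shards[group][key].extend(ids)
    return shards

def combine(shards):
    """Combine the per-group shards into a single searchable index."""
    full = defaultdict(list)
    for postings in shards.values():
        for key, ids in postings.items():
            full[key].extend(ids)
    return {key: sorted(set(ids)) for key, ids in full.items()}

def build_index(documents, num_nodes):
    """Distribute documents round-robin to simulated nodes, then merge the results."""
    partitions = [documents[i::num_nodes] for i in range(num_nodes)]
    fragments = [map_node(part) for part in partitions]
    shards = consolidate(fragments)
    return shards, combine(shards)
```

In this sketch the grouping happens inside the mapper, so each fragment is already keyed by logical group and consolidation reduces to a merge of posting lists. A real system would run the mappers on separate nodes and hand the fragments to whichever indexer is plugged in.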
Address: Armonk NY US