Title of invention |
Semantically aggregated index in an indexer-agnostic index building system |
Abstract |
A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups. |
Publication number |
US9104749(B2) |
Publication date |
2015.08.11 |
Application number |
US201113005425 |
Filing date |
2011.01.12 |
Applicant |
International Business Machines Corporation |
Inventors |
Alba Alfredo;DeLuca Chad E;Ercegovac Vuk;Griffin Thomas D;Rao Jun;Singh Asim V;Wang Kevin B |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Agency |
|
Agent |
Holman Jeffrey T. |
Independent claim |
1. A computer program product for an indexer-agnostic index building system, comprising:
a non-transitory computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index, the operations comprising:
extracting documents from a data source, wherein each document includes a data object;
distributing the documents to a plurality of processing nodes within the system, wherein distributing the documents to a plurality of nodes comprises distributing the documents to a mapper at each processing node;
for each of the plurality of processing nodes, creating a shard fragment by indexing the data objects for each document into fields using semantic rules for identifying related data objects between the documents; and
grouping indexed data objects for related fields by:
grouping the related documents into one of a plurality of logical groups based on the semantic rules; and
creating a full searchable index by:
consolidating shard fragments from the plurality of processing nodes into index shards for each of the plurality of logical groups; and
combining the index shards into the full searchable index shard. |
Address |
Armonk NY US |
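
The operations recited in claim 1 (distribute to mappers, build shard fragments, group by semantic rule, consolidate into index shards, combine into a full index) can be illustrated with a minimal single-process sketch. This is not the patented implementation. Every name here (`logical_group`, `build_fragment`, `consolidate`, `combine`, `build_index`), the document layout with `id`, `type` and `data` keys, and the rule "documents with the same `type` are related" are assumptions made for illustration only; the record does not specify any of them.

```python
from collections import defaultdict


def logical_group(doc):
    # Hypothetical semantic rule: documents sharing a "type" are related.
    return doc.get("type", "unknown")


def distribute(docs, num_nodes):
    # Round-robin split that stands in for sending documents to a mapper per node.
    return [docs[i::num_nodes] for i in range(num_nodes)]


def build_fragment(docs):
    # One "mapper": index each data object into (field, term) postings,
    # keyed by the logical group the document falls into.
    fragment = defaultdict(lambda: defaultdict(set))
    for doc in docs:
        group = logical_group(doc)
        for field, value in doc["data"].items():
            for term in str(value).lower().split():
                fragment[group][(field, term)].add(doc["id"])
    return fragment


def consolidate(fragments):
    # Merge the fragments from all nodes into one index shard per logical group.
    shards = defaultdict(lambda: defaultdict(set))
    for fragment in fragments:
        for group, postings in fragment.items():
            for key, ids in postings.items():
                shards[group][key] |= ids
    return shards


def combine(shards):
    # Combine the per-group shards into a single searchable index.
    full = defaultdict(set)
    for postings in shards.values():
        for key, ids in postings.items():
            full[key] |= ids
    return full


def build_index(docs, num_nodes=2):
    fragments = [build_fragment(part) for part in distribute(docs, num_nodes)]
    shards = consolidate(fragments)
    return shards, combine(shards)
```

In this sketch, `shards` maps each logical group to its own postings, while the combined index answers a query across all groups. For example, with two `person` documents and one `paper` document that all mention "ada", `shards["person"]` holds only the two person documents, and the combined index keeps the `name` and `title` fields as separate keys.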