Title: Architecture for an indexer
Abstract: Disclosed is a technique for indexing data. For each token in a set of documents, a sort key is generated that includes a document identifier that indicates whether a section of a document associated with the sort key is an anchor text section or a context section, wherein the anchor text section and the context text section have a same document identifier; it is determined whether a data field associated with the token is a fixed width; when the data field is a fixed width, the token is designated as one for which fixed width sort is to be performed; and, when the data field is a variable length, the token is designated as one for which a variable width sort is to be performed. The fixed width sort and the variable width sort are performed. For each document, the sort keys are used to bring together the anchor text section and the context section of that document.
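The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Posting` record, the bit layout in `make_sort_key` (document identifier in the high bits, section flag in the lowest bit), the 4-byte width threshold, and all function names are assumptions made for illustration. Python's built-in `sorted` stands in for both the fixed width sort and the variable width sort, which the abstract names but does not specify.

```python
from dataclasses import dataclass
from itertools import groupby

# Assumed encoding of the section flag; the abstract does not give one.
CONTEXT, ANCHOR = 0, 1


@dataclass(frozen=True)
class Posting:
    """Hypothetical record for one token occurrence."""
    token: str
    doc_id: int
    section: int  # CONTEXT or ANCHOR
    data: bytes   # data field associated with the token


def make_sort_key(doc_id: int, section: int) -> int:
    # Assumed layout: document identifier in the high bits, section flag
    # in the lowest bit, so both sections of a document sort next to
    # each other while sharing the same document identifier.
    return (doc_id << 1) | section


def is_fixed_width(data: bytes, width: int = 4) -> bool:
    # Assumption: a data field counts as fixed width when it has exactly
    # `width` bytes; anything else is treated as variable length.
    return len(data) == width


def partition(postings, width: int = 4):
    """Designate each posting for the fixed or the variable width sort."""
    fixed, variable = [], []
    for p in postings:
        (fixed if is_fixed_width(p.data, width) else variable).append(p)
    return fixed, variable


def sort_run(run):
    # Stand-in for either sort: order by token, then by sort key.
    return sorted(run, key=lambda p: (p.token, make_sort_key(p.doc_id, p.section)))


def sections_by_document(postings):
    """Use the sort keys to bring each document's sections together."""
    ordered = sorted(postings, key=lambda p: make_sort_key(p.doc_id, p.section))
    return {
        doc_id: sorted({p.section for p in group})
        for doc_id, group in groupby(ordered, key=lambda p: p.doc_id)
    }


if __name__ == "__main__":
    postings = [
        Posting("ibm", 7, ANCHOR, b"\x00\x00\x00\x01"),
        Posting("ibm", 3, CONTEXT, b"abc"),
        Posting("index", 7, CONTEXT, b"\x00\x00\x00\x02"),
        Posting("ibm", 3, ANCHOR, b"\x00\x00\x00\x03"),
    ]
    fixed, variable = partition(postings)
    print([p.doc_id for p in sort_run(fixed)])
    print(sections_by_document(postings))
```

Placing the section flag in the lowest bit is one way to keep the anchor text section and the context section adjacent after sorting. Any encoding that preserves the shared document identifier as the leading part of the key would serve the same purpose.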
Publication number: US7743060(B2)  Publication date: 2010.06.22
Application number: US20070834556  Filing date: 2007.08.06
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventors: FONTOURA MARCUS FELIPE;NEUMANN ANDREAS;RAJAGOPALAN SRIDHAR;SHEKITA EUGENE J.;ZIEN JASON YEONG
Classification: G06F7/00;G06F17/00;G06F17/30  Main classification: G06F7/00
Agency:  Agent:
Main claim:
Address: