Title of invention: Profiling data with location information
Abstract: Profiling data includes processing an accessed collection of records, including: generating, for a first set of distinct values appearing in a first set of one or more fields, corresponding location information; generating, for the first set of fields, a corresponding list of entries identifying a distinct value from the first set of distinct values and the location information for the distinct value; generating, for a second set of one or more fields, a corresponding list of entries, with each entry identifying a distinct value from a second set of distinct values appearing in the second set of fields; and generating result information, based at least in part on: locating at least one record of the collection using the location information for at least one value appearing in the first set of fields, and determining at least one value appearing in the second set of fields of the located record.
Publication number: US9141610(B2) Publication date: 2015.09.22
Application number: US201313958057 Filing date: 2013.08.02
Applicant: Ab Initio Technology LLC Inventor: Anderson Arlen
Classification: G06F17/30 Main classification: G06F17/30
Agency: Occhiuti & Rohlicek LLP Agent: Occhiuti & Rohlicek LLP
Main claim: 1. A method for profiling data stored in at least one data storage system, the method including: accessing at least one collection of records stored in the data storage system over an interface coupled to the data storage system; and processing the collection of records to generate result information characterizing values appearing in one or more specified fields of the collection of records, the processing including: generating, for a first set of distinct values appearing in a first set of one or more fields of the records in the collection, corresponding location information that identifies, for each distinct value in the first set of distinct values, every record in which the distinct value appears, generating, for the first set of one or more fields, a corresponding list of entries, with each entry identifying a distinct value from the first set of distinct values, and the location information for the distinct value, generating, for a second set of one or more fields of the records in the collection different from the first set of one or more fields, a corresponding list of entries, with each entry identifying a distinct value from a second set of distinct values appearing in the second set of one or more fields, and generating the result information characterizing values appearing in the one or more specified fields of the collection of records, based at least in part on: locating at least one record of the collection of records using the location information for at least one value appearing in the first set of one or more fields, and determining at least one value appearing in the second set of one or more fields of the located record.
Address: Lexington MA US
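The processing recited in the main claim can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes records are held in memory as dictionaries and that a record's position in the list serves as its location information. The function names (`build_census`, `distinct_values`, `correlate`) and the sample fields (`state`, `zip`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_census(records, field):
    """List of entries for the first field: distinct value -> record indices.

    Assumption: a record's list index stands in for "location information".
    """
    locations = defaultdict(list)
    for index, record in enumerate(records):
        locations[record[field]].append(index)
    return dict(locations)


def distinct_values(records, field):
    """List of entries for the second field: its distinct values only."""
    return sorted({record[field] for record in records})


def correlate(records, census, second_field):
    """Locate records via the first field's location information and
    read the second field's value from each located record."""
    return {
        value: sorted({records[i][second_field] for i in indices})
        for value, indices in census.items()
    }


# Hypothetical sample data, not taken from the patent.
records = [
    {"state": "MA", "zip": "02421"},
    {"state": "MA", "zip": "02139"},
    {"state": "NY", "zip": "10001"},
    {"state": "MA", "zip": "02421"},
]

state_census = build_census(records, "state")
zip_entries = distinct_values(records, "zip")
result = correlate(records, state_census, "zip")
```

Here `state_census` plays the role of the first list of entries (distinct value plus location information), `zip_entries` the second list, and `result` the result information, in this case which `zip` values co-occur with each `state` value.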