Title of invention: Data conversion and search systems
Abstract: <p>We describe a system for converting a plurality of collections of data to a structured searchable format, the system comprising: a set of data feeds, one from each of a plurality of data collection sources; temporary data storage coupled to said data feeds to store data from said collection sources for format-conversion processing; non-volatile storage storing processor control code for said format converting; a database storing a data format conversion table comprising format data and data in said structured searchable format; and a processor coupled to said non-volatile storage, to said database, to said temporary data storage, and to said non-volatile storage storing said code; wherein said code is configured to control said processor to: input a collection data item from a said data collection source; identify, within said data item, a set of fields defining: i) an identification number of said data item; ii) a type identification of said data item; and iii) at least one date for said data item; request, from said data format conversion table, format data defining potential formats for said collection data item, selected responsive to a content of a said field defining said identification number of said data item, said type identification of said data item, and said at least one date for said data item; determine a subset of said potential formats by testing a format of said identification number of said data item in said collection data item for compatibility with a format defined by each of said potential formats defined by said format data; and then for each format of said subset of potential formats: process an alphanumeric string in said collection data item comprising said identification number of said data item to attempt to extract from said string at least said identification number of said data item, a date defined by a year, and one or more alphabetical prefix or suffix letters to provide extracted item data, and test whether said identification number of said data item, said date defined by a year and said one or more alphabetical prefix or suffix letters are correctly extracted from said string, wherein said test of said correct extraction includes at least matching data in said extracted item data to fields in an output data format defining said structured, searchable format to determine whether a correct match is made; until a said correct match is found; and then assign said data in said extracted item data to said fields in said output data format defining said structured, searchable format; and store said extracted data from said collection data item in said database in said structured searchable format.</p>
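The format-matching loop in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `FormatRule`, `FORMAT_TABLE`, `potential_formats`, `extract` and `convert`, the two regular-expression formats, and the rule that the extracted year must equal the item's date year are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormatRule:
    """One row of a hypothetical format conversion table."""
    name: str
    item_type: str        # type identification the rule applies to
    pattern: re.Pattern   # named groups: prefix, year, number, suffix


# Illustrative rules only; real numbering formats differ per data source.
FORMAT_TABLE = [
    FormatRule("year_first", "application", re.compile(
        r"(?P<prefix>[A-Z]{0,2})(?P<year>\d{4})(?P<number>\d{6})(?P<suffix>[A-Z]?)")),
    FormatRule("year_last", "application", re.compile(
        r"(?P<prefix>[A-Z]{0,2})(?P<number>\d{6})(?P<year>\d{4})(?P<suffix>[A-Z]?)")),
]

# Fields of the assumed structured, searchable output format.
OUTPUT_FIELDS = ("number", "year", "prefix", "suffix")


def potential_formats(item_type: str) -> list[FormatRule]:
    """Request the potential formats for an item from the table."""
    return [rule for rule in FORMAT_TABLE if rule.item_type == item_type]


def extract(rule: FormatRule, raw: str, item_year: int) -> Optional[dict]:
    """Attempt extraction with one format and test the result."""
    match = rule.pattern.fullmatch(raw)
    if match is None:
        return None
    record = {
        "number": match["number"],
        "year": int(match["year"]),
        "prefix": match["prefix"],
        "suffix": match["suffix"],
    }
    # Assumed correctness test: every output field is filled and the
    # extracted year agrees with the date field of the item.
    if set(record) != set(OUTPUT_FIELDS) or record["year"] != item_year:
        return None
    return record


def convert(item: dict) -> Optional[dict]:
    """Try each compatible format until one yields a correct match."""
    raw = item["id"].strip().upper()
    candidates = potential_formats(item["type"])
    # Subset of formats whose shape is compatible with the raw string.
    compatible = [r for r in candidates if r.pattern.fullmatch(raw)]
    for rule in compatible:
        record = extract(rule, raw, item["year"])
        if record is not None:
            return {"format": rule.name, **record}
    return None
```

For the made-up item `{"id": "EP1234562014A", "type": "application", "year": 2014}`, both rules are shape-compatible, but `year_first` extracts the year 1234 and is rejected, so `convert` returns the `year_last` record. In a full system the returned record would then be written to the database.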
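Publication number: EP2806363(A3) Publication date: 2015.06.03
Application number: EP20140169162 Filing date: 2014.05.20
Applicant: RWS GROUP LIMITED Inventor: PRICE, ALAN PETER
Classification: G06F17/22;G06F17/27 Main classification: G06F17/22
Agency: Agent:
Main claim:
Address: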