Title of invention: System and method for processing semi-structured business data using selected template designs
Abstract: A method for processing semi-structured data. The method includes receiving semi-structured data in a first format from a real business process. Preferably, the semi-structured data are machine-generated. The method includes tokenizing the semi-structured data into a second format, storing the semi-structured data in the second format in one or more memories, and clustering the tokenized data to form a plurality of clusters. The method also includes identifying a selected low-frequency term in each of the clusters, and processing at least two of the clusters and their associated selected low-frequency terms to form a single template for those clusters. In a preferred embodiment, the method replaces the selected low-frequency term with a wildcard character.
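The steps named in the abstract (tokenize, cluster, pick a low-frequency term per cluster, merge clusters into one template with a wildcard) can be illustrated with a minimal Python sketch. This is not the patented implementation: the abstract does not specify the tokenizer, the clustering method, or the frequency criterion, so everything below is an assumption made for illustration. Whitespace tokenization, clustering by identical token sequence, "lowest corpus-wide count" as the low-frequency criterion, and the names `tokenize`, `cluster`, `low_frequency_term`, and `build_templates` are all hypothetical.

```python
from collections import Counter, defaultdict

WILDCARD = "*"  # assumed wildcard character; the abstract does not name one


def tokenize(record):
    """Convert a raw record (first format) into a token tuple (second format).

    Assumption: plain whitespace splitting stands in for the real tokenizer.
    """
    return tuple(record.split())


def cluster(token_seqs):
    """Group tokenized records into clusters.

    Assumption: records with identical token sequences form one cluster.
    Returns a dict mapping each distinct sequence to its member records.
    """
    clusters = defaultdict(list)
    for seq in token_seqs:
        if seq:  # skip empty records
            clusters[seq].append(seq)
    return clusters


def low_frequency_term(seq, term_counts):
    """Pick the token of a cluster with the lowest corpus-wide count.

    On ties, the earliest such token in the sequence is chosen.
    """
    return min(seq, key=lambda term: term_counts[term])


def build_templates(records):
    """Merge clusters that become identical once their low-frequency term
    is replaced by the wildcard. Returns {template: [cluster texts]}."""
    token_seqs = [tokenize(r) for r in records]
    term_counts = Counter(t for seq in token_seqs for t in seq)

    templates = defaultdict(list)
    for seq in cluster(token_seqs):
        rare = low_frequency_term(seq, term_counts)
        template = tuple(WILDCARD if t == rare else t for t in seq)
        templates[" ".join(template)].append(" ".join(seq))
    return dict(templates)


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        "order 1001 shipped",
        "order 1001 shipped",
        "order 1002 shipped",
        "order 1003 shipped",
        "login failed alice",
        "login failed bob",
    ]
    for template, members in build_templates(sample).items():
        print(template, "<-", members)
```

With the made-up sample above, the three "order" clusters collapse into one template and the two "login" clusters into another, which mirrors the abstract's "single template for at least two clusters". A real system would likely need a similarity-based clustering step and a more careful frequency threshold than this sketch uses.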
Publication number: US7389306(B2) Publication date: 2008.06.17
Application number: US20040895624 Filing date: 2004.07.20
Applicant: ENKATA TECHNOLOGIES, INC. Inventors: SCHUETZE HINRICH H.; YU CHIA-HAO; VELIPASAOGLU OMER EMRE; STUKOV STAN
Classification: G06F17/00 Main classification: G06F17/00
Agency: Agent:
Main claim:
Address: