Title of invention |
System and method for processing semi-structured business data using selected template designs |
Abstract |
A method for processing semi-structured data. The method includes receiving semi-structured data in a first format from a real business process. Preferably, the semi-structured data are machine generated. The method includes tokenizing the semi-structured data into a second format, storing the semi-structured data in the second format in one or more memories, and clustering the tokenized data to form a plurality of clusters. The method also includes identifying a selected low-frequency term in each of the clusters, and processing at least two of the clusters and the associated selected low-frequency terms to form a single template for the at least two clusters. In a preferred embodiment, the method replaces the selected low-frequency term with a wildcard character.
|
Publication number |
US7389306(B2) |
Publication date |
2008.06.17 |
Application number |
US20040895624 |
Filing date |
2004.07.20 |
Applicant |
ENKATA TECHNOLOGIES, INC. |
Inventors |
SCHUETZE HINRICH H.;YU CHIA-HAO;VELIPASAOGLU OMER EMRE;STUKOV STAN |
Classification |
G06F17/00 |
Main classification |
G06F17/00 |
Patent agency |
|
Agent |
|
Main claim |
|
Address |
|
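The steps named in the abstract (tokenize, cluster, find low-frequency terms, merge clusters into one template with a wildcard) can be illustrated with a small Python sketch. This is not the patented implementation: the abstract does not specify the clustering algorithm, the frequency threshold, or the merge rule, so the greedy Jaccard clustering, the `threshold`, `max_fraction`, and `max_diff` parameters, all function names, and the sample log lines below are assumptions chosen only for illustration.

```python
from collections import Counter

WILDCARD = "*"  # assumed wildcard character

def tokenize(line):
    # "Second format" is assumed here to be a whitespace-split token list.
    return line.split()

def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 1.0

def cluster(records, threshold=0.6):
    # Assumed greedy single pass: a record joins the first cluster whose
    # seed (first member) is similar enough, else it starts a new cluster.
    clusters = []
    for rec in records:
        for members in clusters:
            if jaccard(rec, members[0]) >= threshold:
                members.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters

def low_frequency_terms(members, max_fraction=0.5):
    # Terms that occur in at most max_fraction of the cluster's records.
    df = Counter(tok for rec in members for tok in set(rec))
    return {tok for tok, n in df.items() if n / len(members) <= max_fraction}

def cluster_template(members):
    # Replace each low-frequency term in the seed record with the wildcard.
    rare = low_frequency_terms(members)
    return tuple(WILDCARD if tok in rare else tok for tok in members[0])

def merge_templates(t1, t2, max_diff=1):
    # Merge two cluster templates that have equal length and differ in at
    # most max_diff positions; differing positions become wildcards.
    if len(t1) != len(t2):
        return None
    if sum(a != b for a, b in zip(t1, t2)) > max_diff:
        return None
    return tuple(a if a == b else WILDCARD for a, b in zip(t1, t2))

def build_templates(lines):
    templates = []
    for members in cluster([tokenize(line) for line in lines]):
        tpl = cluster_template(members)
        for i, existing in enumerate(templates):
            merged = merge_templates(existing, tpl)
            if merged is not None:
                templates[i] = merged
                break
        else:
            templates.append(tpl)
    return [" ".join(t) for t in templates]

# Hypothetical machine-generated records, invented for this example.
sample = [
    "user alice logged in from 10.0.0.1",
    "user alice logged in from 10.0.0.2",
    "user bob logged in from 10.0.0.3",
    "user bob logged in from 10.0.0.4",
    "disk sda1 usage 91 percent",
    "disk sda1 usage 75 percent",
]
templates = build_templates(sample)
# templates == ["user * logged in from *", "disk sda1 usage * percent"]
```

With these assumed thresholds, the "alice" and "bob" records land in separate clusters, each cluster's IP address is flagged as low frequency, and the two cluster templates are then merged into the single template `user * logged in from *`.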