Title of Invention |
Automatically Mining Patterns For Rule Based Data Standardization Systems |
Abstract |
Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
|
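The three steps named in the abstract (find N frequent sub-patterns, extract them, cluster them into K groups by a distance D) can be sketched as follows. This is only an illustrative sketch, not the method claimed in the application. It assumes that patterns are space-separated strings of token-class symbols (the symbols `^`, `+`, and `T` in the usage note are hypothetical), that sub-patterns are contiguous token runs, that D is the Levenshtein distance over tokens, and that grouping uses complete-linkage agglomerative clustering. The function names `frequent_subpatterns`, `edit_distance`, and `cluster` are invented for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_subpatterns(patterns, n, min_len=2, max_len=3):
    """Return the n most frequent contiguous sub-patterns as token tuples."""
    counts = Counter()
    for pattern in patterns:
        tokens = pattern.split()
        for size in range(min_len, max_len + 1):
            for start in range(len(tokens) - size + 1):
                counts[tuple(tokens[start:start + size])] += 1
    # Sort by descending count, then by the tuple itself, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [sub for sub, _ in ranked[:n]]


def edit_distance(a, b):
    """Levenshtein distance between two token sequences (assumed choice of D)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def group_distance(g1, g2):
    """Complete linkage: the largest distance between any two members."""
    return max(edit_distance(a, b) for a in g1 for b in g2)


def cluster(subpatterns, k):
    """Merge the two closest groups repeatedly until only k groups remain."""
    groups = [[s] for s in subpatterns]
    while len(groups) > k:
        i, j = min(combinations(range(len(groups)), 2),
                   key=lambda ij: group_distance(groups[ij[0]], groups[ij[1]]))
        groups[i].extend(groups[j])
        del groups[j]  # j > i, so index i is unaffected
    return groups
```

A call such as `cluster(frequent_subpatterns(["^ + T", "^ + T", "+ T"], n=2), k=1)` would mine the two most frequent sub-patterns from three hypothetical token-class patterns and place them in a single group. Complete linkage is used here because the abstract relates each sub-pattern to every other sub-pattern in its group; other linkage rules or distance measures would fit the same outline.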
Application Publication Number |
US2013238610(A1) |
Application Publication Date |
2013.09.12 |
Application Number |
US201213414374 |
Filing Date |
2012.03.07 |
Applicant |
CHATURVEDI SNIGDHA;FARUQUIE TANVEER A.;KARANAM HIMA P.;MENDELSSOHN MARVIN;MOHANIA MUKESH K.;SUBRAMANIAM L. VENKATA;INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventor |
CHATURVEDI SNIGDHA;FARUQUIE TANVEER A.;KARANAM HIMA P.;MENDELSSOHN MARVIN;MOHANIA MUKESH K.;SUBRAMANIAM L. VENKATA |
Classification Number |
G06F17/30 |
Main Classification Number |
G06F17/30 |
Patent Agency |
|
Patent Agent |
|
Principal Claim |
|
Address |
|