Title of invention: Automatically Mining Patterns For Rule Based Data Standardization Systems
Abstract: Computer program products and systems are provided for mining for sub-patterns within a text data set. The embodiments facilitate finding a set of N frequently occurring sub-patterns within the data set, extracting the N sub-patterns from the data set, and clustering the extracted sub-patterns into K groups, where each extracted sub-pattern is placed within the same group with other extracted sub-patterns based upon a distance value D that determines a degree of similarity between the sub-pattern and every other sub-pattern within the same group.
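The three steps named in the abstract (find N frequent sub-patterns, extract them, cluster them by a distance value D) can be illustrated with a minimal sketch. The abstract does not say how sub-patterns are represented, which distance is used, or how the K groups are formed, so everything below is an assumption made for illustration: records are lists of hypothetical token classes (`NUM`, `WORD`, `STREET`, ...), sub-patterns are contiguous token runs, D is a token-level edit distance, and grouping is a greedy pass in which K is whatever number of groups the threshold produces. The function names are hypothetical as well.

```python
from collections import Counter


def frequent_subpatterns(records, n_top, min_len=2, max_len=3):
    """Return the n_top most frequent contiguous token runs across all records."""
    counts = Counter()
    for tokens in records:
        for size in range(min_len, max_len + 1):
            for start in range(len(tokens) - size + 1):
                counts[tuple(tokens[start:start + size])] += 1
    # Sort by descending count, then by the tuple itself, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [pattern for pattern, _ in ranked[:n_top]]


def edit_distance(a, b):
    """Token-level Levenshtein distance (assumed stand-in for the distance value D)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def cluster_by_distance(patterns, max_distance):
    """Greedy grouping: a pattern joins the first group in which every
    existing member is within max_distance of it; otherwise it starts a new group."""
    groups = []
    for pattern in patterns:
        for group in groups:
            if all(edit_distance(pattern, member) <= max_distance for member in group):
                group.append(pattern)
                break
        else:
            groups.append([pattern])
    return groups


# Hypothetical tokenized address records, one list of token classes per record.
records = [
    ["NUM", "WORD", "STREET"],
    ["NUM", "WORD", "STREET", "UNIT", "NUM"],
    ["NUM", "WORD", "AVENUE"],
    ["UNIT", "NUM"],
]
top = frequent_subpatterns(records, n_top=4)
groups = cluster_by_distance(top, max_distance=1)
for index, group in enumerate(groups):
    print(index, group)
```

The greedy pass checks a candidate against every member of a group, which mirrors the abstract's wording about similarity to "every other sub-pattern within the same group". A system that needs exactly K groups would more likely use a clustering method that takes K as an input; this sketch does not attempt that.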
Publication number: US2013238610(A1) Publication date: 2013.09.12
Application number: US201213414374 Filing date: 2012.03.07
Applicant: CHATURVEDI SNIGDHA;FARUQUIE TANVEER A.;KARANAM HIMA P.;MENDELSSOHN MARVIN;MOHANIA MUKESH K.;SUBRAMANIAM L. VENKATA;INTERNATIONAL BUSINESS MACHINES CORPORATION Inventor: CHATURVEDI SNIGDHA;FARUQUIE TANVEER A.;KARANAM HIMA P.;MENDELSSOHN MARVIN;MOHANIA MUKESH K.;SUBRAMANIAM L. VENKATA
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Main claim:
Address: