Title of invention: SUGGESTING PATTERNS IN UNSTRUCTURED DOCUMENTS
Abstract: A technique for suggesting patterns to search documents for information of interest includes acquiring a working set of spans for a document set that includes one or more documents. A list of one or more suggested patterns is generated by applying a pattern suggestion algorithm (PSA) to the set of spans for each document in the document set. One or more unique patterns are generated by applying a pattern consolidation algorithm (PCA) to the generated list of suggested patterns. Pattern information for each of the unique patterns is then generated. The pattern information includes a respective first count that corresponds to the number of times each of the unique patterns occurs in the document set.
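Application publication number: US2016188537(A1)  Application publication date: 2016.06.30
Application number: US201514837148  Filing date: 2015.08.27
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION  Inventors: BHATIA DIMPLE;BROWN ARMAGEDDON R.;LI YUNYAO;ZAGELOW MARGARET
Classification number: G06F17/21  Main classification number: G06F17/21
Agency:  Agent: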
Principal claim: 1. A method of suggesting patterns to search documents for information of interest, comprising: acquiring, using a data processing system, a working set of spans for a document set that includes one or more documents; generating, using the data processing system, a list of one or more suggested patterns by applying a pattern suggestion algorithm (PSA) to the set of spans for each document in the document set; generating, using the data processing system, one or more unique patterns by applying a pattern consolidation algorithm (PCA) to the suggested patterns; and generating, using the data processing system, pattern information for each of the unique patterns, wherein the pattern information includes a respective first count that corresponds to the number of times each of the unique patterns occurs in the document set.
Address: ARMONK NY US
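
The four steps recited in the claim (acquire spans, suggest patterns, consolidate them, count occurrences) can be illustrated with a minimal sketch. This record does not disclose how the PSA or PCA work internally, so everything below is an assumption made for illustration only: the function names `suggest_pattern`, `consolidate`, and `pattern_information` are hypothetical, the stand-in PSA simply generalizes each span into a character-class regular expression, and the stand-in PCA simply removes duplicate patterns.

```python
import re
from typing import Dict, List

# Hypothetical tokenizer (an assumption, not from the patent): a span is split
# into digit runs, letter runs, whitespace runs, or any other single character.
_TOKEN = re.compile(
    r"(?P<digits>[0-9]+)|(?P<letters>[A-Za-z]+)|(?P<space>\s+)|(?P<other>.)",
    re.DOTALL,
)
_GENERALIZED = {"digits": "[0-9]+", "letters": "[A-Za-z]+", "space": r"\s+"}


def suggest_pattern(span: str) -> str:
    """Stand-in PSA: generalize one span into a regular expression."""
    parts = []
    for match in _TOKEN.finditer(span):
        kind = match.lastgroup
        if kind == "other":
            # Punctuation and other characters are kept literally.
            parts.append(re.escape(match.group()))
        else:
            parts.append(_GENERALIZED[kind])
    return "".join(parts)


def consolidate(patterns: List[str]) -> List[str]:
    """Stand-in PCA: keep each distinct pattern once, in first-seen order."""
    return list(dict.fromkeys(patterns))


def pattern_information(
    documents: List[str], spans_by_document: List[List[str]]
) -> Dict[str, int]:
    """Map each unique pattern to the number of times it occurs in the document set."""
    suggested = [
        suggest_pattern(span)
        for spans in spans_by_document
        for span in spans
        if span  # skip empty spans, which would yield an empty pattern
    ]
    unique = consolidate(suggested)
    return {
        pattern: sum(len(re.findall(pattern, doc)) for doc in documents)
        for pattern in unique
    }


# Illustrative data only: two short documents and the spans selected in each.
documents = [
    "Call 555-1234 or 555-9876.",
    "Order 42 shipped; call 555-0000.",
]
spans_by_document = [["555-1234"], ["555-0000", "42"]]
info = pattern_information(documents, spans_by_document)
```

With this sample data, the two phone-number spans collapse into a single pattern, so `info` holds two unique patterns, each paired with its occurrence count across both documents. A real consolidation step could also merge near-equivalent patterns rather than only exact duplicates; that choice is left open here because the record does not specify it.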