Title of invention |
METHOD FOR AUTOMATICALLY GENERATING REGULAR EXPRESSIONS FOR RELAXED MATCHING OF TEXT PATTERNS |
Abstract |
A method for automatically generating regular expressions for relaxed matching of text patterns. A received input phrase expressed in a natural language is determined to be a plain text pattern. The plain text pattern is automatically tokenized, generating a first token list. Rules loaded from a predefined rule set are automatically applied to the first token list in the order specified by the rule set, modifying the token list by applying a replace-word, split-at-character, or whitespace operator. The modified token list is automatically converted into a regular expression that matches the plain text pattern and one or more variations of it. Using the regular expression for information extraction facilitates both the recall and the precision of the extraction.
|
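The pipeline described in the abstract (tokenize, apply ordered rules, convert to a regular expression) can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the rule set `RULES`, the token representation, the separator patterns, and every function name are assumptions made for this example, and the step that decides whether the input phrase is a plain text pattern is omitted.

```python
import re

# Hypothetical rule set (not from the patent); rules are applied in list order.
RULES = [
    ("split_at_character", "-"),
    ("replace_word", {"address": ("address", "addr")}),
    ("whitespace", None),
]

def tokenize(phrase):
    # Assumed representation: ("word", alternatives) or ("sep", regex fragment).
    return [("word", (w,)) for w in phrase.split()]

def split_at_character(tokens, char):
    # Split word tokens at `char`; the character becomes an optional separator.
    sep = ("sep", r"[\s" + re.escape(char) + r"]?")
    out = []
    for kind, value in tokens:
        parts = [p for p in value[0].split(char) if p] if kind == "word" else []
        if kind == "word" and len(value) == 1 and len(parts) > 1:
            for i, part in enumerate(parts):
                if i:
                    out.append(sep)
                out.append(("word", (part,)))
        else:
            out.append((kind, value))
    return out

def replace_word(tokens, variants):
    # Replace a word by a tuple of accepted variants (case-insensitive lookup).
    out = []
    for kind, value in tokens:
        if kind == "word" and len(value) == 1 and value[0].lower() in variants:
            value = tuple(variants[value[0].lower()])
        out.append((kind, value))
    return out

def whitespace(tokens, _arg=None):
    # Insert a "one or more whitespace" separator between adjacent words.
    out = []
    for tok in tokens:
        if out and out[-1][0] == "word" and tok[0] == "word":
            out.append(("sep", r"\s+"))
        out.append(tok)
    return out

OPERATORS = {
    "split_at_character": split_at_character,
    "replace_word": replace_word,
    "whitespace": whitespace,
}

def to_regex(tokens):
    # Convert the modified token list into a single regular expression string.
    parts = []
    for kind, value in tokens:
        if kind == "sep":
            parts.append(value)
        elif len(value) == 1:
            parts.append(re.escape(value[0]))
        else:
            parts.append("(?:" + "|".join(re.escape(v) for v in value) + ")")
    return r"\b" + "".join(parts) + r"\b"

def generate_regex(phrase, rules=RULES):
    tokens = tokenize(phrase)
    for name, arg in rules:
        tokens = OPERATORS[name](tokens, arg)
    return to_regex(tokens)

pattern = re.compile(generate_regex("e-mail address"), re.IGNORECASE)
```

Under these assumed rules, `pattern` matches the original phrase "e-mail address" as well as variations such as "email addr" or "E mail   Address", while the word boundaries keep it from matching inside a longer word such as "female address".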
Publication number |
US2009070327(A1) |
Publication date |
2009.03.12 |
Application number |
US20070850987 |
Filing date |
2007.09.06 |
Applicant |
LOESER ALEXANDER STEPHAN;RAGHAVAN SRIRAM;VAITHYANATHAN SHIVAKUMAR |
Inventor |
LOESER ALEXANDER STEPHAN;RAGHAVAN SRIRAM;VAITHYANATHAN SHIVAKUMAR |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Agency |

Agent |

Principal claim |

Address |
|