Title of invention: Indexing for Regular Expressions in Text-Centric Applications
Abstract: A method, system, and article are provided for evaluating regular expressions over large data collections. A general-purpose index is built to handle complex regular expressions at the character level. Characters, character classes, and associated metadata are identified and stored in an index of a collection of documents. Given a regular expression, a query is generated based on the contents of the index. This query is executed over the index to identify a set of documents in the collection over which the regular expression can be evaluated. The identified set of documents is then returned for evaluation by the regular expression.
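The flow described in the abstract (index characters and character classes, derive a query from the regular expression, run the query to narrow the documents, then evaluate the expression only on those) can be illustrated with a minimal Python sketch. It is not the patented method: the names `build_index`, `required_keys`, `candidates`, and `search`, the two tracked classes, and the tiny supported regex subset are all assumptions made for illustration. Any syntax the sketch does not understand causes it to fall back to scanning every document, so filtering never drops a true match.

```python
import re
from collections import defaultdict

# Hypothetical character classes tracked by the index (an assumption for this sketch).
CHAR_CLASSES = {
    r"\d": str.isdecimal,
    r"\s": str.isspace,
}

def build_index(docs):
    """Map each character and each tracked class to the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for ch in set(text):
            index[ch].add(doc_id)
            for cls, pred in CHAR_CLASSES.items():
                if pred(ch):
                    index[cls].add(doc_id)
    return index

def required_keys(pattern):
    """Rough query generation: collect characters/classes every match must contain.

    Supports only literals, the classes above, and the quantifiers + * ?.
    Returns an empty set (no filtering) for anything else.
    """
    keys = set()
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and pattern[i:i + 2] in CHAR_CLASSES:
            tok, i = pattern[i:i + 2], i + 2
        elif pattern[i].isalnum() or pattern[i] in " -@":
            tok, i = pattern[i], i + 1
        else:
            return set()  # unsupported syntax: be conservative
        if i < len(pattern) and pattern[i] in "*?":
            i += 1        # token may match zero times, so it is not required
            continue
        if i < len(pattern) and pattern[i] == "+":
            i += 1        # one or more: still required
        keys.add(tok)
    return keys

def candidates(index, all_ids, pattern):
    """Execute the generated query: intersect the posting sets of all required keys."""
    result = set(all_ids)
    for key in required_keys(pattern):
        result &= index.get(key, set())
    return result

def search(docs, index, pattern):
    """Evaluate the regular expression only over the candidate documents."""
    rx = re.compile(pattern)
    return sorted(d for d in candidates(index, docs, pattern) if rx.search(docs[d]))

docs = {1: "call 555-1234 now", 2: "no digits here", 3: "id 42"}
index = build_index(docs)
print(search(docs, index, r"\d+-\d+"))
```

In this sketch the index only prunes documents; the final answer always comes from running the real regular expression on the survivors, which keeps the result correct even when the generated query is loose.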
Publication number: US2010174718(A1)    Publication date: 2010.07.08
Application number: US20090348594    Filing date: 2009.01.05
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION    Inventors: CHEN TING;KRISHNAMURTHY RAJASEKAR;VAITHYANATHAN SHIVAKUMAR
Classification: G06F7/06;G06F17/30    Main classification: G06F7/06
Agency:    Agent:
Main claim:
Address: