Title: SYSTEMS AND METHODS FOR RETRIEVING TABULAR DATA FROM TEXTUAL SOURCES
Abstract: Tables are an important kind of data element in text retrieval. Often, the gist of an entire news article or other exposition can be captured concisely in tabular form. Information other than the keywords in a digital document can be exploited to give users more flexible and powerful query capabilities. More specifically, the structural information in a document is exploited to identify tables and their component fields and to let users query based on these fields. Component fields can include table lines, caption lines, row headings, column headings, or other table components. Empirical results have demonstrated that heuristic-based table extraction and component tagging can be performed effectively and efficiently. Moreover, retrieval experiments using the system of the present invention strongly indicate that such structural decomposition can facilitate better representation of users' information needs and hence more effective retrieval of tables.
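The idea of tagging table components and then querying by field can be illustrated with a minimal sketch. It is not the method claimed in the application: the heuristics (cells separated by two or more spaces, a caption line starting with "Table", a digit-free first table line treated as the column heading), the tag names, and the functions `split_cells`, `tag_lines`, and `find_rows` are all assumptions made for illustration.

```python
import re

# Assumption: columns in plain text are separated by two or more spaces.
GAP = re.compile(r"\s{2,}")


def split_cells(line):
    """Split a line into cells on runs of two or more whitespace characters."""
    return [c for c in GAP.split(line.strip()) if c]


def tag_lines(lines):
    """Tag each line as CAPTION, COLUMN_HEADING, TABLE_LINE, or TEXT."""
    tags = []
    for i, line in enumerate(lines):
        if len(split_cells(line)) >= 2:  # assumed test for a table line
            prev_in_table = i > 0 and tags[i - 1] in ("COLUMN_HEADING", "TABLE_LINE")
            has_digit = any(ch.isdigit() for ch in line)
            # Assumed heuristic: the first line of a table block that
            # contains no digits is the column heading.
            if not prev_in_table and not has_digit:
                tags.append("COLUMN_HEADING")
            else:
                tags.append("TABLE_LINE")
        elif line.strip().lower().startswith("table"):
            tags.append("CAPTION")
        else:
            tags.append("TEXT")
    return tags


def find_rows(lines, tags, row_heading):
    """Return the cells of table lines whose first cell (row heading) matches."""
    wanted = row_heading.lower()
    return [
        split_cells(line)
        for line, tag in zip(lines, tags)
        if tag == "TABLE_LINE" and split_cells(line)[0].lower() == wanted
    ]


# Hypothetical sample document, invented for this sketch.
SAMPLE = [
    "Shares rose sharply on Monday.",
    "Table 1: Annual results",
    "Company    Revenue    Profit",
    "Acme       10         2",
    "Globex     7          1",
    "Analysts expect further gains.",
]
```

Calling `tag_lines(SAMPLE)` labels the caption, heading, and data rows separately, so a query such as `find_rows(SAMPLE, tag_lines(SAMPLE), "acme")` matches only against row headings rather than against every word in the document.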
Publication number: WO9905623(A1)  Publication date: 1999.02.04
Application number: WO1998US15287  Filing date: 1998.07.23
Applicant: SOVEREIGN HILL SOFTWARE, INC.  Inventors: PYREDDY, PALLAVI; CROFT, W., BRUCE
Classification: G06F17/30; G06K9/20; (IPC1-7): G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Main claim:
Address: