Title of invention |
System and method for extraction of factoids from textual repositories |
Abstract |
A method (400) is disclosed for extracting factoids from text repositories, the factoids being associated with a given factoid category. The method (400) starts by training a classifier (230) to recognise factoids relevant to the given factoid category. Documents or document summaries relevant to the given factoid category are next collected (410) from the text repositories. Sentences having a predetermined association with the given factoid category are extracted (420) from the documents or document summaries. Those sentences are classified (440), in a noisy environment, using the classifier (230) to extract snippets containing phrases relevant to the given factoid category. The extracted snippets are the factoids associated with the given factoid category. |
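The steps named in the abstract (train a classifier, collect documents, extract candidate sentences, classify them) can be sketched as a small pipeline. This is only an illustrative sketch under stated assumptions: the abstract does not say which classifier model is used, so a tiny Naive Bayes stands in for classifier (230); the function names, the "acquired" cue term, the training sentences, and the sample repository are all hypothetical; and whole sentences are returned as snippets, which is a simplification of the snippet extraction described.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyNaiveBayes:
    """Hypothetical stand-in for classifier (230); the abstract names no model."""

    def fit(self, sentences, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus Laplace-smoothed log likelihood of known tokens.
            score = math.log(self.class_counts[c] / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for tok in tokenize(sentence):
                if tok in self.vocab:
                    score += math.log((self.word_counts[c][tok] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


def collect_documents(repository, category_terms):
    """Step (410): keep documents that mention any category term."""
    return [d for d in repository if any(t in tokenize(d) for t in category_terms)]


def extract_sentences(documents, cue_terms):
    """Step (420): keep sentences that contain a cue term for the category."""
    sentences = []
    for doc in documents:
        for s in re.split(r"(?<=[.!?])\s+", doc):
            if any(t in tokenize(s) for t in cue_terms):
                sentences.append(s.strip())
    return sentences


def extract_factoids(repository, classifier, category_terms, cue_terms):
    """Step (440): classify candidate sentences and keep the positive ones."""
    docs = collect_documents(repository, category_terms)
    candidates = extract_sentences(docs, cue_terms)
    return [s for s in candidates if classifier.predict(s) == "factoid"]


# Hypothetical training data for an "acquisition" factoid category.
train_sentences = [
    "Acme acquired Widget Corp for 2 billion dollars",
    "Globex acquired Initech in 2004",
    "The company acquired a reputation for slow service",
    "Analysts wonder whether anyone will be acquired this year",
]
train_labels = ["factoid", "factoid", "other", "other"]
classifier = TinyNaiveBayes().fit(train_sentences, train_labels)

# Hypothetical text repository.
repository = [
    "Initrode acquired Hooli for 5 billion dollars. The weather was mild.",
    "The team acquired a reputation for quality. Nothing else happened.",
    "A recipe for soup.",
]
factoids = extract_factoids(repository, classifier, {"acquired"}, {"acquired"})
print(factoids)
```

The keyword filters in the collection and sentence-extraction steps are deliberately crude, so that the classifier is what separates true factoids from sentences that merely share a cue word.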
Publication number |
US2007162447(A1) |
Publication date |
2007.07.12 |
Application number |
US20050321177 |
Filing date |
2005.12.29 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
JOSHI SACHINDRA;KRISHNAPURAM RAGHURAM;KUMAR NIMIT;MEHTA KIRAN;NEGI SUMIT;RAMAKRISHNAN GANESH;HOLMES SCOTT R. |
Classification |
G06F7/00 |
Main classification |
G06F7/00 |
Agency |
|
Agent |
|
Principal claim |
|
Address |
|