Title of invention |
Unsupervised extraction of facts |
Abstract |
A system and method for extracting facts from documents. A fact is extracted from a first document. The attribute and value of the fact extracted from the first document are used as a seed attribute-value pair. A second document containing the seed attribute-value pair is analyzed to determine a contextual pattern used in the second document. The contextual pattern is used to extract other attribute-value pairs from the second document. The extracted attributes and values are stored as facts. |
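The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `infer_pattern` and `extract_facts`, the line-by-line matching, and the regex-based representation of a "contextual pattern" are all assumptions made for illustration. The sketch locates the seed attribute-value pair in a second document, generalizes the text around it into a pattern, and applies that pattern to the remaining lines.

```python
import re
from typing import List, Optional, Tuple

Fact = Tuple[str, str]  # (attribute, value); hypothetical representation


def infer_pattern(document: str, seed: Fact) -> Optional[re.Pattern]:
    """Find the seed pair on some line and generalize its surrounding context.

    Assumption: the contextual pattern is the literal text before, between,
    and after the attribute and value on a single line.
    """
    attribute, value = seed
    seed_re = re.compile(
        r"^(?P<prefix>.*?)" + re.escape(attribute)
        + r"(?P<sep>.+?)" + re.escape(value)
        + r"(?P<suffix>.*)$"
    )
    for line in document.splitlines():
        m = seed_re.match(line.strip())
        if m:
            # Keep the context literal; replace the seed with capture groups.
            return re.compile(
                "^" + re.escape(m.group("prefix"))
                + r"(?P<attribute>.+?)" + re.escape(m.group("sep"))
                + r"(?P<value>.+?)" + re.escape(m.group("suffix")) + "$"
            )
    return None


def extract_facts(document: str, seed: Fact) -> List[Fact]:
    """Return attribute-value pairs that share the seed's context."""
    pattern = infer_pattern(document, seed)
    if pattern is None:
        return []
    facts = []
    for line in document.splitlines():
        m = pattern.match(line.strip())
        if m:
            fact = (m.group("attribute"), m.group("value"))
            if fact != seed:  # skip the seed itself
                facts.append(fact)
    return facts


# Illustrative input only; the document text and seed are made up.
second_document = "\n".join([
    "<h1>France</h1>",
    "<tr><td>Capital</td><td>Paris</td></tr>",
    "<tr><td>Currency</td><td>Euro</td></tr>",
    "<tr><td>Population</td><td>67 million</td></tr>",
    "<p>Unrelated paragraph.</p>",
])
seed_fact = ("Capital", "Paris")

# The "fact repository" is modeled here as a plain list.
fact_repository = [seed_fact] + extract_facts(second_document, seed_fact)
print(fact_repository)
```

A production system would need a more tolerant pattern representation (for example, one that survives differing whitespace or markup attributes); exact literal context is used here only to keep the sketch short.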
Publication number |
US9558186(B2) |
Publication date |
2017.01.31 |
Application number |
US201414460117 |
Filing date |
2014.08.14 |
Applicant |
Google Inc. |
Inventors |
Betz Jonathan T.;Zhao Shubin |
Classification codes |
G06F17/27;G06F17/30 |
Main classification code |
G06F17/27 |
Patent agency |
Brake Hughes Bellermann LLP |
Patent agent |
Brake Hughes Bellermann LLP |
Principal claim |
1. A computer-implemented method for extracting facts, the method comprising:
at a computer system including one or more processors and memory storing one or more programs, the one or more processors executing the one or more programs to perform the operations of:
identifying a first fact having an attribute and a value obtained from a first document;
retrieving a second document that contains the attribute and the value of the first fact;
identifying in the second document a contextual pattern associated with the attribute and value of the first fact;
extracting a second fact from the second document using the contextual pattern, the second fact having an attribute that is different than the attribute of the first fact and having a value that is different than the value of the first fact; and
storing the first fact and the second fact in a fact repository of the computer system. |
Address |
Mountain View CA US |