Title of Invention |
Discovering Title Information for Structured Data in a Document |
Abstract |
A method, system, and computer program product for discovering title information for structured data in a document are provided in the illustrative embodiments. An instance of structured data is identified in a document. A search direction is identified relative to a location of the instance, wherein a title describing the instance is located in a document portion in the search direction from the instance. A sentence is selected in the document portion. A determination is made whether the selected sentence qualifies as a title by determining whether an independent clause in the selected sentence includes a verb-phrase. Responsive to the selected sentence qualifying as the title, the selected sentence is designated as a candidate title for the instance. |
Application Publication Number |
US2014244676(A1) |
Application Publication Date |
2014.08.28 |
Application Number |
US201313778901 |
Filing Date |
2013.02.27 |
Applicant |
INTERNATIONAL BUSINESS MACHINES CORPORATION |
Inventors |
Byron Donna Karen; Pikovsky Alexander; Sanchez Matthew B. |
Classification Number |
G06F17/30 |
Main Classification Number |
G06F17/30 |
Patent Agency |
|
Patent Agent |
|
Principal Claim |
1. A method for discovering title information in a document, the method comprising:
identifying an instance of structured data in a document;
identifying a search direction relative to a location of the instance, wherein a title describing the instance is located in a document portion in the search direction from the instance;
selecting a sentence in the document portion;
determining whether the selected sentence qualifies as a title by determining whether an independent clause in the selected sentence includes a verb-phrase; and
designating, responsive to the selected sentence qualifying as the title, the selected sentence as a candidate title for the instance. |
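The claimed steps can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The record does not say which outcome of the verb-phrase test makes a sentence a title, so the sketch assumes that a sentence with no verb phrase (a bare noun phrase such as a caption) qualifies. The `Table` type, the `VERBS` word list, and all function names are hypothetical. The word list is a crude stand-in for a real syntactic parser, and the sketch tests the whole sentence rather than isolating its independent clause.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical toy lexicon; a real system would use a syntactic parser
# to detect a verb phrase inside the independent clause.
VERBS = {"is", "are", "was", "were", "shows", "show", "lists",
         "increased", "decreased", "contains", "has", "have"}


@dataclass
class Table:
    """Hypothetical stand-in for one instance of structured data."""
    rows: List[List[str]]


Block = Union[str, Table]


def find_structured_instances(blocks: List[Block]) -> List[int]:
    """Step 1: return the positions of structured-data instances."""
    return [i for i, b in enumerate(blocks) if isinstance(b, Table)]


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def has_verb_phrase(sentence: str) -> bool:
    """Crude check: does any word appear in the toy verb lexicon?"""
    words = re.findall(r"[A-Za-z]+", sentence.lower())
    return any(w in VERBS for w in words)


def qualifies_as_title(sentence: str) -> bool:
    # Assumption (not stated in the record): no verb phrase means title-like.
    return not has_verb_phrase(sentence)


def find_candidate_title(blocks: List[Block], index: int,
                         direction: int = -1) -> Optional[str]:
    """Steps 2-5: walk from the instance in the search direction
    (-1 = before it, +1 = after it) and return the first sentence
    that qualifies, or None if none does."""
    i = index + direction
    while 0 <= i < len(blocks):
        block = blocks[i]
        if isinstance(block, str):
            sentences = split_sentences(block)
            if direction < 0:
                # Searching backwards: visit the nearest sentence first.
                sentences = list(reversed(sentences))
            for sentence in sentences:
                if qualifies_as_title(sentence):
                    return sentence
        i += direction
    return None


document: List[Block] = [
    "Sales by product line",
    "The table below shows totals.",
    Table(rows=[["Product", "Total"], ["A", "10"]]),
    "The figures are unaudited.",
]

table_index = find_structured_instances(document)[0]
title_above = find_candidate_title(document, table_index, direction=-1)
title_below = find_candidate_title(document, table_index, direction=+1)
```

In this sample, the sentence directly above the table contains a verb and is skipped, so the search continues to the verbless caption. The only sentence below the table contains a verb, so the downward search finds no candidate.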
Address |
Armonk NY US |