摘要 |
A system and method are disclosed for extracting information from text which can be performed without prior knowledge as to whether the text includes a list. The method applies parser rules to a sentence spanning lines of text to identify a set of candidate list items in the sentence. Each candidate list item is assigned a set of features including one or more non-linguistic feature and a linguistic feature. The linguistic feature defines a syntactic function of an element of the candidate list item that is able to be in a dependency relation with an element of an identified candidate list introducer in the same sentence. When two or more candidate list items are found with compatible sets of features, a list is generated which links these as list items of a common list introducer. Dependency relations are extracted between the list introducer and list items and information based on the extracted dependency relations is output. |