Title of invention: Fast title/summary extraction from long descriptions
Abstract: Techniques are described herein for automatic generation of a title or summary from a long body of text. A grammatical tree representing one or more sentences of the long body of text is generated. One or more nodes from the grammatical tree are selected to be removed. According to one embodiment, a particular node is selected to be removed based on its position in the grammatical tree and its node type, where the node type represents a grammatical element of the sentence. Once the particular node is selected, a branch of the tree is cut at the node. After the branch has been cut, one or more sub-sentences are generated from the remaining nodes in the grammatical tree. The one or more sub-sentences may be returned as a title or summary.
Publication number: US9317595(B2)
Publication date: 2016.04.19
Application number: US201012960828
Filing date: 2010.12.06
Applicant: Yahoo! Inc.
Inventors: Li Xin; Zhao Hongjian
Classification: G06F17/28; G06F17/30; G06F17/27
Main classification: G06F17/28
Agency: Hickman Palermo Becker Bingham LLP
Agent: Hickman Palermo Becker Bingham LLP
Main claim: 1. A method for generating shorter versions of one or more sentences, the method comprising: receiving, by a computing device, the one or more sentences from a text; based on the one or more sentences, generating, by the computing device within a memory of the computing device, a tree comprising a plurality of nodes; wherein the tree represents the one or more sentences; wherein each node, of the plurality of nodes, represents a grammatical element of the one or more sentences; using a named entity recognizer, automatically recognizing which nodes, of the plurality of nodes, represent grammatical elements that correspond to recognized named entities; selecting a particular node from the tree based, at least in part, on (a) which nodes in the tree correspond to recognized named entities, (b) a position of the particular node within the tree, and (c) a node type of the particular node; wherein the particular node has no children within the tree that correspond to recognized named entities identified by the named entity recognizer; after said selecting, modifying the tree within the memory of the computing device by removing the particular node and children of the particular node from the tree; after removing, from the tree, the particular node and the children of the particular node, generating, from remaining nodes of the tree, a first set of one or more sub-sentences; wherein the first set of one or more sub-sentences are shorter in length than said one or more sentences; causing display of at least one sub-sentence generated from remaining nodes of the tree to a user.
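The steps recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Node` class, the `REMOVABLE_TYPES` set, the pre-order selection rule, and the set-lookup stand-in for a named entity recognizer are all assumptions made for illustration; the record itself does not specify a parser, a tag set, or a selection order.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Assumed set of node types that may be cut (hypothetical, not from the patent).
REMOVABLE_TYPES = {"PP", "SBAR", "ADVP"}


@dataclass(eq=False)  # identity comparison, so list.remove() drops the exact node
class Node:
    node_type: str
    word: Optional[str] = None          # set only on leaves
    is_entity: bool = False             # filled in by tag_entities()
    children: List["Node"] = field(default_factory=list)


def leaf(tag: str, word: str) -> Node:
    return Node(tag, word=word)


def phrase(node_type: str, *children: Node) -> Node:
    return Node(node_type, children=list(children))


def tag_entities(node: Node, known_entities: Set[str]) -> None:
    """Stand-in for a named entity recognizer: a plain set lookup on leaf words."""
    if node.word is not None:
        node.is_entity = node.word in known_entities
    for child in node.children:
        tag_entities(child, known_entities)


def contains_entity(node: Node) -> bool:
    return node.is_entity or any(contains_entity(c) for c in node.children)


def select_node(root: Node) -> Optional[Tuple[Node, Node]]:
    """Return (parent, node) for the first non-root node, in pre-order, whose
    type is removable and whose subtree holds no recognized named entity."""
    stack = [(root, child) for child in reversed(root.children)]
    while stack:
        parent, node = stack.pop()
        if node.node_type in REMOVABLE_TYPES and not contains_entity(node):
            return parent, node
        stack.extend((node, child) for child in reversed(node.children))
    return None


def shorten(root: Node) -> bool:
    """Cut one branch (the selected node and its children). False if none qualifies."""
    found = select_node(root)
    if found is None:
        return False
    parent, node = found
    parent.children.remove(node)
    return True


def to_sentence(node: Node) -> str:
    """Regenerate text from the leaves that remain in the tree."""
    if node.word is not None:
        return node.word
    return " ".join(to_sentence(child) for child in node.children)


def build_example() -> Node:
    # Hand-built tree for a made-up sentence; a real system would use a parser.
    return phrase(
        "S",
        phrase("NP", leaf("NNP", "Yahoo")),
        phrase(
            "VP",
            leaf("VBD", "released"),
            phrase("NP", leaf("DT", "a"), leaf("JJ", "new"),
                   leaf("NN", "search"), leaf("NN", "tool")),
            phrase("PP", leaf("IN", "in"), phrase("NP", leaf("NNP", "Sunnyvale"))),
            phrase("PP", leaf("IN", "after"),
                   phrase("NP", leaf("NNS", "months"), leaf("IN", "of"),
                          leaf("NN", "testing"))),
        ),
    )


if __name__ == "__main__":
    tree = build_example()
    tag_entities(tree, {"Yahoo", "Sunnyvale"})
    shorten(tree)
    print(to_sentence(tree))  # Yahoo released a new search tool in Sunnyvale
```

In this sketch the phrase "in Sunnyvale" survives because its subtree contains a recognized named entity, while "after months of testing" is cut. Calling `shorten` repeatedly until it returns `False` would yield progressively shorter candidates, which is one plausible way to obtain a set of sub-sentences.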
Address: Sunnyvale, CA, US