Title of invention NAMED ENTITY RECOGNITION METHODS AND APPARATUS
Abstract A method of recognising named entities in a text-containing document, represented by text document data, is disclosed. The received text document data comprises a plurality of tokens, one or more of which are part of a plurality of entities. The text document data is analysed using one or more tagging modules operable to determine token label data in respect of at least the tokens which are part of a plurality of entities, wherein the token label data output by the one or more tagging modules comprises data representative of the location of the token within each of a plurality of entities. This token label data is used to determine the beginning and end of the entities which have been identified in the text document data. A plurality of tagging modules may be employed, each adapted to determine token label data representative of the location of a token within a different subset of the entities represented by the text document data, wherein the token label data determined by the plurality of tagging modules together is representative of the location of the said token within a plurality of entities. A single tagging module may be employed which determines a compound tag selected from a group of compound tags, the group of compound tags including different tags in respect of a plurality of different combinations of the location of a respective token within a plurality of entities.
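The two arrangements described in the abstract (several tagging modules, or one module emitting compound tags) can be illustrated with a minimal sketch. The abstract does not specify a tag encoding, so everything concrete here is an assumption made for illustration: the BIO scheme (`B-`, `I-`, `O`) as the representation of a token's location within an entity, the `|` separator for compound tags, and the hypothetical helper names `decode_bio`, `split_compound` and `decode_nested`. The tagger outputs are hard-coded; no actual tagging model is shown.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def decode_bio(tags: List[str]) -> List[Span]:
    """Recover entity boundaries from one layer of BIO tags (assumed scheme)."""
    spans: List[Span] = []
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = (
            tag.startswith("I-") and start is not None and tag[2:] == etype
        )
        if continues:
            continue
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        # An "I-" tag with no open entity of that type is ignored in this sketch.
    return spans


def split_compound(compound_tags: List[str], sep: str = "|") -> List[List[str]]:
    """Split per-token compound tags into one tag sequence per nesting layer."""
    rows = [tag.split(sep) for tag in compound_tags]
    return [list(layer) for layer in zip(*rows)]


def decode_nested(layers: List[List[str]]) -> List[Span]:
    """Combine the spans decoded from every layer into one sorted list."""
    spans: List[Span] = []
    for layer in layers:
        spans.extend(decode_bio(layer))
    return sorted(spans)


tokens = ["University", "of", "Edinburgh", "researchers"]

# Arrangement 1: two tagging modules, each covering a different subset of entities.
outer_tags = ["B-ORG", "I-ORG", "I-ORG", "O"]
inner_tags = ["O", "O", "B-LOC", "O"]
spans_from_modules = decode_nested([outer_tags, inner_tags])

# Arrangement 2: one tagging module emitting a compound tag per token.
compound_tags = ["B-ORG|O", "I-ORG|O", "I-ORG|B-LOC", "O|O"]
spans_from_compound = decode_nested(split_compound(compound_tags))

for start, end, etype in spans_from_compound:
    print(etype, " ".join(tokens[start:end]))
```

In this sketch both arrangements yield the same nested spans: the location entity "Edinburgh" sits inside the organisation entity "University of Edinburgh", and the third token carries a location within each of the two entities.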
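Publication number US2009249182(A1) Publication date 2009.10.01
Application number US20080059247 Filing date 2008.03.31
Applicant ITI SCOTLAND LIMITED Inventors SYMINGTON BEATRICE;HADDOW BARRY
Classification G06F17/21 Main classification G06F17/21
Agency Agent
Principal claim
Address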