Title of invention: Method and system for disambiguating informational objects
Abstract: The present invention provides a Distinct Author Identification System (“DAIS”) for disambiguating data to discern author entities and link or associate authorships with such author entities. The invention provides powerful disambiguation processes applied across one or more databases to yield a disambiguated authority database of authors. An entire database of publications may be processed by the DAIS to group/link authorships and to identify author entities. The author entities may then be matched or associated with actual authors to establish an authority database of authors. After initial evaluation, the DAIS may be used to reevaluate some or all of the database(s) and/or the authority database established by the DAIS may be used to add or update information. DAIS may use “hierarchical clustering” to link authorships and identify authors based on authorship similarity. DAIS evaluates the likelihood that authorships are from the same author.
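The hierarchical clustering mentioned in the abstract can be illustrated with a minimal sketch. This is not the patented DAIS procedure: the feature sets, the Jaccard similarity measure, the average-linkage rule, the 0.3 threshold, and the names `jaccard` and `cluster_authorships` are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_authorships(authorships, threshold=0.3):
    """Average-linkage agglomerative clustering of authorship records.

    Each record is a dict with a hypothetical "features" set. Clusters are
    merged greedily until the best remaining pair falls below `threshold`.
    Returns a list of clusters, each a sorted list of record indices.
    """
    clusters = [[i] for i in range(len(authorships))]

    def linkage(c1, c2):
        # Mean pairwise similarity between the members of two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(
            jaccard(authorships[i]["features"], authorships[j]["features"])
            for i, j in pairs
        )
        return total / len(pairs)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[a], clusters[b]) < threshold:
            break  # no remaining pair is similar enough to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b

    return [sorted(c) for c in clusters]


# Illustrative (invented) authorships that share the same name string.
authorships = [
    {"name": "Smith J", "features": {"coauthor:lee", "journal:nature", "topic:genomics"}},
    {"name": "Smith J", "features": {"coauthor:lee", "journal:cell", "topic:genomics"}},
    {"name": "Smith J", "features": {"coauthor:park", "journal:prl", "topic:optics"}},
]
author_entities = cluster_authorships(authorships)
```

With this toy data the first two authorships share two of four distinct features and are grouped as one author entity, while the third shares nothing with them and stays separate.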
Publication number: US9183290(B2); Publication date: 2015.11.10
Application number: US201113118390; Filing date: 2011.05.28
Applicant: Thomson Reuters Global Resources; Inventor: Griffith Robert A.
Classification: G06F17/30; Main classification: G06F17/30
Agency: Valenti, Hanley & Robinson, PLLC; Agents: Valenti, Hanley & Robinson, PLLC; Duncan Kevin T.
Main claim: 1. A computer-implemented method comprising: a. receiving by a computer a set of electronic information associated with a set of publications, each publication in the set of publications comprising at least one cited reference to an other publication and having at least one authorship relating to the other publication; b. comparing by a computer at least a portion of the set of electronic information with authorship data contained in an authority database, the authorship data related to authorship entities represented in the authority database; c. linking the at least one authorship to the one or more authorship entities based on determining an authorship similarity between the at least one authorship and the one or more authorship entities; and d. associating by a computer the set of electronic information with one or more authorship entities having been previously defined at least in part using a disambiguation process and previously stored in the authority database, wherein the set of electronic information is received subsequent to the disambiguation and storing process, the at least one authorship being linked to a previously defined cluster of authorships.
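Steps b through d of the claim (compare a newly received authorship against the authority database, then link it to a previously defined cluster) might be sketched as follows. The claim does not specify a similarity measure or data layout, so the dictionary-based authority database, the Jaccard score, the 0.3 threshold, and the name `link_to_authority` are hypothetical.

```python
def link_to_authority(new_features, authority, threshold=0.3):
    """Return the id of the best-matching authorship entity, or None.

    `authority` maps a hypothetical entity id to the pooled feature set of
    its previously clustered authorships. Similarity is Jaccard overlap,
    an assumed stand-in for the claim's "authorship similarity".
    """
    best_id, best_score = None, 0.0
    for entity_id, entity_features in authority.items():
        union = new_features | entity_features
        score = len(new_features & entity_features) / len(union) if union else 0.0
        if score > best_score:
            best_id, best_score = entity_id, score
    # Link only when the best match is similar enough.
    return best_id if best_score >= threshold else None


# Invented authority database built by an earlier disambiguation run.
authority = {
    "author-001": {"coauthor:lee", "topic:genomics", "journal:nature", "journal:cell"},
    "author-002": {"coauthor:park", "topic:optics", "journal:prl"},
}

# An authorship received after the authority database was stored.
linked = link_to_authority(
    {"coauthor:lee", "topic:genomics", "journal:science"}, authority
)
unlinked = link_to_authority({"coauthor:kim", "topic:botany"}, authority)
```

Here the first authorship shares two of five distinct features with `author-001` and is linked to that entity, while the second matches no stored entity and is left unlinked, which in a fuller system could trigger creation of a new entity.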
Address: CN