Title of Invention: METHOD AND SYSTEM FOR DISAMBIGUATING INFORMATIONAL OBJECTS
Abstract: The present invention provides a Distinct Author Identification System (“DAIS”) for disambiguating data to discern author entities and link or associate authorships with such author entities. The invention provides powerful disambiguation processes applied across one or more databases to yield a disambiguated authority database of authors. An entire database of publications may be processed by the DAIS to group/link authorships and to identify author entities. The author entities may then be matched or associated with actual authors to establish an authority database of authors. After initial evaluation, the DAIS may be used to reevaluate some or all of the database(s), and/or the authority database established by the DAIS may be used to add or update information. DAIS may use “hierarchical clustering” to link authorships and identify authors based on authorship similarity. DAIS evaluates the likelihood that authorships are from the same author.
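The "hierarchical clustering" mentioned in the abstract can be illustrated with a minimal agglomerative sketch. The abstract does not specify a linkage rule, a similarity measure, or a stopping criterion, so the average-linkage rule, the `SCORES` table, the `threshold` value, and all function names below are assumptions made for illustration only.

```python
from itertools import combinations


def average_linkage(cluster_a, cluster_b, sim):
    """Mean pairwise similarity between two clusters of authorship ids."""
    total = sum(sim(a, b) for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def hierarchical_cluster(items, sim, threshold):
    """Agglomerative clustering: repeatedly merge the most similar pair
    of clusters until no pair reaches the (assumed) threshold."""
    clusters = [[item] for item in items]
    while len(clusters) > 1:
        best_score, (i, j) = max(
            (average_linkage(clusters[i], clusters[j], sim), (i, j))
            for i, j in combinations(range(len(clusters)), 2)
        )
        if best_score < threshold:
            break
        # i < j always holds, so deleting index j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical pairwise authorship similarities (not taken from the patent).
SCORES = {
    frozenset(("A1", "A2")): 0.9,
    frozenset(("A1", "A3")): 0.7,
    frozenset(("A2", "A3")): 0.8,
    frozenset(("A4", "A5")): 0.85,
}


def sim(a, b):
    """Look up a pairwise score; unlisted pairs count as dissimilar."""
    return SCORES.get(frozenset((a, b)), 0.0)


clusters = hierarchical_cluster(["A1", "A2", "A3", "A4", "A5"], sim, threshold=0.5)
print(clusters)
```

With these made-up scores, authorships A1 to A3 end up in one cluster and A4 and A5 in another; each final cluster would correspond to one candidate author entity.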
Publication No.: US2016196332(A1) Publication Date: 2016.07.07
Application No.: US201514936646 Filing Date: 2015.11.09
Applicant: Thomson Reuters Global Resources Inventor: Griffith Robert A.
Classification: G06F17/30 Main Classification: G06F17/30
Agency: Agent:
Main Claim: 1. A content management system in communication with one or more publications databases, each comprising a plurality of publications, and with a plurality of remote users, the content management system comprising: a disambiguation computer; a disambiguation database operatively connected to the disambiguation computer and adapted to receive and store for processing by the disambiguation computer at least a first set of information derived from one or more publications databases each comprising a plurality of publications with each publication having at least one cited reference and one or more authorships; an authorship similarity routine executing on the disambiguation computer and adapted to process at least some of the first set of electronic information based on cited reference data from the plurality of publications to determine a degree of authorship similarity; a linking routine executing on the disambiguation computer and adapted to link authorships based on the degree of authorship similarity; and a clustering routine executing on the disambiguation computer and adapted to cluster two or more linked authorships to form a first cluster and adapted to form a first author entity associated with the first cluster, whereby the clustering routine is executed to produce an authority database of authors operatively stored on the disambiguation database and comprised of a plurality of unique author entities each associated with a unique actual author and a cluster.
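The three routines recited in the claim (authorship similarity from cited reference data, linking, and clustering into author entities) can be sketched as a small pipeline. The claim does not name a similarity measure or a threshold, so the Jaccard overlap of cited-reference sets, the 0.4 cutoff, the union-find grouping, and every identifier and sample record below are assumptions for illustration, not the patented method. Matching each entity to an actual author is outside this sketch.

```python
from itertools import combinations


def jaccard(refs_a, refs_b):
    """Assumed similarity measure: overlap of two cited-reference sets."""
    union = refs_a | refs_b
    return len(refs_a & refs_b) / len(union) if union else 0.0


def link_authorships(authorships, threshold):
    """Linking step: pairs of authorship ids whose similarity reaches threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(authorships), 2)
        if jaccard(authorships[a], authorships[b]) >= threshold
    ]


def cluster_links(ids, links):
    """Clustering step: group transitively linked authorships (union-find)."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    clusters = {}
    for i in sorted(ids):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical authorships: id -> set of cited references of the publication.
authorships = {
    "pub1:Smith J": {"ref1", "ref2", "ref3"},
    "pub2:Smith J": {"ref2", "ref3", "ref4"},
    "pub3:Smith J": {"ref8", "ref9"},
    "pub4:Smith JA": {"ref3", "ref4", "ref5"},
}

links = link_authorships(authorships, threshold=0.4)

# Toy "authority database": one author entity per cluster of authorships.
authority = {
    f"author_entity_{n}": members
    for n, members in enumerate(cluster_links(authorships, links), start=1)
}
print(authority)
```

In this toy data, pub1 and pub4 are not linked directly (similarity 0.2), yet they fall into the same author entity because both link to pub2; pub3 shares no cited references and forms its own entity.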
Address: Baar CH