Title: Systems, methods, and software for assessing ambiguity of medical terms
Abstract: Some known medical terms may function as non-medical terms depending on their particular context. Accordingly, the present inventors devised systems, methods, and software that facilitate determining whether a term that is found in a medical corpus is likely to be a medical term when found in another corpus. An exemplary embodiment receives a term and computes an ambiguity score based on language models for a medical and a non-medical corpus.
Publication number: US9317601(B2) Publication date: 2016.04.19
Application number: US200611538583 Filing date: 2006.10.04
Applicant: Thomson Reuters Global Resources Inventors: Dozier Christopher C.;Chaudhary Mark;Kondadadi Ravi
Classification: G06F17/30;G06F17/27 Main classification: G06F17/30
Agency: Egan Greenwald, PLLC Agent: Galloway Duncan;Egan Greenwald, PLLC;Duncan Kevin T.
Principal claim: 1. A computer-implemented method comprising: receiving a term; determining by the computer an ambiguity score for the term, wherein the ambiguity score is based on a ratio of a probability of the term and at least first, second and third language models of a plurality of language models, and wherein the first language model is based on a medical corpus of documents and the second language model is based on a general news corpus of documents and the third language model is based on a legal corpus of documents, wherein the ambiguity score for the term is determined using the function: $S_{t_n} = \lambda_1 \frac{\log P(t_n \mid M_2)}{\log P(t_n \mid M_1)} + \lambda_2 \frac{\log P(t_n \mid M_3)}{\log P(t_n \mid M_1)}$, where $S_{t_n}$ is the ambiguity score for term $t_n$, $\lambda_1$ is a first constant, $\lambda_2$ is a second constant, $P$ is a function of probability, $M_1$ is the first language model, $M_2$ is the second language model, and $M_3$ is the third language model; and outputting by the computer the ambiguity score for the term, wherein the ambiguity score for the term is outputted as a ranked list, with each score associated with corresponding terms.
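Illustrative note: the scoring function recited in the principal claim can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation. The record does not say how the language models are built, so the unigram model with add-one smoothing, the helper names (`build_unigram_model`, `ambiguity_score`, `rank_terms`), the default weights of 0.5, and the ascending sort order are all hypothetical choices made for illustration. Because each log-probability is negative, a smaller score under this formula means the term is relatively more probable in the news and legal corpora compared with the medical corpus.

```python
import math
from collections import Counter


def build_unigram_model(tokens):
    """Return P(term) for a non-empty token list.

    Assumption: a unigram model with add-one smoothing; the patent record
    does not specify the kind of language model used.
    """
    counts = Counter(tokens)
    total = len(tokens)
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen terms

    def prob(term):
        return (counts[term] + 1) / (total + vocab)

    return prob


def ambiguity_score(term, p_med, p_news, p_legal, lam1=0.5, lam2=0.5):
    """S = lam1*log P(t|M2)/log P(t|M1) + lam2*log P(t|M3)/log P(t|M1).

    M1 = medical, M2 = general news, M3 = legal.
    lam1 and lam2 are placeholder constants (assumed values).
    """
    log_med = math.log(p_med(term))
    return (lam1 * math.log(p_news(term)) / log_med
            + lam2 * math.log(p_legal(term)) / log_med)


def rank_terms(terms, p_med, p_news, p_legal, lam1=0.5, lam2=0.5):
    """Return (term, score) pairs sorted by ascending score (assumed order)."""
    scored = [(t, ambiguity_score(t, p_med, p_news, p_legal, lam1, lam2))
              for t in terms]
    return sorted(scored, key=lambda pair: pair[1])


# Toy corpora, invented purely for illustration.
p_med = build_unigram_model("patient cold fever cold infection".split())
p_news = build_unigram_model("cold weather cold front election".split())
p_legal = build_unigram_model("court cold case ruling appeal".split())

ranked = rank_terms(["infection", "cold"], p_med, p_news, p_legal)
for term, score in ranked:
    print(f"{term}\t{score:.3f}")
```

In this toy setup "cold" appears in all three corpora while "infection" appears only in the medical one, so "cold" receives the smaller score and is listed first.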
Address: CH