Title of invention |
Systems, methods, and software for assessing ambiguity of medical terms |
Abstract |
Some known medical terms may function as non-medical terms depending on their particular context. Accordingly, the present inventors devised systems, methods, and software that facilitate determining whether a term that is found in a medical corpus is likely to be a medical term when found in another corpus. An exemplary embodiment receives a term and computes an ambiguity score based on language models for a medical and a non-medical corpus. |
Publication number |
US9317601(B2) |
Publication date |
2016.04.19 |
Application number |
US200611538583 |
Filing date |
2006.10.04 |
Applicant |
Thomson Reuters Global Resources |
Inventors |
Dozier Christopher C.;Chaudhary Mark;Kondadadi Ravi |
Classification codes |
G06F17/30;G06F17/27 |
Main classification code |
G06F17/30 |
Agency |
Egan Greenwald, PLLC |
Agents |
Galloway Duncan;Egan Greenwald, PLLC;Duncan Kevin T. |
Principal claim |
1. A computer-implemented method comprising:
receiving a term;
determining by the computer an ambiguity score for the term, wherein the ambiguity score is based on a ratio of a probability of the term and at least first, second and third language models of a plurality of language models, and wherein the first language model is based on a medical corpus of documents, the second language model is based on a general news corpus of documents, and the third language model is based on a legal corpus of documents,
wherein the ambiguity score for the term is determined using the function:
$$S_{t_n} = \lambda_1 \frac{\log(P(t_n \mid M_2))}{\log(P(t_n \mid M_1))} + \lambda_2 \frac{\log(P(t_n \mid M_3))}{\log(P(t_n \mid M_1))}$$
where $S_{t_n}$ is the ambiguity score for term $t_n$, $\lambda_1$ is a first constant, $\lambda_2$ is a second constant, $P$ is a function of probability, $M_1$ is the first language model, $M_2$ is the second language model, and $M_3$ is a third language model; and
outputting by the computer the ambiguity score for the term, wherein the ambiguity score for the term is outputted as a ranked list, with each score associated with corresponding terms. |
地址 |
CH |
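
The scoring function recited in claim 1 can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the patent: the add-one smoothed unigram models standing in for M1, M2 and M3, the toy corpora, the default values of 0.5 for both constants, the ascending sort order, and the names `UnigramModel`, `ambiguity_score` and `rank_terms`. The claim does not specify how the language models are built or in which direction the list is ranked.

```python
import math
from collections import Counter


class UnigramModel:
    """Add-one smoothed unigram model (illustrative stand-in for M1, M2, M3)."""

    def __init__(self, tokens):
        tokens = list(tokens)
        if not tokens:
            raise ValueError("corpus must contain at least one token")
        self.counts = Counter(tokens)
        self.total = len(tokens)
        # Reserve one extra vocabulary slot for unseen terms.
        self.vocab_size = len(self.counts) + 1

    def log_prob(self, term):
        # Smoothing keeps the probability strictly between 0 and 1,
        # so the logarithm is finite and never zero.
        return math.log((self.counts[term] + 1) / (self.total + self.vocab_size))


def ambiguity_score(term, m1, m2, m3, lam1=0.5, lam2=0.5):
    """S = lam1 * log P(t|M2) / log P(t|M1) + lam2 * log P(t|M3) / log P(t|M1)."""
    denom = m1.log_prob(term)
    return lam1 * m2.log_prob(term) / denom + lam2 * m3.log_prob(term) / denom


def rank_terms(terms, m1, m2, m3, lam1=0.5, lam2=0.5):
    """Return (term, score) pairs sorted by score; ascending order is an assumption."""
    scored = [(t, ambiguity_score(t, m1, m2, m3, lam1, lam2)) for t in terms]
    return sorted(scored, key=lambda pair: pair[1])


# Toy corpora (hypothetical): medical (M1), general news (M2), legal (M3).
medical = UnigramModel("the patient had a cold and a fever the cold persisted".split())
news = UnigramModel("a cold front brought cold weather to the city".split())
legal = UnigramModel("the court heard the case and the court ruled".split())

ranked = rank_terms(["cold", "fever"], medical, news, legal)
for term, score in ranked:
    print(f"{term}\t{score:.4f}")
```

Because log-probabilities are negative, each ratio falls below 1 when the term is more probable under the non-medical model than under the medical one, and rises above 1 in the opposite case. Whether a low or a high score should be read as "more ambiguous" depends on conventions that this record does not state.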