Independent claim |
1. A device, comprising:
one or more processors to:
obtain text of a document to be analyzed to identify glossary terms included in the text;
perform a linguistic unit analysis on a linguistic unit, included in the text, to generate a plurality of ambiguous linguistic units from the linguistic unit;
resolve the plurality of ambiguous linguistic units to generate a set of potential glossary terms that includes a subset of the plurality of ambiguous linguistic units;
perform a glossary term analysis on the set of potential glossary terms to generate a set of glossary terms that includes a subset of the set of potential glossary terms;
identify a set of included terms, of the set of potential glossary terms, that are included in the set of glossary terms;
identify a set of excluded terms, of the set of potential glossary terms, that are excluded from the set of glossary terms;
determine a semantic relatedness score between at least one excluded term, of the set of excluded terms, and at least one included term, of the set of included terms;
selectively add the excluded linguistic term to the set of glossary terms to form a final set of glossary terms based on the semantic relatedness score; and
output the final set of glossary terms for the document.
|
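The last steps of the claim (splitting the potential terms into included and excluded sets, scoring relatedness, and selectively re-adding excluded terms) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names `jaccard_relatedness` and `finalize_glossary`, the word-overlap score used in place of a real semantic relatedness measure, and the `threshold` value of 0.3 are all assumptions made for illustration. The linguistic unit analysis and glossary term analysis are taken as given inputs.

```python
from typing import Callable, List


def jaccard_relatedness(a: str, b: str) -> float:
    """Toy stand-in (assumption) for a semantic relatedness score:
    Jaccard overlap of the lowercase words in the two terms."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def finalize_glossary(
    potential_terms: List[str],
    glossary_terms: List[str],
    relatedness: Callable[[str, str], float] = jaccard_relatedness,
    threshold: float = 0.3,  # hypothetical cutoff, not taken from the claim
) -> List[str]:
    """Form a final glossary by re-adding excluded terms that are
    sufficiently related to at least one included term."""
    glossary = set(glossary_terms)

    # Potential terms that the glossary term analysis kept.
    included = [t for t in potential_terms if t in glossary]
    # Potential terms that the glossary term analysis dropped.
    excluded = [t for t in potential_terms if t not in glossary]

    final_terms = list(included)
    for term in excluded:
        # Best relatedness score against any included term.
        best = max((relatedness(term, inc) for inc in included), default=0.0)
        if best >= threshold:
            final_terms.append(term)  # selectively add the excluded term
    return final_terms


potential = ["user account", "account balance", "login page", "weather"]
glossary = ["user account", "login page"]
final_glossary = finalize_glossary(potential, glossary)
print(final_glossary)
```

In this example "account balance" shares the word "account" with the included term "user account" (overlap 1/3, above the assumed cutoff), so it is added back, while "weather" shares nothing with any included term and stays excluded. A practical system would presumably replace the word-overlap score with an embedding-based or knowledge-based relatedness measure.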