Independent claim |
1. A computer-implemented natural language processing method, comprising:
receiving a text;
identifying a set of linguistic characteristics contained in the text, wherein linguistic characteristics include grammatical, syntactic, and idiomatic features of the text;
determining a plurality of locations of origin in which the text was potentially written based on the set of linguistic characteristics;
retrieving a set of reference documents for each location of origin in the plurality of locations of origin, in response to the determining the plurality of locations in which the text was potentially written;
determining a plurality of time periods in which the text was potentially written based on the set of linguistic characteristics;
retrieving a set of reference documents for each time period in the plurality of time periods in response to the determining the plurality of time periods in which the text was potentially written;
producing a set of proximity scores by performing a set of proximity checks using the set of linguistic characteristics, the set of reference documents, and the text, wherein the proximity checks analyze how often and how close linguistic characteristics are to one another;
ranking the plurality of locations of origin based on the set of proximity scores;
ranking the plurality of time periods based on the set of proximity scores; and
returning a set of one or more ranked locations of origin of the plurality of locations of origin and a set of one or more ranked time periods of the plurality of time periods. |
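
The claimed steps can be illustrated with a minimal Python sketch. It is not the claimed implementation: the claim does not specify how characteristics are extracted, how reference documents are stored, or how a proximity score is computed. Everything named here is an assumption made for illustration, including the `lexicon` that maps marker words to candidate locations and periods, the in-memory `location_corpus` and `period_corpus`, the `window` parameter, and the 1/distance scoring rule. Single marker words stand in for the grammatical, syntactic, and idiomatic features, and the proximity check is run only against the reference documents.

```python
import re
from itertools import combinations


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for real linguistic analysis."""
    return re.findall(r"[a-z']+", text.lower())


def identify_characteristics(text, lexicon):
    """Return the marker terms from the (hypothetical) lexicon found in the text."""
    tokens = set(tokenize(text))
    return sorted(term for term in lexicon if term in tokens)


def proximity_score(characteristics, document, window=10):
    """Assumed scoring rule: sum 1/distance over every co-occurrence of two
    distinct characteristics that lie within `window` tokens of each other.
    More frequent and closer co-occurrences give a higher score."""
    tokens = tokenize(document)
    positions = {
        c: [i for i, tok in enumerate(tokens) if tok == c] for c in characteristics
    }
    score = 0.0
    for a, b in combinations(characteristics, 2):
        for i in positions[a]:
            for j in positions[b]:
                distance = abs(i - j)
                if 0 < distance <= window:
                    score += 1.0 / distance
    return score


def rank_candidates(characteristics, references, window=10):
    """Rank candidates (locations or periods) by total score over their
    reference documents, highest first."""
    scores = {
        candidate: sum(proximity_score(characteristics, doc, window) for doc in docs)
        for candidate, docs in references.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def analyze(text, lexicon, location_corpus, period_corpus, window=10):
    """Run the claimed steps in order and return (ranked locations, ranked periods)."""
    characteristics = identify_characteristics(text, lexicon)
    # Candidate locations and periods suggested by the characteristics.
    locations = sorted({loc for c in characteristics for loc in lexicon[c]["locations"]})
    periods = sorted({per for c in characteristics for per in lexicon[c]["periods"]})
    # Retrieve reference documents for each candidate (here: a dict lookup).
    location_refs = {loc: location_corpus.get(loc, []) for loc in locations}
    period_refs = {per: period_corpus.get(per, []) for per in periods}
    return (
        rank_candidates(characteristics, location_refs, window),
        rank_candidates(characteristics, period_refs, window),
    )
```

Calling `analyze(text, lexicon, location_corpus, period_corpus)` returns two lists of `(candidate, score)` pairs, one for locations of origin and one for time periods, each sorted from most to least likely under the assumed scoring rule.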