Abstract |
A computer-implemented natural language processing method is disclosed. The method includes the steps of: analysing a sentence string within textual information to determine sub-components of the sentence string; assigning one or more unique tokens to each determined sub-component; determining a probability of use that a determined sub-component has one or more specific meanings; based on the determined probability of use, creating a valid set of unique tokens that are associated with the sentence string; and linking verb sub-components associated with one or more of the unique tokens in the valid set of unique tokens to a pre-defined limited sub-set of verbs to create an identification tuple that maps onto the sub-set of verbs. Also disclosed is a natural language processing system including: a text processing module arranged to analyse a sentence string within textual information to determine sub-components of the sentence string; a parsing and semantic processing module arranged to assign one or more unique tokens to each determined sub-component, determine a probability of use that a determined sub-component has one or more specific meanings, and, based on the determined probability of use, create a valid set of unique tokens that are associated with the sentence string; and a lexicon module arranged to contain links for each verb sub-component such that each link associates a verb sub-component with a pre-defined limited sub-set of verbs to enable the parsing and semantic processing module to create an identification tuple that maps onto the sub-set of verbs.
|
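The steps of the disclosed method can be illustrated with a minimal Python sketch. It is not the claimed implementation. The lexicon entries, token names, probabilities, the 0.5 threshold, the core verb labels (`TAKE`, `MOVE`, `REMOVE`) and the shape of the identification tuple are all hypothetical, chosen only to show how the steps might fit together.

```python
# Hypothetical toy lexicon: each word maps to candidate (token, probability) senses.
# All entries and probabilities are illustrative assumptions, not taken from the source.
LEXICON = {
    "the": [("DET_the", 1.0)],
    "dog": [("N_dog_animal", 0.9), ("V_dog_follow", 0.1)],
    "grabbed": [("V_grab", 1.0)],
    "bone": [("N_bone", 0.95), ("V_bone_debone", 0.05)],
    "run": [("V_run_move", 0.4), ("N_run_event", 0.35), ("V_run_operate", 0.25)],
}

# Hypothetical links from verb tokens to a small pre-defined sub-set of verbs.
VERB_LINKS = {
    "V_grab": "TAKE",
    "V_dog_follow": "MOVE",
    "V_run_move": "MOVE",
    "V_bone_debone": "REMOVE",
}


def split_sentence(sentence):
    """Step 1: split a sentence string into word-level sub-components."""
    return sentence.lower().rstrip(".!?").split()


def valid_token_set(words, min_probability=0.5):
    """Steps 2-4: assign candidate tokens, pick the most probable sense for
    each sub-component, and keep it only if it clears the assumed threshold."""
    chosen = []
    for word in words:
        senses = LEXICON.get(word, [("UNK_" + word, 1.0)])
        token, probability = max(senses, key=lambda sense: sense[1])
        if probability >= min_probability:
            chosen.append(token)
    return chosen


def identification_tuple(tokens):
    """Step 5: map verb tokens onto the limited verb sub-set and pair them
    with the remaining tokens (an assumed tuple layout)."""
    core_verbs = tuple(VERB_LINKS[t] for t in tokens if t in VERB_LINKS)
    other_tokens = tuple(t for t in tokens if t not in VERB_LINKS)
    return (core_verbs, other_tokens)


if __name__ == "__main__":
    tokens = valid_token_set(split_sentence("The dog grabbed the bone."))
    print(identification_tuple(tokens))
    # prints (('TAKE',), ('DET_the', 'N_dog_animal', 'DET_the', 'N_bone'))
```

In this sketch the noun sense of "dog" is selected because its assumed probability is higher than the verb sense, so only "grabbed" is linked to a core verb. A real system would presumably score senses in context rather than per word.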