Principal claim |
1. A computer-implemented method of constructing a grammar comprising:
training, by a computing device, a classifier using a confirmed annotation, the training comprising updating a corpus of the classifier with a feature extracted from the confirmed annotation, the feature comprising a hypernym of the confirmed annotation and at least one word of the confirmed annotation, the at least one word being adjacent to the hypernym in the confirmed annotation, and wherein extracting the feature comprises:
    obtaining a concatenated hypernym by concatenating at least two hypernyms of the confirmed annotation,
    obtaining a sequence comprising the concatenated hypernym, and
    extracting the hypernym from a substring of the confirmed annotation corresponding to the sequence;
selecting, by a computing device, a digital text sample to annotate;
transforming, by the computing device, the text sample into a set of annotation candidates;
scoring, by the computing device, the set of annotation candidates using the classifier to obtain a set of annotation scores respectively for the set of annotation candidates;
selecting, by the computing device, one of the annotation candidates in the set of annotation candidates as a suggested annotation for the text sample based on the set of annotation scores;
deriving, by the computing device, an annotation-derived grammar rule based on the suggested annotation; and
configuring, by the computing device, a digital grammar to include the annotation-derived grammar rule. |
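
For illustration only, the claimed steps can be sketched as a toy Python pipeline. Everything in it is an assumption made for the example and is not taken from the claim: the token representation (plain words and `(hypernym, text)` pairs), the hypothetical names `extract_features`, `CountClassifier`, `suggest`, and `derive_rule`, the simple count-based scoring, and the hand-written candidates that stand in for the transformation step.

```python
from collections import Counter

# Assumed representation: an annotation is a list of tokens, where a token is
# either a plain word (str) or a (hypernym, surface_text) tuple.


def extract_features(annotation):
    """Pair each hypernym with its adjacent words or adjacent hypernym."""
    labels = [t[0] if isinstance(t, tuple) else None for t in annotation]
    features = []
    for i, label in enumerate(labels):
        if label is None:
            continue  # plain word, not a hypernym
        # Hypernym plus the plain word immediately before it.
        if i > 0 and labels[i - 1] is None:
            features.append(f"{annotation[i - 1]}_{label}")
        # Hypernym plus the plain word immediately after it.
        if i + 1 < len(annotation) and labels[i + 1] is None:
            features.append(f"{label}_{annotation[i + 1]}")
        # Two neighbouring hypernyms concatenated into one feature.
        if i + 1 < len(annotation) and labels[i + 1] is not None:
            features.append(f"{label}+{labels[i + 1]}")
    return features


class CountClassifier:
    """Toy classifier whose 'corpus' is a table of feature counts."""

    def __init__(self):
        self.corpus = Counter()

    def train(self, confirmed_annotation):
        # Update the corpus with features from a confirmed annotation.
        self.corpus.update(extract_features(confirmed_annotation))

    def score(self, candidate):
        # Sum the corpus counts of the candidate's features.
        return sum(self.corpus[f] for f in extract_features(candidate))


def suggest(classifier, candidates):
    """Score every candidate and return the best one with all scores."""
    scores = [classifier.score(c) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


def derive_rule(annotation):
    """Replace annotated spans by <HYPERNYM> slots to form a rule string."""
    return " ".join(f"<{t[0]}>" if isinstance(t, tuple) else t for t in annotation)


clf = CountClassifier()
clf.train(["fly", "to", ("CITY", "boston"), "on", ("DAY", "monday")])

# Hand-written candidates for the text sample "fly to paris on friday".
candidates = [
    ["fly", "to", ("CITY", "paris"), "on", ("DAY", "friday")],
    ["fly", "to", ("DAY", "paris"), "on", ("CITY", "friday")],
]
suggested, scores = suggest(clf, candidates)

grammar = set()                      # stand-in for the digital grammar
grammar.add(derive_rule(suggested))  # configure it with the derived rule
```

Under these assumptions the first candidate scores 3 and the second scores 0, so the rule added to `grammar` is `fly to <CITY> on <DAY>`.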