摘要 |
A phrase identification system and method are provided. The method comprises: identifying one or more phrase candidates in the electronic document; selecting one of the phrase candidates; numerically representing features of the selected phrase candidates to obtain a numeric feature representation associated with that phrase candidate; and inputting the numeric feature representation into a machine learning classifier, the machine learning classifier being configured to determine, based on each numeric feature representation, whether the phrase candidate associated with that numeric feature representation is a phrase. |