Abstract |
An interactive interface facilitates the conversion of unstructured documents into XML-compliant documents. A document is parsed to identify fact items in its content. A classifier associates initial labels with the identified fact items, and the fact items and their initial labels are forwarded to a user for review and correction. An interface executing on a client computer presents the initial labels and enables the user to correct them. Upon receipt of corrected labels from the user, the classifier is trained to update the probable associations between labels and fact items in accordance with the corrections. The interface also enables the user to enter new labels and/or concepts for a taxonomy, and an extension to the taxonomy is automatically generated. |
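
The review-and-retrain loop described in the abstract can be illustrated with a minimal sketch. Everything named here (`LabelClassifier`, `review_cycle`, the sample fact items and labels) is hypothetical and does not come from the source, and the token-count scoring is only a stand-in for whatever classifier the actual system uses. The sketch shows initial labels being proposed, user corrections being applied, the classifier being trained on those corrections, and a label not yet in the taxonomy being added to it.

```python
from collections import Counter, defaultdict


class LabelClassifier:
    """Toy classifier (an assumption, not the source's method).

    A label's score for a fact item is the number of times the item's
    tokens were previously seen in training text for that label.
    """

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token frequencies
        self.taxonomy = set()                     # known labels

    def train(self, fact_text, label):
        # A label that is not yet known extends the taxonomy automatically.
        self.taxonomy.add(label)
        self.token_counts[label].update(fact_text.lower().split())

    def predict(self, fact_text):
        tokens = fact_text.lower().split()
        best_label, best_score = None, 0
        for label in sorted(self.taxonomy):
            score = sum(self.token_counts[label][t] for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label  # None when no known label matches any token


def review_cycle(classifier, fact_items, corrections):
    """Propose initial labels, apply user corrections, retrain on them."""
    initial = {fact: classifier.predict(fact) for fact in fact_items}
    final = {**initial, **corrections}
    for fact, label in corrections.items():
        classifier.train(fact, label)
    return initial, final


# Usage with hypothetical data: seed the classifier, then run one review round.
clf = LabelClassifier()
clf.train("total revenue 2010", "Revenue")
clf.train("net income 2010", "NetIncome")

facts = ["revenue for quarter", "operating lease expense"]
user_corrections = {"operating lease expense": "LeaseExpense"}  # new label
initial, final = review_cycle(clf, facts, user_corrections)
```

After the round, `LeaseExpense` is part of `clf.taxonomy`, and later fact items containing similar tokens can receive it as an initial label. A real system would also serialize the final labels into the target XML format, which this sketch leaves out.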