Abstract |
PROBLEM TO BE SOLVED: To discriminate with high precision whether pieces of event information extracted from a group of documents are consistent, when extracting from that group of documents a set of pieces of event information concerning an identical event. SOLUTION: An event information extraction system 1, which discriminates the consistency of plural pieces of event information extracted from a group of documents, includes: a feature vector production unit 20 that produces feature vectors for the event name, the site of the event, the date of the event, and the contents of the document, on the basis of event information (containing the event name, site, and date extracted from the group of documents) and the contents of the document disclosing that event information; a similarity calculation unit 30 that calculates, for each pair among all pieces of event information extracted from the group of documents, a similarity between the two pieces of event information on the basis of the similarities of their feature vectors for the event name, site, date, and document contents; and a consistency discrimination calculation unit 40 that discriminates a pair of pieces of event information whose similarity exceeds a threshold as an identical event and adds that event to a set of identical events. SELECTED DRAWING: Figure 1 |
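
The three units described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how the feature vectors are built, how the four per-field similarities are combined, or how matched pairs are merged into sets, so everything concrete here is an assumption made for illustration: bag-of-words vectors, cosine similarity, an equal-weight average, a threshold of 0.5, and a union-find merge. The names `EventInfo`, `feature_vectors`, `event_similarity`, and `group_identical_events` are hypothetical and do not come from the patent.

```python
import math
from collections import Counter
from dataclasses import dataclass

FIELDS = ("name", "site", "date", "document")


@dataclass(frozen=True)
class EventInfo:
    """One piece of event information plus the text of its source document."""
    name: str
    site: str
    date: str
    document: str


def bag_of_words(text):
    # Assumption: whitespace-tokenized, lower-cased term counts as the vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity of two sparse count vectors; 0.0 if either is empty.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def feature_vectors(event):
    # Role of the feature vector production unit: one vector per field.
    return {f: bag_of_words(getattr(event, f)) for f in FIELDS}


def event_similarity(x, y, weights=None):
    # Role of the similarity calculation unit. Assumption: weighted average
    # of the per-field similarities, with equal weights by default.
    weights = weights or {f: 1.0 / len(FIELDS) for f in FIELDS}
    vx, vy = feature_vectors(x), feature_vectors(y)
    return sum(weights[f] * cosine(vx[f], vy[f]) for f in FIELDS)


def group_identical_events(events, threshold=0.5):
    # Role of the consistency discrimination unit: pairs whose similarity
    # exceeds the threshold are treated as the same event.
    # Assumption: matched pairs are merged transitively with union-find.
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            if event_similarity(events[i], events[j]) > threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(events)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up sample data, for illustration only.
events = [
    EventInfo("Summer Jazz Festival", "Yokohama Arena", "2012-08-10",
              "The Summer Jazz Festival returns to Yokohama Arena"),
    EventInfo("Summer Jazz Festival 2012", "Yokohama Arena", "2012-08-10",
              "Jazz festival at Yokohama Arena this summer"),
    EventInfo("Autumn Food Fair", "Osaka Dome", "2012-10-05",
              "Food stalls gather at Osaka Dome"),
]
print(group_identical_events(events))  # [[0, 1], [2]]
```

In this sketch the first two records share a site and date and have overlapping names and document text, so their combined similarity exceeds the assumed threshold and they fall into one set; the third record stays on its own. A practical system would likely use field-specific comparisons (for example, date distance rather than token overlap), which the abstract leaves open.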