Abstract |
PROBLEM TO BE SOLVED: To appropriately set the threshold value used to determine whether events expressed by a plurality of documents are identical. SOLUTION: An event identity determination device 100 stores event information that specifies events in an event DB 102, and stores document information on the electronic documents from which the events were extracted in a document DB 103. A threshold value determination part 104 refers to the DBs 102 and 103 in advance, calculates statistical data from the collection of event information and the collection of document information, determines a threshold value for the similarity between electronic documents, and stores that value in a threshold value storage part 105. An identity determination part 101 refers to the event DB 102 and reads out the event information to be judged. Based on the read-out event information, it reads the corresponding electronic documents from the document DB 103, calculates the similarity between them, compares the calculated similarity with the threshold value in the threshold value storage part 105, and determines the identity of the electronic documents. |
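
The two-stage flow described above (a threshold derived in advance from corpus statistics, then a per-pair comparison against it) can be illustrated with a minimal sketch. The abstract does not name the similarity measure or the statistic, so everything concrete here is an assumption made for illustration: bag-of-words cosine similarity, a threshold of mean plus `k` standard deviations over all pairwise similarities, an event DB that maps an event ID to its source document ID, and the hypothetical function names `cosine_similarity`, `determine_threshold`, and `same_event`.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Bag-of-words cosine similarity (assumed measure, not from the abstract)."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def determine_threshold(document_db: dict[str, str], k: float = 1.0) -> float:
    """Role of part 104: derive a threshold from corpus statistics in advance.

    Assumed statistic: mean + k * population standard deviation of all
    pairwise document similarities.
    """
    if len(document_db) < 2:
        raise ValueError("need at least two documents to compute statistics")
    sims = [cosine_similarity(a, b)
            for a, b in combinations(document_db.values(), 2)]
    return mean(sims) + k * pstdev(sims)


def same_event(event_a: str, event_b: str,
               event_db: dict[str, str], document_db: dict[str, str],
               threshold: float) -> bool:
    """Role of part 101: look up source documents and compare to the threshold."""
    doc_a = document_db[event_db[event_a]]  # event ID -> document ID -> text
    doc_b = document_db[event_db[event_b]]
    return cosine_similarity(doc_a, doc_b) >= threshold
```

In this sketch the stored threshold plays the role of the threshold value storage part 105: it is computed once with `determine_threshold` and then passed to each `same_event` call, so the judgment step does not recompute statistics.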