Abstract |
<P>PROBLEM TO BE SOLVED: To flexibly generate, at low cost, a large amount of test data for verifying a document file retrieval system. <P>SOLUTION: A test data generation device uses the Monte Carlo method to generate a large amount of test data corresponding to many document files that each contain plural words, that is, data including word frequency and document frequency. The device sets simulated documents 1 to 1000 and simulated words a, b and c, and sets the total word frequency of each word in a region 61A of a table 61. From these totals, it obtains the word frequency of each word in each document and the document frequency of each word, using pseudorandom numbers under the Monte Carlo method. <P>COPYRIGHT: (C)2012,JPO&INPIT |
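
The generation step described in the SOLUTION can be illustrated with a minimal sketch. The abstract does not state how occurrences are distributed over documents, so the sketch assumes each occurrence of a word falls in a document chosen uniformly at random. The function name `generate_test_data`, the seed parameter, and the example totals are hypothetical and are not taken from the patent.

```python
import random
from collections import Counter


def generate_test_data(total_freq, num_docs, seed=0):
    """Distribute each word's total frequency over simulated documents.

    Assumption (not from the abstract): every occurrence of a word is
    assigned to one of the documents 1..num_docs uniformly at random.
    """
    rng = random.Random(seed)  # seeded pseudorandom number generator
    term_freq = {}  # word -> Counter mapping document number to word frequency
    doc_freq = {}   # word -> number of documents containing the word
    for word, total in total_freq.items():
        # Draw one document number per occurrence and tally the draws.
        counts = Counter(rng.randrange(1, num_docs + 1) for _ in range(total))
        term_freq[word] = counts
        # Document frequency is the number of distinct documents drawn.
        doc_freq[word] = len(counts)
    return term_freq, doc_freq


# Hypothetical totals for simulated words a, b and c over documents 1 to 1000.
totals = {"a": 5000, "b": 300, "c": 20}
term_freq, doc_freq = generate_test_data(totals, num_docs=1000)
```

Under this assumption, the per-document word frequencies of each word always sum to the total that was set for it, and the document frequency can never exceed either that total or the number of documents. A different distribution (for example, one skewed toward a few documents) could be substituted by changing only the line that draws the document number.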