Abstract |
PROBLEM TO BE SOLVED: To generate time series data by collecting webpages containing arguments that draw a user's attention, and by associating and arranging the arguments contained in the collected webpages. SOLUTION: This method includes: collecting webpages that match collection conditions designated by a user (S2); dividing the set of collected pages into a plurality of clusters on the basis of the URL information of the pages (S7); extracting date expressions from the pages included in each of the divided clusters and determining a date expression form representing each cluster on the basis of the extracted date expressions (S9, S10); dividing the pages included in each cluster into a plurality of items on the basis of the date expression form determined for that cluster (S11); and rearranging the divided items of each cluster in chronological order on the basis of the date expressions corresponding to the items, to generate time series data for each cluster (S13). COPYRIGHT: (C)2007, JPO&INPIT
|
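The steps S7 to S13 can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how URLs are compared, which date expression forms are recognized, or how the representative form is chosen. The clustering rule (host plus first path segment), the two regular expressions in `DATE_FORMS`, the majority-vote selection, and all function names are assumptions made for illustration. Page collection (S2) is assumed to have already produced `(url, text)` pairs.

```python
import re
from collections import Counter, defaultdict
from datetime import date
from urllib.parse import urlparse

# Hypothetical date expression forms; the abstract does not enumerate any.
DATE_FORMS = {
    "ymd_slash": re.compile(r"(\d{4})/(\d{1,2})/(\d{1,2})"),
    "ymd_dash": re.compile(r"(\d{4})-(\d{1,2})-(\d{1,2})"),
}


def cluster_by_url(pages):
    """S7: group (url, text) pairs by host and first path segment (assumed rule)."""
    clusters = defaultdict(list)
    for url, text in pages:
        parts = urlparse(url)
        first_segment = parts.path.strip("/").split("/")[0]
        clusters[(parts.netloc, first_segment)].append(text)
    return dict(clusters)


def representative_form(texts):
    """S9, S10: pick the form with the most matches in the cluster (assumed rule)."""
    counts = Counter()
    for text in texts:
        for name, pattern in DATE_FORMS.items():
            counts[name] += len(pattern.findall(text))
    name, hits = counts.most_common(1)[0]
    return name if hits > 0 else None


def split_into_items(text, pattern):
    """S11: cut a page at each date match; an item runs up to the next date."""
    matches = list(pattern.finditer(text))
    items = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        year, month, day = (int(g) for g in match.groups())
        try:
            when = date(year, month, day)
        except ValueError:
            continue  # skip strings that look like dates but are not valid
        items.append((when, text[match.end():end].strip()))
    return items


def build_time_series(pages):
    """S7 to S13: return {cluster_key: [(date, item_text), ...]} sorted by date."""
    series = {}
    for key, texts in cluster_by_url(pages).items():
        form = representative_form(texts)
        if form is None:
            continue  # no recognizable date expression in this cluster
        pattern = DATE_FORMS[form]
        items = [item for text in texts for item in split_into_items(text, pattern)]
        series[key] = sorted(items, key=lambda item: item[0])  # S13
    return series
```

Calling `build_time_series` on a list of `(url, text)` pairs returns one chronologically ordered list of dated items per cluster. Using a single representative form per cluster means that stray dates written in another form (for example in page footers) do not split the page, which seems to be the purpose of steps S9 and S10.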