Abstract |
Techniques are provided for automatically detecting similar subsets (e.g., fragments) in electronic documents, such as those containing dynamic content-based data (e.g., web pages). The techniques of the invention may perform a systematic analysis of the web pages with respect to one or more of their information-sharing behavior, their personalization characteristics, and their change pattern over time. Thus, the invention may be applied to discover the fragments in the web pages of a web site that are most beneficial for caching the contents of that web site. The present invention also comprises techniques for publishing electronic documents with automatic fragment detection.
|