Abstract |
Disclosed are a web page classification method and device. The method comprises: establishing a feature word classifier from a web page address sample set, the sample set comprising multiple sample web page addresses and the web page type corresponding to each sample web page address; acquiring a preset number of web page addresses and determining the web page type of each address through the feature word classifier; performing redundancy elimination on the web page addresses whose types have been determined to obtain structural character strings, each structural character string being a web page address structure; storing the web page address structures together with their corresponding web page types; and, when a web page is to be classified, acquiring its web page address, performing redundancy elimination on that address to obtain the corresponding web page address structure, and using that structure to look up the stored web page type. Classification at query time is thus reduced to a lookup on the address structure, which allows web pages to be classified rapidly and efficiently. |
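
The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the feature word classifier works or what redundancy elimination removes, so everything concrete here is an assumption made for illustration: the classifier is a simple word-vote counter, redundancy elimination is taken to mean dropping the scheme, query string, and fragment and replacing digit runs with a `{num}` placeholder, and all function names and the example.com addresses are hypothetical.

```python
import re
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def tokens(url):
    """Split a web page address into lowercase alphabetic feature words."""
    return re.findall(r"[a-z]+", url.lower())


def train_feature_word_classifier(samples):
    """Build a word -> Counter(page type) table from (address, type) samples.

    Assumption: a plain co-occurrence count stands in for the
    feature word classifier, which the abstract does not define.
    """
    word_types = defaultdict(Counter)
    for url, page_type in samples:
        for word in set(tokens(url)):
            word_types[word][page_type] += 1
    return word_types


def classify(word_types, url):
    """Return the page type with the most feature-word votes, or None."""
    votes = Counter()
    for word in set(tokens(url)):
        votes.update(word_types.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None


def url_structure(url):
    """Redundancy elimination (assumed rule): keep host and path only,
    and replace every run of digits with a placeholder."""
    parts = urlsplit(url)
    path = re.sub(r"\d+", "{num}", parts.path)
    return parts.netloc.lower() + path


def build_index(word_types, urls):
    """Classify each address once and store structure -> page type."""
    index = {}
    for url in urls:
        page_type = classify(word_types, url)
        if page_type is not None:
            index[url_structure(url)] = page_type
    return index


def lookup(index, url):
    """Classify a new page by looking up its address structure."""
    return index.get(url_structure(url))
```

Under these assumptions, an index built from a handful of classified addresses lets any later address with the same structure be classified by a single dictionary lookup, without running the classifier again; addresses whose structure has not been stored return `None`.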