Title of invention METHOD AND DEVICE FOR IDENTIFYING CONSISTENCY OF WEB INFORMATION
Abstract An embodiment of the invention discloses a method and a device for identifying the consistency of webpage information. The method includes: acquiring a first kind of webpage information from a database; extracting header information and attribute information from the webpage information and applying word segmentation analysis to both to obtain the attributes of the described object; counting, under the category to which the described object belongs, the attribute values of each attribute and their co-occurrence information; removing the attribute values that appear in the co-occurrence information from the attribute values to obtain the contradictory attribute values of each attribute; and determining whether the attribute values in the header information and in the attribute information of the webpage information being identified are contradictory attribute values under the same attribute of the described object. If they are, the webpage information being identified is determined to be inconsistent; otherwise, it is determined to be consistent. The method and device can thus identify the consistency of webpage information and improve identification efficiency.
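The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (tokenize, build_contradictions, is_consistent), the record layout of a title plus an attribute dictionary, and the use of lowercase whitespace splitting in place of real word segmentation are all assumptions made for illustration. It also assumes single-token attribute values and treats two values of the same attribute as contradictory when they never co-occur in any record of the category.

```python
from collections import defaultdict
from itertools import combinations


def tokenize(text):
    # Assumption: lowercase whitespace split stands in for word segmentation.
    return text.lower().split()


def build_contradictions(records):
    """records: iterable of (title, {attribute: value}) for ONE category."""
    records = list(records)

    # Step 1: collect every value observed for each attribute.
    values = defaultdict(set)
    for _, attrs in records:
        for attr, value in attrs.items():
            values[attr].add(value.lower())

    # Step 2: record value pairs of one attribute that co-occur in a record
    # (title tokens plus the declared attribute value).
    cooccur = defaultdict(set)
    for title, attrs in records:
        tokens = set(tokenize(title))
        for attr, value in attrs.items():
            present = {v for v in values[attr] if v in tokens} | {value.lower()}
            for pair in combinations(sorted(present), 2):
                cooccur[attr].add(pair)

    # Step 3: all pairs minus co-occurring pairs = contradictory pairs.
    return {
        attr: set(combinations(sorted(vals), 2)) - cooccur[attr]
        for attr, vals in values.items()
    }


def is_consistent(title, attrs, contradictions):
    """False if the title holds a value contradicting a declared attribute."""
    tokens = set(tokenize(title))
    for attr, value in attrs.items():
        v = value.lower()
        for a, b in contradictions.get(attr, ()):
            other = b if a == v else a if b == v else None
            if other is not None and other in tokens:
                return False
    return True
```

In this sketch, a listing titled "red shirt" whose color attribute is "blue" would be flagged as inconsistent only if "red" and "blue" never appeared together in the sample records for that category; values that do co-occur (for example "navy" and "blue") are not treated as contradictory.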
Publication number HK1201342(A1) Publication date 2015.08.28
Application number HK20150101698 Filing date 2015.02.16
Applicant ALIBABA GROUP HOLDING LIMITED Inventors WEI, HUI;FENG, JINGHUA;CHEN, MINGXIU
Classification G06F Main classification G06F
Agency Agent
Main claim
Address