Title of invention: METHOD, APPARATUS AND SYSTEM FOR EXTRACTING WEBPAGE CONTENT
Abstract: The present disclosure relates to a method, an apparatus and a system for extracting webpage content. The method includes: obtaining a webpage in response to a webpage browsing instruction triggered in a browser by a mobile client; parsing the webpage to obtain a DOM node of a tag in the webpage script; obtaining a plug-in tag node from the DOM node; and, when the plug-in tag corresponding to the plug-in tag node is of a predetermined type, extracting the plug-in resource that corresponds to the plug-in tag. The present disclosure can extract content that complies with a specific protocol specification before the webpage has actually been rendered, which speeds up both the extraction of the predetermined webpage content and the display of the webpage. In addition, because the plug-in resource is extracted on the browser terminal side without relying on a background server, the solution is technically easy to implement and can reduce development costs.
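The steps in the abstract can be illustrated with a minimal sketch, which is not the applicant's implementation. It assumes that the plug-in tags are `embed` and `object` and that the predetermined type is identified by the tag's `type` attribute; neither is stated in the abstract. The names `PLUGIN_TAGS`, `PREDETERMINED_TYPES`, `PluginResourceExtractor` and `extract_plugin_resources` are hypothetical. For brevity, the sketch walks the tags with Python's standard `html.parser` as a stand-in for inspecting DOM nodes, which likewise works on the markup without rendering the page.

```python
from html.parser import HTMLParser

# Assumptions for illustration only: the abstract names neither the
# plug-in tags nor the predetermined types.
PLUGIN_TAGS = {"embed", "object"}
PREDETERMINED_TYPES = {"application/x-shockwave-flash"}


class PluginResourceExtractor(HTMLParser):
    """Collects resource URLs of plug-in tags that have a predetermined type."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        # Step 1: keep only plug-in tag nodes.
        if tag not in PLUGIN_TAGS:
            return
        attributes = dict(attrs)
        # Step 2: keep only tags of a predetermined type.
        if attributes.get("type") not in PREDETERMINED_TYPES:
            return
        # Step 3: extract the resource (<embed> uses src, <object> uses data).
        url = attributes.get("src") or attributes.get("data")
        if url:
            self.resources.append(url)


def extract_plugin_resources(html):
    """Parse the markup without rendering it and return matching resource URLs."""
    parser = PluginResourceExtractor()
    parser.feed(html)
    parser.close()
    return parser.resources
```

Calling `extract_plugin_resources` on the markup of a fetched page returns the URLs of the matching plug-in resources in document order; plug-in tags of any other type are skipped.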
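Publication No.: WO2015127882(A1) Publication date: 2015.09.03
Application No.: WO2015CN73167 Filing date: 2015.02.16
Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED Inventors: GUO, XINHUA; SU, KE; MA, NING; WANG, JINGYAO
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address: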