TEXT CONTENT EXTRACTION METHOD AND DEVICE, Application No. WO2013CN80666 - 传众 Patent Search

Title of invention	TEXT CONTENT EXTRACTION METHOD AND DEVICE
Abstract	Disclosed are a text content extraction method and device. The method comprises: dividing an input HTML web page into a plurality of modules; determining a location score for each module according to its position in the page layout; calculating the text length of each module; extracting the link addresses contained in each module; identifying the character content that occurs most frequently across all the link addresses; marking each link address that contains that character content as a valid link and each link address that does not as an invalid link; computing a comprehensive score for each module as comprehensive score = location score × (text length + word length of valid links) / word length of invalid links; and judging each module whose comprehensive score exceeds a set threshold to be a content module. The method can effectively remove redundant non-content information from a web page and extract its effective content more accurately.
Publication No.	WO2013178193(A3)	Publication date	2014.01.23
Application No.	WO2013CN80666	Filing date	2013.08.01
Applicant	ZTE CORPORATION	Inventor	YE, WEI
Classification	G06F17/30	Main classification	G06F17/30
Agency		Agent
Independent claim
Address
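
The scoring rule in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation, and several details are assumptions because the abstract does not specify them: modules are supplied already divided, each with a precomputed location score; the "character content" with the highest usage frequency is taken to be the host name of each link address; the "word length" of a link is taken to be the length of its anchor text; and the divisor is floored at 1 so that a module with no invalid links does not cause a division by zero. The names `Module`, `most_common_token`, `comprehensive_score` and `content_modules` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Module:
    name: str
    location_score: float                      # assumed to be precomputed from the layout
    text: str
    links: list = field(default_factory=list)  # (href, anchor_text) pairs


def most_common_token(modules):
    """Return the host name used most often across all link addresses (assumption)."""
    hosts = Counter(
        urlparse(href).netloc
        for m in modules
        for href, _ in m.links
        if urlparse(href).netloc
    )
    return hosts.most_common(1)[0][0] if hosts else ""


def comprehensive_score(module, token):
    """location score * (text length + valid link length) / invalid link length."""
    def is_valid(href):
        return bool(token) and token in href

    valid = sum(len(anchor) for href, anchor in module.links if is_valid(href))
    invalid = sum(len(anchor) for href, anchor in module.links if not is_valid(href))
    # Assumption: floor the divisor at 1 to avoid dividing by zero.
    return module.location_score * (len(module.text) + valid) / max(invalid, 1)


def content_modules(modules, threshold):
    """Names of the modules whose comprehensive score exceeds the threshold."""
    token = most_common_token(modules)
    return [m.name for m in modules if comprehensive_score(m, token) > threshold]


# Illustrative data only; the URLs and scores are made up.
modules = [
    Module("nav", 0.2, "Home", [
        ("http://ads.example.net/a", "Sponsored offer"),
        ("http://news.example.com/", "Home"),
    ]),
    Module("article", 1.0, "x" * 200, [
        ("http://news.example.com/story/2", "Related story"),
    ]),
]

print(content_modules(modules, threshold=10))
```

In this made-up example the navigation module is penalised by its off-site link and scores well below the threshold, while the long article module with only on-site links scores far above it.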

Patents you may be interested in

Implant device, tool, and methods relating to treatment of paranasal sinuses

Cyclist power link

Reduction of HMF ethers with metal catalyst

Low ice pneumatic motor exhaust muffler

Devices and methods for controlling tremor

Thiophene-substituted tetracyclic compounds and methods of use thereof for treatment of viral diseases

Large rolling bearing

Negative pressure device

Bran and germ flavor and texture improvement

Determining product categories by mining chat transcripts

Quick connect valve actuator

Method for providing a mat containing aerogel and apparatus for implementing such method

Method and apparatus for controlling haptic feedback of an input tool for a mobile terminal

Composite refractory for an inner lining of a blast furnace

A treatment device

High solids aqueous mineral and/or filler and/or pigment suspension in acidic pH environment

Door for refrigerator and method for manufacturing the same

Unified wagering data model

BACE1 inhibitors

Encoding method and decoding method