Method for extracting enterprise behaviors or events from massive text, Application No. CN201611221430.1 - 传众专利搜索


Title of invention	Method for extracting enterprise behaviors or events from massive text
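Abstract	The invention relates to the field of data mining and provides a method for extracting enterprise behaviors or events from massive text. The method comprises: data preprocessing; word representation; event vector computation; and event extraction and classification. Because the proposed technical solution represents both events and microblog posts as vectors, it can effectively compute the similarity between a new microblog post and each event and classify the post accordingly. The precision, recall, and F-measure of its microblog event detection are also far better than those of prior-art methods.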
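Publication No.	CN106611054A	Publication date	2017.05.03
Application No.	CN201611221430.1	Filing date	2016.12.26
Applicant	University of Electronic Science and Technology of China (电子科技大学)	Inventors	Yuan Hua (袁华); Qian Yu (钱宇); Deng Xiongwen (邓雄文); Deng Wenjun (邓文君)
Classification	G06F17/30(2006.01)I;G06F17/27(2006.01)I	Main classification	G06F17/30(2006.01)I
Agency	四川省成都市天策商标专利事务所 51213	Agent	Bian Tao (卞涛)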
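Independent claim	A method for extracting enterprise behaviors or events from massive text, characterized by comprising the following steps: A. Data preprocessing: obtain target data from the network and preprocess its content to form a dataset; B. Word representation: map the words in the preprocessed dataset to vectors in a k-dimensional space, where k is a preset dimensionality; C. Event vector computation: extract verb sequences from the preprocessed dataset, compute the average word vector of each verb sequence, manually annotate a number of seed labels, and compute the average seed vector of the seeds that share a label, i.e. describe the same event; D. Event extraction and classification: determine the class of each microblog record by computing the similarity between the remaining data and the event vectors.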
Address	610000 No. 2006 Xiyuan Avenue, High-tech Zone (West District), Chengdu, Sichuan Province
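
Steps C and D of the independent claim can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the record does not name the similarity measure, so cosine similarity is assumed here; the toy 2-dimensional embeddings (k = 2), the seed labels `acquisition` and `product_launch`, and all function names are hypothetical; and the verb sequences are assumed to have already been extracted by steps A and B.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity (an assumed measure; the claim only says 'similarity')."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sequence_vector(verbs, embeddings):
    """Step C, first half: average the word vectors of one verb sequence."""
    known = [embeddings[w] for w in verbs if w in embeddings]
    return mean_vector(known) if known else None


def event_vectors(seeds, embeddings):
    """Step C, second half: average the seed vectors that share a label."""
    grouped = defaultdict(list)
    for verbs, label in seeds:
        vec = sequence_vector(verbs, embeddings)
        if vec is not None:
            grouped[label].append(vec)
    return {label: mean_vector(vecs) for label, vecs in grouped.items()}


def classify(verbs, events, embeddings):
    """Step D: return the label of the most similar event vector, or None."""
    vec = sequence_vector(verbs, embeddings)
    if vec is None or not events:
        return None
    return max(events, key=lambda label: cosine(vec, events[label]))


# Hypothetical toy embeddings standing in for the output of step B.
EMBEDDINGS = {
    "acquire": [0.9, 0.1],
    "merge": [0.8, 0.2],
    "buy": [0.85, 0.15],
    "launch": [0.1, 0.9],
    "release": [0.2, 0.8],
    "unveil": [0.15, 0.95],
}

# Hypothetical manually labelled seeds: (verb sequence, event label).
SEEDS = [
    (["acquire", "merge"], "acquisition"),
    (["buy"], "acquisition"),
    (["launch", "release"], "product_launch"),
]

EVENTS = event_vectors(SEEDS, EMBEDDINGS)
print(classify(["unveil", "release"], EVENTS, EMBEDDINGS))
```

In this sketch a record whose verbs are all missing from the embedding table is left unclassified (`None`) rather than forced into a class; a real system would also need a similarity threshold to reject posts that describe no known event, which the record does not specify.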
