New word discovery method and system, Application No. CN201110138042.8 - 传众 Patent Search

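Title of invention	New word discovery method and system
Abstract	The invention provides a new word discovery method and system. Based on a bigram language model, the bigram elements of the foreground and background corpora are extracted separately, and statistics are obtained for each corpus. The bigram elements are filtered using these statistics and a first preset rule, and the remaining bigram elements are then extended within the foreground corpus using an n-gram language model and a second preset rule. Updating the n-gram elements does not require recomputation over the background corpus, which avoids rediscovering new words already present in the background corpus. The second preset rule is used to determine new-word boundaries and to remove junk bigram and n-gram elements. The method is simple and easy to use, and it reduces the burden of manual proofreading.
Publication No.	CN102231153A	Publication date	2011.11.02
Application No.	CN201110138042.8	Filing date	2011.05.25
Applicant	盛乐信息技术（上海）有限公司	Inventor	吴悦
Classification	G06F17/30(2006.01)I;G06F17/27(2006.01)I	Main classification	G06F17/30(2006.01)I
Agency	上海思微知识产权代理事务所(普通合伙) 31237	Agent	菅秀君
Independent claim	A new word discovery method, characterized by comprising: extracting the bigram elements of a known background corpus according to a bigram language model, and counting the total word frequency and the number of types of all bigram elements in the known background corpus; extracting the bigram elements of a foreground corpus according to the bigram language model, and counting the total word frequency and the number of types of all bigram elements in the foreground corpus; determining, from all of the above statistics, the bigram elements in the foreground corpus that satisfy a first preset rule; and extending the remaining bigram elements of the foreground corpus forward and backward within the foreground corpus according to an n-gram language model to obtain the n-gram elements of the foreground corpus, determining the n-gram elements in the foreground corpus that satisfy a second preset rule, and obtaining a new word list.
Address	Room 102, Building 3, No. 356 Guoshoujing Road, Zhangjiang Hi-Tech Park, Pudong New Area, Shanghai 201203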
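
The steps in the independent claim can be illustrated with a minimal Python sketch. The record above does not say what the first and second preset rules are, so both rules here are hypothetical stand-ins: `first_rule` keeps a bigram that is frequent in the foreground and relatively rare in the background, and `extend` grows a candidate by one character while that neighbour follows (or precedes) it in at least `keep_ratio` of its foreground occurrences. Character-level bigrams, all function names, all thresholds, and the toy corpora are likewise assumptions made for illustration, not details from the patent.

```python
from collections import Counter


def bigram_stats(corpus):
    """Return (bigram counts, total frequency, number of distinct bigrams)."""
    counts = Counter()
    for sentence in corpus:
        for i in range(len(sentence) - 1):
            counts[sentence[i:i + 2]] += 1
    return counts, sum(counts.values()), len(counts)


def first_rule(fg_count, fg_total, bg_count, bg_total, bg_types,
               min_count=2, min_ratio=3.0):
    """HYPOTHETICAL first rule: frequent in the foreground and relatively
    rare in the background (background estimate is add-one smoothed)."""
    if fg_count < min_count:
        return False
    fg_rel = fg_count / fg_total
    bg_rel = (bg_count + 1) / (bg_total + bg_types + 1)
    return fg_rel / bg_rel >= min_ratio


def neighbours(corpus, s):
    """Count occurrences of s and the characters just before and after them."""
    total, before, after = 0, Counter(), Counter()
    for sentence in corpus:
        start = sentence.find(s)
        while start != -1:
            total += 1
            if start > 0:
                before[sentence[start - 1]] += 1
            end = start + len(s)
            if end < len(sentence):
                after[sentence[end]] += 1
            start = sentence.find(s, start + 1)
    return total, before, after


def extend(corpus, seed, keep_ratio=0.8, max_len=6):
    """HYPOTHETICAL second rule: grow the seed by one character at a time,
    using foreground counts only, while one neighbour clearly dominates."""
    word = seed
    while len(word) < max_len:
        total, before, after = neighbours(corpus, word)
        if after and after.most_common(1)[0][1] / total >= keep_ratio:
            word = word + after.most_common(1)[0][0]
        elif before and before.most_common(1)[0][1] / total >= keep_ratio:
            word = before.most_common(1)[0][0] + word
        else:
            break  # no dominant neighbour: treat this as the word boundary
    return word


def discover_new_words(foreground, background):
    fg_counts, fg_total, _ = bigram_stats(foreground)
    bg_counts, bg_total, bg_types = bigram_stats(background)
    seeds = [b for b, n in fg_counts.items()
             if first_rule(n, fg_total, bg_counts[b], bg_total, bg_types)]
    # The background corpus is not touched again during extension.
    return sorted({extend(foreground, seed) for seed in seeds})


# Toy corpora, invented for this example.
background = ["我们今天去公园", "今天天气很好", "我们去学校"]
foreground = ["今天去看奥特曼", "奥特曼很好看", "我们喜欢奥特曼"]
print(discover_new_words(foreground, background))
```

On these toy corpora the two seed bigrams 奥特 and 特曼 both grow into the same three-character candidate, which is why the result is collected in a set before sorting. Real corpora would need tuned thresholds and a rule for discarding junk candidates.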
