Title of invention: METHOD, APPARATUS, AND COMPUTER STORAGE MEDIUM FOR AUTOMATICALLY ADDING TAGS TO DOCUMENT
Abstract: Embodiments of the present invention provide a method and apparatus for automatically adding tags to a document, the method comprising: determining a plurality of candidate tag words; determining a corpus including multiple texts; selecting common words from the corpus as characteristic words; for each pair of a characteristic word and a candidate tag word, determining a co-occurrence probability, that is, the probability that the candidate tag word occurs in a text given that the characteristic word occurs in it; extracting the characteristic words from the document and, for each extracted characteristic word, calculating a weight value of that characteristic word; for each candidate tag word, computing from the corpus a weighted co-occurrence probability of the candidate tag word with all of the characteristic words occurring in the document; and selecting the candidate tag word with the highest weighted co-occurrence probability as the tag word to be added to the document. Embodiments of the present invention make tagging of a document automatic and intelligent, and the tags are not limited to keywords that occur in the document.
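The steps listed in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation claimed in the patent: the function names (`tokenize`, `cooccurrence_probabilities`, `choose_tag`) and the `min_doc_freq` parameter are hypothetical, "common words" are assumed to be words occurring in at least `min_doc_freq` corpus texts, co-occurrence is assumed to be counted per text, and the weight of a characteristic word is assumed to be its relative frequency in the document.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract names no tokenizer.
    return text.lower().split()


def cooccurrence_probabilities(corpus, candidate_tags, min_doc_freq=2):
    """Estimate P(tag occurs in a text | characteristic word occurs in it)."""
    doc_sets = [set(tokenize(text)) for text in corpus]
    doc_freq = Counter(word for words in doc_sets for word in words)
    # Assumption: a "common word" is one found in at least min_doc_freq texts.
    features = {word for word, n in doc_freq.items() if n >= min_doc_freq}
    probs = {}
    for feature in features:
        for tag in candidate_tags:
            both = sum(1 for words in doc_sets if feature in words and tag in words)
            probs[(feature, tag)] = both / doc_freq[feature]
    return features, probs


def choose_tag(document, features, probs, candidate_tags):
    """Return the candidate tag with the highest weighted co-occurrence probability."""
    tokens = [t for t in tokenize(document) if t in features]
    if not tokens:
        return None  # no characteristic word occurs in the document
    counts = Counter(tokens)
    total = len(tokens)
    # Assumption: weight of a characteristic word = its relative frequency.
    scores = {
        tag: sum((n / total) * probs[(word, tag)] for word, n in counts.items())
        for tag in candidate_tags
    }
    return max(scores, key=scores.get)


# Toy corpus, invented purely for illustration.
corpus = [
    "football match goal sports",
    "football league goal sports",
    "stock market shares finance",
    "stock shares bank finance",
]
candidate_tags = ["sports", "finance"]
features, probs = cooccurrence_probabilities(corpus, candidate_tags)
tag = choose_tag("goal goal football", features, probs, candidate_tags)
print(tag)  # sports
```

In this toy run the document never contains the word "sports", yet that tag is selected because the document's characteristic words co-occur with it in the corpus, which is the property the abstract highlights.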
Publication number: EP2801917(A4) Publication date: 2015.08.26
Application number: EP20120864434 Filing date: 2012.12.17
Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED Inventors: HE, XIANG; WANG, YE; JIAO, FENG
Classification: G06F17/24; G06F17/21; G06F17/27 Main classification: G06F17/24
Agency: Agent:
Main claim:
Address: