System and method for generating personal vocabulary from network data, Application No. US200912571404 - 传众专利搜索

Title of invention	System and method for generating personal vocabulary from network data
Abstract	A method is provided in one example and includes receiving data propagating in a network environment, and identifying selected words within the data based on a whitelist. The whitelist includes a plurality of designated words to be tagged. The method further includes assigning a weight to the selected words based on at least one characteristic associated with the data, and associating the selected words to an individual. A resultant composite is generated for the selected words that are tagged. In more specific embodiments, the resultant composite is partitioned amongst a plurality of individuals associated with the data propagating in the network environment. A social graph can be generated that identifies a relationship between a selected individual and the plurality of individuals based on a plurality of words exchanged between the selected individual and the plurality of individuals.
Publication number	US8990083(B1)	Publication date	2015.03.24
Application number	US200912571404	Filing date	2009.09.30
Applicant	Cisco Technology, Inc.	Inventors	Gannu Satish K.;Malegaonkar Ashutosh A.;Mihailovici Virgil N.
Classification	G10L15/00	Main classification	G10L15/00
Agency	Patent Capital Group	Agent	Patent Capital Group
Main claim	1. A method, comprising: receiving data propagating in a network environment at a streaming database feeder; ignoring Joint Photographic Experts Group (JPEG) documents in the data; updating tags for each user in the network environment using a user-sub stream created for the user by the streaming database feeder, wherein each user-sub stream includes at least a portion of the data propagating in the network environment, wherein the tags are words and phrases that are associated with each user, wherein the data includes documents and, for at least a portion of the documents in the data, each original document is copied to create an anonymous document and a document that contains selected words within the data based on a whitelist, wherein the whitelist includes a plurality of designated words to be tagged, wherein documents that include data in a blacklist are dropped, and wherein the anonymous documents contain a concept field and some of the data in the anonymous documents is selected for the whitelist, and wherein the document that contains selected words does not include the concept field; assigning a weight to the selected words based on at least one characteristic associated with the data; associating the selected words to an individual, wherein the weight for a selected word is higher if the individual propagates the data; and generating a resultant composite of the selected words that are tagged.
Address	San Jose CA US
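
Illustrative sketch (not part of the patent record). The steps recited in claim 1 can be pictured roughly as follows: JPEG documents are ignored, documents containing blacklisted data are dropped, each remaining document is copied into an anonymous document (with a concept field) and a document holding only the whitelisted words (without that field), and the selected words are weighted more heavily for the individual who propagated the data. Everything concrete here is an assumption made for illustration: the `Document` fields, the `WHITELIST` and `BLACKLIST` contents, the two weight constants, the tokenizer, and the use of an empty list for the concept field, whose contents the claim does not specify. The streaming database feeder and the per-user sub-streams are not modeled.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical lists; the record does not enumerate their contents.
WHITELIST = {"router", "vpn", "firewall"}
BLACKLIST = {"confidential"}

# Assumed values; the claim only says the weight is higher for the propagator.
PROPAGATOR_WEIGHT = 2.0
RECEIVER_WEIGHT = 1.0


@dataclass
class Document:
    sender: str          # individual who propagates the data
    recipients: list     # other individuals associated with the data
    content_type: str    # MIME type, used here to detect JPEG documents
    text: str


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each token."""
    return [w.strip(".,;:!?()\"'").lower() for w in text.split()]


def split_document(doc):
    """Return (anonymous_doc, tagged_doc), or None if the document is skipped."""
    if doc.content_type == "image/jpeg":
        return None                      # JPEG documents are ignored
    words = tokenize(doc.text)
    if BLACKLIST.intersection(words):
        return None                      # blacklisted documents are dropped
    # Anonymous copy: no user identity, carries a concept field (left empty here).
    anonymous = {"text": doc.text, "concept": []}
    # Tagged copy: whitelisted words only, and no concept field.
    tagged = {
        "sender": doc.sender,
        "recipients": list(doc.recipients),
        "words": [w for w in words if w in WHITELIST],
    }
    return anonymous, tagged


def build_composite(documents):
    """Accumulate per-individual word weights into a composite mapping."""
    composite = defaultdict(lambda: defaultdict(float))
    for doc in documents:
        result = split_document(doc)
        if result is None:
            continue
        _, tagged = result
        for word in tagged["words"]:
            composite[tagged["sender"]][word] += PROPAGATOR_WEIGHT
            for person in tagged["recipients"]:
                composite[person][word] += RECEIVER_WEIGHT
    return {person: dict(words) for person, words in composite.items()}
```

Calling `build_composite` on a list of `Document` objects returns a mapping from each individual to that individual's weighted words, which is one plausible reading of the "resultant composite" partitioned among individuals that the abstract describes.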