Title: System and method for good nearest neighbor clustering of text
Abstract: An improved system and method for clustering text or content described by text is provided. Each text in a set of texts may be represented as a dimensional vector of words. Singleton texts that may not be similar to another text may be excluded from the set of texts for clustering. Texts identified as good nearest neighbors may then be grouped in the same cluster. In addition, metadata describing content may be used for clustering items of aggregated content from content feeds. Metadata describing items of content from content feeds may be converted into a set of texts and texts identified as good nearest neighbors may then be clustered. Items of content feeds described by the clustered texts may then be similarly clustered. Any types of items of content that may be described by text may be clustered, including audio, images, video, multimedia content, and so forth.
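The abstract's three steps (word vectors, singleton exclusion, grouping of good nearest neighbors) can be illustrated with a minimal Python sketch. This is not the patented method: the abstract does not define "good nearest neighbor", so the sketch assumes it means a text's single most similar text under cosine similarity, accepted only when that similarity reaches a cutoff. The names `to_vector`, `cosine`, `cluster_texts` and the `threshold` value of 0.3 are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def to_vector(text):
    """Represent a text as a bag-of-words count vector (assumed representation)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_texts(texts, threshold=0.3):
    """Group texts with their nearest neighbor; return clusters of indices.

    Assumption: a neighbor is "good" when it is the most similar text and
    its cosine similarity is at least `threshold`. Texts without such a
    neighbor are treated as singletons and left out of the result.
    """
    vectors = [to_vector(t) for t in texts]
    parent = list(range(len(texts)))  # union-find forest over text indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    kept = set()
    for i, vec in enumerate(vectors):
        best_j, best_sim = None, 0.0
        for j, other in enumerate(vectors):
            if j != i:
                sim = cosine(vec, other)
                if sim > best_sim:
                    best_j, best_sim = j, sim
        if best_j is not None and best_sim >= threshold:
            parent[find(i)] = find(best_j)  # merge the two clusters
            kept.update((i, best_j))

    clusters = {}
    for i in sorted(kept):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    sample = [
        "yahoo news sports football",
        "sports football scores news",
        "cooking pasta recipe",
        "pasta recipe cooking tips",
        "quantum chromodynamics lattice",
    ]
    print(cluster_texts(sample))
```

For content feeds, the same function could be applied to texts built from each item's metadata (for example a title plus description), since the returned indices map straight back to the feed items.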
Publication No.: US2007244874(A1)  Publication date: 2007.10.18
Application No.: US20060390001  Filing date: 2006.03.27
Applicant: YAHOO! INC.  Inventor: TAWDE VIVEK B.
Classification: G06F17/30  Main classification: G06F17/30
Agency:  Agent:
Principal claim:
Address: