Title: Augmenting a training set for document categorization
Abstract: A method and system for augmenting a training set used to train a document classifier are provided. The augmentation system augments a training set with training data derived from features of documents based on a document hierarchy. The training data of the initial training set may be derived from the root documents of the document hierarchies. The augmentation system generates additional training data that includes an aggregate feature representing the overall characteristics of a hierarchy of documents, rather than those of the root document alone. After generating the training data, the augmentation system adds it to the initial training set.
Publication number: US2007112753(A1) Publication date: 2007.05.17
Application number: US20050273714 Filing date: 2005.11.14
Applicant: MICROSOFT CORPORATION Inventors: LIU TIE-YAN; MA WEI-YING
Classification: G06F17/30 Main classification: G06F17/30
Agency: Agent:
Principal claim:
Address:
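
The augmentation described in the abstract can be sketched as follows. This is a minimal illustration, not the claimed method: the abstract does not say how features are represented or how the aggregate feature is computed, so the sketch assumes bag-of-words counts and assumes the aggregate is the sum of the counts over every document in a hierarchy. The names `Document`, `root_feature`, `aggregate_feature`, and `augment` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document and its child documents (hypothetical structure)."""
    words: list
    children: list = field(default_factory=list)


def root_feature(root):
    """Feature of the root document alone (assumed: word counts)."""
    return Counter(root.words)


def aggregate_feature(doc):
    """Feature of the whole hierarchy (assumed: summed word counts)."""
    total = Counter(doc.words)
    for child in doc.children:
        total += aggregate_feature(child)
    return total


def augment(training_set, labeled_hierarchies):
    """Return a new training set with one aggregate example per hierarchy.

    training_set: list of (feature, label) pairs.
    labeled_hierarchies: list of (root Document, label) pairs.
    """
    augmented = list(training_set)  # leave the initial set unmodified
    for root, label in labeled_hierarchies:
        augmented.append((aggregate_feature(root), label))
    return augmented


# Usage: one hierarchy with a root, two children, and one grandchild.
root = Document(
    ["sports", "news"],
    [
        Document(["football", "scores"]),
        Document(["sports", "tennis"], [Document(["tennis"])]),
    ],
)
initial = [(root_feature(root), "sports")]
augmented = augment(initial, [(root, "sports")])
```

In this sketch the initial example reflects only the root document's words, while the added example also reflects terms such as "tennis" that appear only deeper in the hierarchy.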