Title of invention |
Multilabel classification by a hierarchy |
Abstract |
A technique of extracting hierarchies for multilabel classification. The technique can process a plurality of labels related to a plurality of documents, using a clustering process, to cluster the labels into a plurality of clusterings representing a plurality of classes. The technique classifies the documents and predicts a plurality of performance characteristics, respectively, for the plurality of clusterings. The technique selects at least one of the clusterings using information from the performance characteristics and adds the selected clustering into a resulting hierarchy. |
Publication number |
US9081854(B2) |
Publication date |
2015.07.14 |
Application number |
US201213543783 |
Filing date |
2012.07.06 |
Applicant |
Hewlett-Packard Development Company, L.P. |
Inventors |
Ulanov Alexander; Sapozhnikov German; Shevlyakov Georgy |
Classification number |
G06F17/30 |
Main classification number |
G06F17/30 |
Patent agency |
Hewlett-Packard Patent Department |
Agent |
Hewlett-Packard Patent Department |
Main claim |
1. A method for processing information, the method comprising:
providing a plurality of documents under control of a processing device, each of the plurality of documents having a label;
processing the labels related to the plurality of documents, using a clustering process, to cluster the labels into a plurality of clusterings representing a plurality of classes;
classifying the documents using the clusterings representing the plurality of classes;
predicting a plurality of performance characteristics, respectively, for the plurality of clusterings;
wherein the predicting the plurality of performance characteristics provides a performance measure to predict how accurate the clustering for classification at a given layer of a hierarchy;
selecting at least one of the clusterings using information from the performance characteristics; and
adding the selected clustering into the hierarchy. |
Address |
Houston TX US |
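
The steps recited in the main claim (cluster the labels, classify the documents against each candidate clustering, obtain a performance measure per clustering, select one, add it to the hierarchy) can be sketched roughly as below. This is an illustrative toy and not the patented implementation. Everything concrete in it is an assumption: the tiny corpus, the greedy agglomerative clustering on Jaccard similarity, the word-overlap classifier, and the use of held-out micro-F1 as a stand-in for the predicted performance measure. All function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy corpus: each document is (set of words, set of labels).
DOCS = [
    ({"goal", "match", "league"}, {"football"}),
    ({"goal", "puck", "ice"}, {"hockey"}),
    ({"match", "league", "puck"}, {"football", "hockey"}),
    ({"stock", "market", "bank"}, {"finance"}),
    ({"bank", "loan", "rate"}, {"banking"}),
    ({"market", "loan", "stock"}, {"finance", "banking"}),
    ({"goal", "match", "ice"}, {"football"}),
    ({"stock", "rate", "bank"}, {"finance", "banking"}),
]
TRAIN, TEST = DOCS[:6], DOCS[6:]


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_labels(docs, k):
    """Assumed clustering step: greedily merge labels that co-occur most."""
    labels = sorted({l for _, ls in docs for l in ls})
    # Each cluster is (frozenset of labels, indices of documents carrying them).
    clusters = [
        (frozenset([l]), {i for i, (_, ls) in enumerate(docs) if l in ls})
        for l in labels
    ]
    while len(clusters) > k:
        # Merge the pair of clusters whose document sets overlap the most.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: jaccard(clusters[p[0]][1], clusters[p[1]][1]),
        )
        merged = (clusters[i][0] | clusters[j][0], clusters[i][1] | clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return [c[0] for c in clusters]


def score_clustering(clustering, train, test):
    """Stand-in performance measure: micro-F1 of a toy word-overlap classifier."""
    # Vocabulary of a cluster = all words of training documents with its labels.
    vocab = [set().union(*(w for w, ls in train if ls & c)) for c in clustering]
    tp = fp = fn = 0
    for words, ls in test:
        for c, v in zip(clustering, vocab):
            # Predict the cluster when at least half of the words are known to it.
            predicted = len(words & v) * 2 >= len(words)
            actual = bool(ls & c)
            tp += predicted and actual
            fp += predicted and not actual
            fn += actual and not predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def build_hierarchy(train, test, candidate_ks):
    """Score one clustering per candidate k, keep the best, add it as a layer."""
    scored = []
    for k in candidate_ks:
        clustering = cluster_labels(train, k)
        scored.append((score_clustering(clustering, train, test), k, clustering))
    best_score, best_k, best = max(scored, key=lambda s: s[0])
    hierarchy = {"root": [sorted(c) for c in best]}
    return hierarchy, best_k, best_score


hierarchy, best_k, best_score = build_hierarchy(TRAIN, TEST, [2, 3, 4])
print(hierarchy, best_k, best_score)
```

The sketch builds a single layer under a root node. A multi-level hierarchy could presumably be obtained by repeating the same selection inside each chosen cluster, though the record above does not spell out that recursion. The candidate list starts at 2 because a single cluster containing every label is trivially "correct" under this toy measure and carries no information.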