Title of invention |
System and method for ranking and selecting data features |
Abstract |
Example systems and methods of extracting the most informative data parameters from a set of data are provided. Large dimensionality data sets may be reduced to a desired dimensionality while substantially preserving their real-world interpretation, so that the resultant reduced dimensionality set may still be effectively interpreted in light of a real-world initial data set. The systems and methods first complete the data set by filling in missing data in a manner that will not bias the resultant reduced data set. The system then selects the N most informative data parameters while minimizing reconstruction error. |
Publication number |
US9348885(B2) |
Publication date |
2016.05.24 |
Application number |
US201414172607 |
Filing date |
2014.02.04 |
Applicant |
ADOBE SYSTEMS INCORPORATED |
Inventor |
Modarresi Kourosh |
Classification |
G06F17/30 |
Main classification |
G06F17/30 |
Agency |
Shook, Hardy & Bacon L.L.P. |
Agent |
Shook, Hardy & Bacon L.L.P. |
Principal claim |
1. A method of selecting a desired number of informative data parameters from among a set of data parameters, the method comprising:
obtaining, from a database with a processor, a set of data comprising a plurality of parameters representing a plurality of website metric variables; identifying a desired number of informative parameters to be selected from the plurality of parameters; creating, by the processor, a complete data set by filling in any missing values in the set of data using a method that does not substantially bias the statistics of the set of data; selecting, by the processor, a next most informative parameter with a highest variation while having a lowest correlation to a set of previously selected parameters by comparing variation in a non-selected parameter with variation of other non-selected parameters and by evaluating correlation of the non-selected parameter with the set of previously selected parameters; adding the selected next most informative parameter to the set of previously selected parameters; and repeating the selecting and adding operations until the desired number of informative parameters have been selected. |
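The selection loop recited in the claim can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that the record does not state: missing values are filled with the column mean as a simple stand-in for a non-biasing completion method, "variation" is taken to be the column variance, and the two criteria are combined as variance multiplied by one minus the largest absolute correlation with any already selected parameter. The names `fill_missing` and `select_informative` are hypothetical.

```python
import numpy as np


def fill_missing(X):
    """Replace NaNs with the column mean (assumed stand-in imputation)."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X


def select_informative(X, n_select):
    """Greedily pick n_select column indices: high variance, low redundancy."""
    X = fill_missing(X)
    variances = X.var(axis=0)
    # Absolute pairwise correlations between columns (parameters).
    with np.errstate(divide="ignore", invalid="ignore"):
        corr = np.abs(np.atleast_2d(np.corrcoef(X, rowvar=False)))
    corr = np.nan_to_num(corr)  # constant columns would otherwise give NaN

    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < n_select:
        best, best_score = None, -np.inf
        for j in remaining:
            # Largest correlation with anything already chosen (0 if none yet).
            redundancy = max((corr[j, k] for k in selected), default=0.0)
            # Assumed scoring rule: reward variance, penalise redundancy.
            score = variances[j] * (1.0 - redundancy)
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `select_informative(X, n_select)` on a rows-by-parameters array returns column indices in the order they were chosen. Because variance depends on units, website metrics measured on different scales would likely need to be normalized first; that step is left out here.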
Address |
San Jose CA US |