Title of Invention |
Visualizing Large Data Volumes Utilizing Initial Sampling and Multi-Stage Calculations |
Abstract |
Embodiments visualize large data volumes utilizing initial sampling to reduce the size of a dataset. This sampling may be random in nature. The sampled dataset may be refined (wrangled) by binning, grouping, cleansing, and/or other techniques to produce a wrangled sample dataset. A user defines useful end visualization(s) by inputting expected dimensions/measures. From these visualizations of sampled data, minimal grouping sets are deduced for application to the full dataset. The user publishes/schedules the wrangling operation and grouping sets definition. Based on this, a wrangled dataset and grouping sets are produced in the big data layer. When the user accesses the visualization(s), minimal grouping sets are retrieved in the in-memory engine of the client and processed by an in-memory database engine according to the common processing plan. This produces result sets and a final set of visualizations of the full dataset, in which the user can recognize valuable data trends and/or relationships. |
Publication Number |
US2016179852(A1) |
Publication Date |
2016.06.23 |
Application Number |
US201414575633 |
Filing Date |
2014.12.18 |
Applicant |
Naibo Alexis;Xu Xiaohui;Le Biannic Yann |
Inventor |
Naibo Alexis;Xu Xiaohui;Le Biannic Yann |
Classification |
G06F17/30 |
Main Classification |
G06F17/30 |
Patent Agency |
|
Patent Agent |
|
Principal Claim |
1. A computer-implemented method comprising:
a first engine of an interface layer communicating with a separate layer comprising a large volume of stored data, to receive a first dataset representing a sample of the large volume of stored data;
the first engine creating from the first dataset, a multi-stage calculation plan configured to receive a minimal grouping set as input;
a second engine executing an operation on the first dataset according to the calculation plan to produce a first result set;
the second engine receiving from the separate database layer, a second dataset comprising the minimal grouping set;
the second engine performing the operation on the second dataset according to the calculation plan to produce a second result set; and
the first engine creating visualization from the second result set. |
Address |
Levallois Perret FR |
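
The flow described in the abstract and the principal claim (sample, plan, deduce minimal grouping sets, rerun the same plan on the grouping sets) can be illustrated with a minimal Python sketch. This is not the patented implementation: every function name, the row layout, and the example data are hypothetical. It also assumes that measures are additive (SUM), so a coarser visualization can be re-aggregated from a finer grouping set, and that a dimension set is dropped from the grouping sets when it is strictly contained in another one. The record itself does not state how the minimal grouping sets are deduced.

```python
import random
from collections import defaultdict

def sample_rows(rows, k, seed=0):
    """Random sample of the full dataset (the 'first dataset')."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))

def group_sum(rows, dims, measures):
    """GROUP BY dims, SUM(each measure); returns a list of row dicts."""
    dims = sorted(dims)
    acc = defaultdict(lambda: defaultdict(float))
    for row in rows:
        key = tuple(row[d] for d in dims)
        for m in measures:
            acc[key][m] += row[m]
    return [dict(zip(dims, key), **sums) for key, sums in acc.items()]

def run_plan(rows, visualizations):
    """The 'operation': one aggregated result set per visualization."""
    results = {}
    for dims, measure in visualizations:
        out = group_sum(rows, dims, [measure])
        results[(dims, measure)] = {
            tuple(r[d] for d in sorted(dims)): r[measure] for r in out
        }
    return results

def minimal_grouping_sets(visualizations):
    """Assumption: keep only dimension sets not strictly contained in another."""
    dim_sets = {frozenset(dims) for dims, _ in visualizations}
    return [s for s in dim_sets if not any(s < other for other in dim_sets)]

def build_grouping_data(full_rows, visualizations):
    """Stands in for the big data layer: pre-aggregate the full dataset."""
    data = {}
    for gs in minimal_grouping_sets(visualizations):
        measures = sorted({m for dims, m in visualizations if set(dims) <= gs})
        data[gs] = group_sum(full_rows, gs, measures)
    return data

def visualize_full(grouping_data, visualizations):
    """Rerun the same plan on the (much smaller) grouping sets."""
    results = {}
    for viz in visualizations:
        cover = next(s for s in grouping_data if set(viz[0]) <= s)
        results.update(run_plan(grouping_data[cover], [viz]))
    return results

# Hypothetical data: a tiny stand-in for the "large volume of stored data".
FULL = [
    {"region": "EU", "product": "A", "sales": 10.0},
    {"region": "EU", "product": "B", "sales": 5.0},
    {"region": "US", "product": "A", "sales": 7.0},
    {"region": "EU", "product": "A", "sales": 3.0},
    {"region": "US", "product": "B", "sales": 2.0},
]
# Visualizations the user defined on the sample: (dimensions, measure).
VIZ = [(("region",), "sales"), (("product", "region"), "sales")]

preview = run_plan(sample_rows(FULL, 3), VIZ)      # first result set (sample)
grouping_data = build_grouping_data(FULL, VIZ)     # second dataset
full_results = visualize_full(grouping_data, VIZ)  # second result set
```

In this sketch, both visualizations are served from a single grouping set over (product, region), so the client-side engine never touches the raw rows of the full dataset. Non-additive measures such as distinct counts or medians would need a different deduction rule.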