Method and arrangement for data clustering, Application No. US201213658031

Title of invention	Method and arrangement for data clustering
Abstract	This disclosure relates to a method within a network node, and a corresponding network node, for determining input information for creation of a data traffic clustering model. The method comprises obtaining data descriptors of data flows, wherein the data descriptors describe data of the data flows, and obtaining flow information of the data flows. In addition, the method comprises determining clustering model input information based on the obtained data descriptors and the obtained flow information. One exemplary advantage of the present disclosure is that it allows traffic clustering based solely on packet header information, making it appropriate for handling encrypted traffic.
Publication No.	US9124528(B2)	Publication date	2015.09.01
Application No.	US201213658031	Filing date	2012.10.23
Applicant	Telefonaktiebolaget L M Ericsson (publ)	Inventors	Szabo Geza; Pongrácz Gergely; Turányi Zoltán Richárd
Classification	H04L12/24; H04L12/26; H04L12/715; H04L12/851	Main classification	H04L12/24
Agency	Coats & Bennett, PLLC	Agent	Coats & Bennett, PLLC
Main claim	1. A method in a network node for determining input information for creation of a data traffic clustering model, wherein data traffic via the network node comprises a plurality of user data flows of known data categories, the method comprising: obtaining data descriptors of the data flows, wherein the data descriptors describe physical parameters of the data flows; obtaining flow information of the data flows; and determining clustering model input information based on the obtained data descriptors and the obtained flow information, wherein the determining comprises at least one of: determining constraints on data samples from data flows with the same categories, wherein the constraints are determined on data samples of at least one of: data flows that originate from different source IP addresses and are destined for a same destination IP address; data flows that originate from a same source IP address and a same source port of the IP address; and data flows with different flow information but with a same source IP address; and selecting a subset of the data descriptors by: calculating values of required bandwidths of various processing resources for calculation of the data descriptors, and calculating values of content information of a respective data descriptor of each data flow; comparing the values of required bandwidths and the values of content information with bandwidth and content information thresholds, respectively; and selecting the subset of the data descriptors based on the comparing; and transmitting data indicative of the clustering model input information to another network node for creation of a data traffic clustering model based on the clustering model input information.
Address	Stockholm SE
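
The two alternatives in the main claim (pairwise constraints between flows of the same category, and threshold-based selection of a descriptor subset) can be illustrated with a minimal sketch. It is not the patented implementation. The `Flow` record, the function names, and the descriptor names in the usage note are hypothetical. The claim does not state what kind of constraint is produced or in which direction the thresholds apply, so the sketch assumes that a constraint is simply a pair of flow indices, that a descriptor is kept when its required bandwidth is at or below the bandwidth threshold, and that its content information must be at or above the content information threshold.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record: header-level flow information plus a known category."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    category: str


def pair_constraints(flows):
    """Return index pairs (i, j) of same-category flows that match any claim condition.

    Assumption: a constraint is represented as a plain pair of flow indices.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(flows), 2):
        if a.category != b.category:
            continue  # constraints only relate flows of the same known category
        # Different source IP addresses, same destination IP address.
        same_dst_diff_src = a.dst_ip == b.dst_ip and a.src_ip != b.src_ip
        # Same source IP address and same source port.
        same_src_and_port = a.src_ip == b.src_ip and a.src_port == b.src_port
        # Same source IP address but otherwise different flow information.
        same_src_diff_info = a.src_ip == b.src_ip and a != b
        if same_dst_diff_src or same_src_and_port or same_src_diff_info:
            pairs.append((i, j))
    return pairs


def select_descriptors(bandwidth, content_info, max_bandwidth, min_content_info):
    """Keep descriptors that pass both threshold comparisons.

    Assumption: lower required bandwidth and higher content information are better.
    Both dictionaries map a descriptor name to its calculated value.
    """
    return [
        name
        for name in bandwidth
        if bandwidth[name] <= max_bandwidth and content_info[name] >= min_content_info
    ]
```

As a usage illustration, `pair_constraints` could be called on a list of `Flow` records whose categories are already known, and `select_descriptors` on two dictionaries keyed by made-up descriptor names such as `"mean_pkt_size"`. The results together would form the clustering model input information that the claim says is transmitted to another network node.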