Title INFORMATION IDENTIFICATION METHOD, PROGRAM PRODUCT, AND SYSTEM
Abstract When supervised (learning) data and test data are prepared, each is recorded with time information attached. The method clusters the learning data in a target class and clusters the test data in the same target class. For each identified subclass, a probability density is then calculated for the learning data over time intervals having various time points and widths, and for the test data over time intervals of various widths in the latest time period. The ratio between the probability density obtained during learning and the probability density obtained during testing is taken as a relative frequency for each subclass in each time interval. Input whose relative frequency increases markedly in a statistical sense is detected as an anomaly.
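The relative-frequency step described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names subclass_density and relative_frequency, the (time, subclass) event layout, the use of a simple in-window share as the "probability density", the test-over-training direction of the ratio, and the eps smoothing term that avoids division by zero.

```python
from collections import Counter

def subclass_density(events, start, end):
    """Share of events in [start, end) that fall in each subclass.

    events: iterable of (time, subclass) pairs (hypothetical layout).
    """
    counts = Counter(sub for t, sub in events if start <= t < end)
    total = sum(counts.values())
    return {sub: n / total for sub, n in counts.items()} if total else {}

def relative_frequency(train_events, test_events, train_window, test_window, eps=1e-6):
    """Per-subclass ratio of test density to training density.

    eps is an assumed smoothing constant so that a subclass unseen in
    training yields a very large ratio instead of a division error.
    """
    p_train = subclass_density(train_events, *train_window)
    p_test = subclass_density(test_events, *test_window)
    return {sub: p / (p_train.get(sub, 0.0) + eps) for sub, p in p_test.items()}

# Training window: subclass "a" dominates; test window: "b" grows sharply.
train = [(t, "a") for t in range(8)] + [(8, "b"), (9, "b")]
test = [(10 + i, "a") for i in range(5)] + [(15 + i, "b") for i in range(5)]
ratios = relative_frequency(train, test, (0, 10), (10, 20))
for sub, r in sorted(ratios.items()):
    print(sub, round(r, 3))
```

A subclass whose ratio is well above 1 is over-represented in the latest interval relative to training. Repeating the calculation over several window widths, as the abstract describes, would mean calling relative_frequency once per candidate window.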
Publication No. US2014180980(A1) Publication Date 2014.06.26
Application No. US201214234747 Filing Date 2012.04.26
Applicant Hido Shohei;Tatsubori Michiaki Inventor Hido Shohei;Tatsubori Michiaki
Classification G06N99/00;G06F21/55 Main Classification G06N99/00
Agency Agent
Main Claim 1. A computer implemented information identification method for detecting an attack carried out using irregular data against a classifier that is configured by means of supervised machine learning, the method comprising the steps of: preparing a plurality of pieces of training data each including feature data, a label, and time; configuring the classifier by using the plurality of pieces of training data; configuring a sub-classifier by using the plurality of pieces of training data while classifying data in classes obtained through classification by the classifier, into subclasses; preparing a plurality of pieces of test data each including feature data, a label, and time; classifying the plurality of pieces of test data by using the classifier; classifying the plurality of pieces of test data that have been classified, into subclasses by using the sub-classifier; calculating statistical data representing a relative frequency of the plurality of pieces of test data with respect to the plurality of pieces of training data, the statistical data being calculated for each identical set of the subclasses in a time window having a predetermined width for the time; and warning of a possibility of occurrence of the attack carried out using the irregular data, in response to a value of the statistical data exceeding a predetermined threshold.
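The sequence of steps in the claim can be sketched end to end in Python. This is only an illustration under stated assumptions, not the claimed implementation: a nearest-centroid rule on a single numeric feature stands in for the classifier, fixed-width binning stands in for the sub-classifier, and the names fit_centroids, classify, subclassify, shares, and warn_on_attack, along with the additive smoothing constant alpha, are all hypothetical.

```python
from collections import Counter, defaultdict

def fit_centroids(train):
    """Configure a stand-in classifier: mean feature value per label.

    train: list of (feature, label, time) triples.
    """
    sums, counts = defaultdict(float), Counter()
    for x, label, _ in train:
        sums[label] += x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in counts}

def classify(centroids, x):
    """Assign the label whose centroid is nearest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def subclassify(label, x, bin_width=1.0):
    """Stand-in sub-classifier: split a class into fixed-width feature bins."""
    return (label, int(x // bin_width))

def shares(counts, keys, alpha):
    """Smoothed share of each subclass (alpha is an assumed additive prior)."""
    total = sum(counts.values()) + alpha * len(keys)
    return {k: (counts[k] + alpha) / total for k in keys}

def warn_on_attack(train, test, window, threshold, alpha=1.0):
    """Return subclasses whose test/training relative frequency exceeds threshold."""
    centroids = fit_centroids(train)
    start, end = window
    train_counts = Counter(
        subclassify(classify(centroids, x), x) for x, _, _ in train)
    test_counts = Counter(
        subclassify(classify(centroids, x), x)
        for x, _, t in test if start <= t < end)
    keys = set(train_counts) | set(test_counts)
    p_train = shares(train_counts, keys, alpha)
    p_test = shares(test_counts, keys, alpha)
    return sorted(k for k in keys if p_test[k] / p_train[k] > threshold)

# Training data: class "ok" near 0-2, class "bad" near 10-12.
train = [(0.1 * i, "ok", i) for i in range(20)]
train += [(10 + 0.1 * i, "bad", i) for i in range(20)]
# Test window: a burst of inputs at 4.5, a region rarely seen in training.
test = [(4.5, "ok", 100 + i) for i in range(8)]
test += [(0.5, "ok", 100 + i) for i in range(2)]
for sub in warn_on_attack(train, test, window=(100, 110), threshold=5.0):
    print("possible attack in subclass", sub)
```

In this toy data the burst at 4.5 is still classified as "ok" but lands in a subclass that was empty during training, so its relative frequency in the window rises far above the threshold while the other subclasses stay below it.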
Address Kanagawa-ken JP