Abstract |
The inventive system detects exploits (e.g., malware, malicious software, computer viruses) by using probabilistic techniques to build models of normal file types. A file may be modelled as a series of segments, each segment being a sequence of integer values. The structure governing each segment's contents is inferred by a machine learning system applying a Hidden Markov Model or a Factorial Hidden Markov Model. The system is trained on a corpus of normal files, and a heuristic is used to construct predetermined models of normal file types. A received file is examined to determine whether it corresponds to any of the predetermined models of normal file types, e.g., by determining the probability that the received file is of a given file type. If the received file does not correspond to any of the predetermined models, it is flagged as a potential exploit. The system is applicable to intrusion detection, virus filtering, and virus scanning. Because it does not rely on an exploit having a known signature, the system can detect previously unseen exploits.
|
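The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: it uses a plain Hidden Markov Model rather than a Factorial one, treats the whole file as a single segment of byte values, and uses hand-set parameters where the abstract describes training on a corpus of normal files. The names `make_emission`, `forward_log_likelihood`, `is_potential_exploit`, and `MODELS`, the toy "text-like" model, and the threshold of -6.0 are all assumptions introduced for illustration.

```python
import math


def make_emission(favoured, mass=0.99):
    """Emission distribution over byte values 0..255.

    Puts `mass` uniformly on the favoured bytes and spreads the remainder
    over all other bytes, so no byte has zero probability.
    """
    favoured = set(favoured)
    n_other = 256 - len(favoured)
    return [
        mass / len(favoured) if b in favoured else (1.0 - mass) / n_other
        for b in range(256)
    ]


def forward_log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | model) for a discrete HMM."""
    if not seq:
        return 0.0
    n = len(start)
    log_like = 0.0
    alpha = [start[s] * emit[s][seq[0]] for s in range(n)]
    for t in range(len(seq)):
        if t > 0:
            # Propagate state probabilities one step, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][seq[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescale to avoid underflow; the log of each scale sums to log P.
        alpha = [a / scale for a in alpha]
        log_like += math.log(scale)
    return log_like


# Hypothetical, hand-set model of a "text-like" file type (not trained).
# State 0 favours letters and spaces; state 1 favours digits and newlines.
LETTERS = b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ "
DIGITS = b"0123456789\n"
MODELS = {
    "text-like": (
        [0.9, 0.1],                    # start probabilities
        [[0.9, 0.1], [0.3, 0.7]],      # transition probabilities
        [make_emission(LETTERS), make_emission(DIGITS)],
    ),
}


def is_potential_exploit(data, models, threshold=-6.0):
    """Flag `data` if no model reaches the per-byte log-likelihood threshold.

    The threshold is an illustrative assumption; a uniform distribution over
    bytes would score log(1/256), roughly -5.5 per byte.
    """
    best = max(
        forward_log_likelihood(data, start, trans, emit) / max(len(data), 1)
        for start, trans, emit in models.values()
    )
    return best < threshold


print(is_potential_exploit(b"hello world 123\n", MODELS))
print(is_potential_exploit(bytes([0x90]) * 16, MODELS))
```

Normalising the log-likelihood by length keeps the threshold comparable across files of different sizes. In a fuller treatment the parameters would be estimated from the corpus of normal files, with one model per file type, and a file would be flagged only if it scored poorly under all of them.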