Abstract |
A method for detecting malicious HTTP redirections. The method includes obtaining, based on a single client IP address, HTTP flows triggered by visiting a website, extracting a sequence of URLs where a downstream URL is extracted from a child HTTP request that is triggered by a parent HTTP request containing an immediate upstream URL, analyzing the URL sequence to generate a statistical feature, and classifying, based on the statistical feature, the HTTP flows as containing at least one malicious HTTP redirection triggered by visiting the website. |
Main claim |
1. A method for detecting malicious HTTP redirections in a network, comprising:
obtaining, from the network and based on a single client IP address, one or more HTTP flows triggered by a client device visiting a website, wherein the one or more HTTP flows comprise a first sequence of HTTP request/response pairs, the first sequence of HTTP request/response pairs including a first sequence of universal resource locators (URLs);
constructing a per-user tree using the one or more HTTP flows, the per-user tree including nodes corresponding to URLs, including the first sequence of URLs, wherein the per-user tree includes an edge from a parent node to a child node if a request for a URL corresponding to the child node is triggered from the URL corresponding to the parent node, wherein each edge of the per-user tree is annotated by: 1) a URL type assigned to the URL corresponding to the child node and 2) a time that elapses between HTTP requests in the parent node and child node, wherein the per-user tree includes multiple paths, the multiple paths corresponding to both benign requests and malicious paths;
extracting, from the first sequence and using a first pre-determined algorithm, a second sequence of URLs comprising an upstream URL and a downstream URL adjacent to each other in the second sequence, wherein the downstream URL is extracted from a child HTTP request that is subsequent to a parent HTTP request comprising the upstream URL, wherein extracting the second sequence of URLs comprises:
selecting, from the first sequence, the parent HTTP request and the child HTTP request that are generated by the client device;
selecting, from the first sequence, a parent HTTP response received by the client device, wherein the parent HTTP response is generated by a server device identified by the upstream URL;
detecting that the parent HTTP response comprises the downstream URL, wherein the child HTTP request is generated by the client device based on the parent HTTP response; and
including, in response to the detecting, the upstream URL and the downstream URL in the second sequence of URLs;
updating the per-user tree to include paths corresponding to the extracted second sequence of URLs;
analyzing, by a processor of a computer system using a second pre-determined algorithm, the second sequence of URLs to generate a statistical feature of URLs based at least on the upstream URL and the downstream URL, the statistical feature being stored in a statistical feature vector; and
classifying, based on the statistical feature of URLs, the first sequence of HTTP request/response pairs as comprising at least one malicious HTTP redirection triggered by visiting the website, wherein classifying includes updating the per-user tree to reflect that the path on the per-user tree corresponding to the at least one malicious HTTP redirection is malicious. |
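
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the claim leaves the "first" and "second pre-determined algorithm" unspecified, so everything here is an assumption made for illustration. The names `Pair`, `Edge`, `url_type`, `build_edges`, `chains`, `features`, and `is_suspicious` are hypothetical, the extension-based URL typing and the host-count threshold are placeholders, and the `*.example` URLs are invented sample data. The sketch links a child request to a parent when the parent's response contains the child's URL, annotates each edge with a URL type and the elapsed time, enumerates root-to-leaf paths, and computes a small feature vector per path.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Pair:
    """One HTTP request/response pair seen for a single client IP (hypothetical record)."""
    url: str             # URL requested by the client
    time: float          # request timestamp, in seconds
    response_urls: list  # URLs found in the response (e.g. Location header, embedded links)


@dataclass
class Edge:
    parent: str
    child: str
    url_type: str  # type assigned to the child URL
    delay: float   # seconds between the parent and child requests


def url_type(url):
    # Placeholder typing by file extension; the claim does not define the URL types.
    path = urlparse(url).path.lower()
    if path.endswith(".js"):
        return "script"
    if path.endswith((".png", ".jpg", ".gif")):
        return "image"
    return "page"


def build_edges(pairs):
    """Add parent -> child when an earlier response contains the child's URL."""
    ordered = sorted(pairs, key=lambda p: p.time)
    edges = []
    for i, child in enumerate(ordered):
        # Prefer the most recent earlier response that mentions the child URL.
        for parent in reversed(ordered[:i]):
            if child.url in parent.response_urls:
                edges.append(Edge(parent.url, child.url,
                                  url_type(child.url), child.time - parent.time))
                break
    return edges


def chains(edges):
    """Enumerate root-to-leaf URL paths of the per-user tree."""
    children, has_parent = {}, set()
    for e in edges:
        children.setdefault(e.parent, []).append(e.child)
        has_parent.add(e.child)
    out = []

    def walk(node, path):
        path = path + [node]
        kids = [k for k in children.get(node, []) if k not in path]  # guard against cycles
        if not kids:
            out.append(path)
        for k in kids:
            walk(k, path)

    for root in [p for p in children if p not in has_parent]:
        walk(root, [])
    return out


def features(chain, edges):
    """Illustrative statistical feature vector for one URL sequence."""
    delay = {(e.parent, e.child): e.delay for e in edges}
    hosts = [urlparse(u).hostname for u in chain]
    hops = list(zip(chain, chain[1:]))
    return {
        "length": len(chain),
        "distinct_hosts": len(set(hosts)),
        "cross_domain_hops": sum(1 for a, b in zip(hosts, hosts[1:]) if a != b),
        "mean_delay": sum(delay[h] for h in hops) / len(hops) if hops else 0.0,
    }


def is_suspicious(f, max_hosts=3):
    # Placeholder rule standing in for a trained classifier (assumption).
    return f["distinct_hosts"] >= max_hosts


# Invented sample traffic for one client visiting one website.
pairs = [
    Pair("http://site.example/", 0.0,
         ["http://site.example/logo.png", "http://ads.example/r.js"]),
    Pair("http://site.example/logo.png", 0.1, []),
    Pair("http://ads.example/r.js", 0.2, ["http://hop.example/go"]),
    Pair("http://hop.example/go", 0.3, ["http://bad.example/exploit"]),
    Pair("http://bad.example/exploit", 0.5, []),
]
edges = build_edges(pairs)
paths = chains(edges)
for path in paths:
    f = features(path, edges)
    print(path[-1], f, "suspicious" if is_suspicious(f) else "benign")
```

In this sample, the tree has two paths from the visited page: one to a same-host image and one that hops across several hosts. A real system would presumably replace the threshold rule with a classifier trained on labeled feature vectors, and would mark the corresponding path in the tree as malicious once classified.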