Title of invention Determining string similarity using syntactic edit distance
Abstract Examples relate to determining string similarity using syntactic edit distance. In one example, a computing device may: receive domain name system (DNS) packets that were sent by a client device, each DNS packet specifying a domain name; generate, for each domain name, a syntax string by replacing each character of the domain name with one of a plurality of metacharacters, each metacharacter representing a category of characters that is different from each other category of characters represented by each other metacharacter; determine, for each domain name, a syntactic edit distance between the domain name and each other domain name, the syntactic edit distance between domain names being determined based on syntax strings of the corresponding domain names; cluster each domain name into one of a plurality of clusters based on the syntactic edit distances; and identify the client device as a potential source of malicious software based on the clusters.
Publication number US9479524(B1) Publication date 2016.10.25
Application number US201514679757 Filing date 2015.04.06
Applicant Trend Micro Incorporated Inventor Hagen Josiah
Classification H04L29/06;H04L29/12;G06F11/00;G06F12/14;G06F12/16;G08B23/00 Main classification H04L29/06
Agency Okamoto & Benedicto LLP Agent Okamoto & Benedicto LLP
Principal claim 1. A non-transitory machine-readable storage medium encoded with instructions executable by a hardware processor of a computing device for determining string similarity, the machine-readable storage medium comprising instructions to cause the hardware processor to: receive domain name system (DNS) query packets that were sent by a particular client computing device, each DNS query packet specifying a query domain name; generate, for each query domain name included in the received DNS query packets, a syntax string by replacing each character of the query domain name with one of a plurality of metacharacters, each of the plurality of metacharacters representing a category of characters that is different from each other category of characters represented by each other metacharacter in the plurality of metacharacters; determine, for each query domain name included in the received DNS query packets, a syntactic edit distance between the query domain name and each other query domain name included in the received DNS packets, the syntactic edit distance between query domain names being determined based on syntax strings of the corresponding domain names; cluster each query domain name included in the received DNS query packets into one of a plurality of clusters based on the syntactic edit distances; and identify the particular client computing device as a potential source of malicious software based on the plurality of clusters.
Address Tokyo JP
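
Illustrative sketch (not part of the patent record). The steps recited in the abstract and claim 1 can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the record does not specify the character categories, the edit-distance variant, the clustering method, or the rule for flagging a client. The sketch assumes four categories (letters "A", digits "N", the dot ".", anything else "S"), plain Levenshtein distance over the syntax strings, a greedy threshold clustering, and a hypothetical rule that flags a client when any cluster reaches `min_size` members. All function names and parameters are hypothetical.

```python
from typing import List


def syntax_string(domain: str) -> str:
    """Replace each character with a metacharacter for its category.

    Categories are an assumption: letters -> "A", digits -> "N",
    "." stays ".", everything else -> "S".
    """
    out = []
    for ch in domain:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("N")
        elif ch == ".":
            out.append(".")
        else:
            out.append("S")
    return "".join(out)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def syntactic_edit_distance(d1: str, d2: str) -> int:
    """Edit distance computed on the syntax strings, not the raw names."""
    return edit_distance(syntax_string(d1), syntax_string(d2))


def cluster_domains(domains: List[str], max_distance: int) -> List[List[str]]:
    """Greedy clustering (an assumption): a domain joins the first cluster
    whose first member is within max_distance, else starts a new cluster."""
    clusters: List[List[str]] = []
    for d in domains:
        for c in clusters:
            if syntactic_edit_distance(d, c[0]) <= max_distance:
                c.append(d)
                break
        else:
            clusters.append([d])
    return clusters


def is_suspicious(clusters: List[List[str]], min_size: int) -> bool:
    """Hypothetical flagging rule: any cluster with at least min_size names."""
    return any(len(c) >= min_size for c in clusters)


if __name__ == "__main__":
    # Made-up query names for one client, for illustration only.
    queries = ["abc123.com", "xyz789.com", "qrs456.com", "www.example.org"]
    clusters = cluster_domains(queries, max_distance=1)
    print(clusters)
    print(is_suspicious(clusters, min_size=3))
```

Comparing syntax strings rather than raw names means that "abc123.com" and "xyz789.com" are at distance 0 even though they share almost no characters, which is the property that lets structurally similar names fall into the same cluster.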