Title of invention Methods and systems for analyzing data related to possible online fraud
Abstract Various embodiments of the invention provide methods, systems and software for analyzing data. In particular embodiments, for example, a set of data about a web site may be analyzed to determine whether the web site is likely to be illegitimate (e.g., to be involved in a fraudulent scheme, such as a phishing scheme, the sale of gray market goods, etc.). In an exemplary embodiment, a set of data may be divided into a plurality of components (each of which, in some cases, may be considered a separate data set). Merely by way of example, a set of data may comprise data gathered from a plurality of data sources, and/or each component may comprise data gathered from one of the plurality of data sources. As another example, a set of data may comprise a document with a plurality of sections, and each component may comprise one of the plurality of sections. Those skilled in the art will appreciate that the analysis of one component may comprise certain tests and/or evaluations, and that the analysis of another component may comprise different tests and/or evaluations. In other cases, the analysis of each component may comprise similar tests and/or evaluations. The variety of tests and/or evaluations generally will be implementation specific.
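The component-wise analysis described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the header/body split, the function names, and the three tests are all hypothetical choices made for the example, since the abstract states that the actual tests are implementation specific.

```python
import re
from typing import Callable, Dict, List

Test = Callable[[str], bool]


def split_into_components(document: str) -> Dict[str, str]:
    """Divide a document into sections (assumed here: header and body,
    separated by the first blank line)."""
    header, _, body = document.partition("\n\n")
    return {"header": header, "body": body}


# Hypothetical tests; each returns True when the component looks suspicious.
def missing_sender(text: str) -> bool:
    return "from:" not in text.lower()


def contains_urgent_language(text: str) -> bool:
    return "urgent" in text.lower()


def has_raw_ip_link(text: str) -> bool:
    return re.search(r"https?://\d{1,3}(?:\.\d{1,3}){3}", text) is not None


# Different components get different tests, as the abstract allows.
TESTS_BY_COMPONENT: Dict[str, List[Test]] = {
    "header": [missing_sender],
    "body": [contains_urgent_language, has_raw_ip_link],
}


def analyze(document: str) -> Dict[str, List[str]]:
    """Return, per component, the names of the tests that fired."""
    components = split_into_components(document)
    return {
        name: [t.__name__ for t in TESTS_BY_COMPONENT.get(name, []) if t(text)]
        for name, text in components.items()
    }
```

Keeping the tests in a per-component table makes it straightforward to give each component its own set of evaluations, or to reuse the same set across components.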
Publication number US9356947(B2) Publication date 2016.05.31
Application number US201514680918 Filing date 2015.04.07
Applicant THOMSON REUTERS GLOBAL RESOURCES Inventors Shraim Ihab;Shull Mark
Classification G06F7/00;G06F17/00;G06F11/00;G06F12/14;G06F7/04;H04L29/06;H04L12/58 Main classification G06F7/00
Agency Faegre Baker Daniels LLP Agent Faegre Baker Daniels LLP
Main claim 1. A method, comprising: periodically collecting, with a computer, from a plurality of different sources, a set of data related to a web site, wherein the set of data comprises a web page on the web site; dividing, with the computer, the set of data into a plurality of components, the plurality of components including at least an Internet Protocol (“IP”) address associated with the web site and a body field comprising text; analyzing at least two of the components, wherein analyzing the at least two of the plurality of components comprises: analyzing the text of the body field to identify at least one of a pre-defined blacklisted term and a brand name; identifying a domain of the web site; identifying an Internet Protocol (“IP”) block assigned to the domain; and comparing the IP address of the web site with the IP block assigned to the domain; assigning at least one score to one or more of the analyzed components; and categorizing the web site as a possibly fraudulent web site, based at least in part on the at least one score.
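The analysis, scoring, and categorization steps of the claim might look like the minimal sketch below. It rests on several assumptions that are not in the claim: the term lists, the domain-to-IP-block table, the score weights, and the threshold are all hypothetical, and the periodic collection step is omitted. The claim itself does not specify how scores are computed or combined.

```python
import ipaddress
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical reference data; a real system would load these from elsewhere.
BLACKLISTED_TERMS = {"verify your account", "confirm your password"}
BRAND_NAMES = {"examplebank"}
DOMAIN_IP_BLOCKS = {"examplebank.example": "192.0.2.0/24"}


@dataclass
class Components:
    """Components obtained by dividing the collected data set."""
    url: str
    ip: str
    body: str


def score_body(body: str) -> int:
    """One point per blacklisted term or brand name found in the body text."""
    lowered = body.lower()
    return sum(term in lowered for term in BLACKLISTED_TERMS | BRAND_NAMES)


def score_ip(url: str, ip: str) -> int:
    """Compare the site's IP address with the IP block assigned to its domain.

    Returns 2 (an assumed weight) when the address falls outside the block,
    and 0 when it falls inside or when no block is known for the domain.
    """
    domain = urlparse(url).hostname or ""
    block = DOMAIN_IP_BLOCKS.get(domain)
    if block is None:
        return 0
    inside = ipaddress.ip_address(ip) in ipaddress.ip_network(block)
    return 0 if inside else 2


def is_possibly_fraudulent(c: Components, threshold: int = 3) -> bool:
    """Categorize the site based on the summed component scores."""
    return score_body(c.body) + score_ip(c.url, c.ip) >= threshold
```

For instance, a page that mentions a brand name, asks the visitor to verify an account, and is served from an address outside the domain's assigned block would cross the assumed threshold, while the same brand mention on a page served from inside the block would not.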
Address Baar CH