Title of Invention Keyword Frequency Analysis System
Abstract According to embodiments of the present disclosure, a keyword frequency analysis system stores a plurality of sets of records. Each set of records may be associated with a dimension and may comprise a first keyword and a second keyword. The system may also receive the plurality of sets of records, determine a frequency of the first keyword in each set of records and determine a frequency of the second keyword in each set of records. The system may further determine an expected frequency of the first keyword in a first set of records associated with a first dimension, based on the frequency of the first keyword and the frequency of the second keyword. The system also compares the frequency of the first keyword and the expected frequency and, based on the comparison, determines whether the first keyword is either overrepresented or underrepresented in the first set of records.
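The abstract does not say how the expected frequency is computed; the claim leaves it to a method selected in the request. As one possible illustration only, the sketch below counts keyword frequencies per set of records and derives an expected frequency under an independence assumption (row total times column total divided by grand total, as in a contingency table). The function names, the whitespace tokenization, and the sample data are all hypothetical and are not taken from the patent.

```python
from collections import Counter


def keyword_frequencies(record_sets, keywords):
    """Count occurrences of each keyword in each set of records.

    record_sets maps a dimension name to a list of text records.
    Tokenization by lowercase whitespace split is an assumption.
    """
    freqs = {}
    for dimension, records in record_sets.items():
        counts = Counter()
        for record in records:
            tokens = record.lower().split()
            for kw in keywords:
                counts[kw] += tokens.count(kw)
        freqs[dimension] = counts
    return freqs


def expected_frequency(freqs, keyword, dimension):
    """Expected count of `keyword` in `dimension` if keywords were
    distributed independently of dimension (hypothetical method)."""
    row_total = sum(freqs[dimension].values())              # all keywords, this dimension
    col_total = sum(f[keyword] for f in freqs.values())     # this keyword, all dimensions
    grand_total = sum(sum(f.values()) for f in freqs.values())
    if grand_total == 0:
        return 0.0
    return row_total * col_total / grand_total


# Hypothetical sample data: two dimensions, two keywords.
record_sets = {
    "east": ["fee fee late", "fee"],
    "west": ["late late", "late fee"],
}
freqs = keyword_frequencies(record_sets, ["fee", "late"])
observed = freqs["east"]["fee"]
expected = expected_frequency(freqs, "fee", "east")
```

With this sample data the observed frequency of "fee" in the "east" set can be compared directly with the expected value, which is the comparison the abstract describes.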
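Application Publication No. US2016154797(A1) Application Publication Date 2016.06.02
Application No. US201414557067 Filing Date 2014.12.01
Applicant Bank of America Corporation Inventors Kern Daniel C.;Maher Pasha M.;Sun Adam Z.
Classification G06F17/30 Main Classification G06F17/30
Agency Agent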
Principal Claim 1. A keyword frequency analysis system, comprising: a memory operable to store a plurality of sets of records, wherein each set of records is associated with a dimension and comprises a first keyword and a second keyword; an interface operable to: receive the plurality of sets of records; receive a request to determine whether the first keyword is a selected one of overrepresented or underrepresented in a first set of records, the request comprising a selection of a method to calculate an expected frequency of the first keyword; one or more hardware processors communicatively coupled to the interface and the memory and operable to: determine a frequency of the first keyword in each set of records; determine a frequency of the second keyword in each set of records; determine the method to calculate the expected frequency of the first keyword based on the selection of the method in the request to determine whether the first keyword is a selected one of overrepresented or underrepresented in the first set of records; calculate the expected frequency of the first keyword in the first set of records associated with a first dimension using the method, the expected frequency of the first keyword being a number of times the first keyword should appear in the first set of records, the expected frequency of the first keyword based on the frequency of the first keyword and the frequency of the second keyword; determine a difference between the frequency of the first keyword and the expected frequency; compare the difference to a threshold, the threshold indicating whether the difference is large enough to determine one of a selected group of overrepresentation or underrepresentation; in response to determining that the difference is not greater than the first threshold, communicate a message indicating that the first keyword is not overrepresented and not underrepresented; in response to determining that the difference is greater than the first threshold: determine whether the frequency of the first keyword is less than the expected frequency; in response to determining that the frequency of the first keyword is less than the expected frequency: determine that the first keyword is underrepresented in the first set of records; determine a degree of underrepresentation by comparing the threshold and the difference between the frequency of the first keyword and the expected frequency; translate the frequency of the first keyword, the frequency of the second keyword, the degree of underrepresentation, and the expected frequency into the keyword report, the keyword report comprising the expected frequency, the degree of underrepresentation, and the determination that the first keyword is underrepresented in the first set of records; and communicate the keyword report for display.
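The decision steps recited in the claim (difference, threshold comparison, under- or overrepresentation, degree) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the difference is taken as an absolute value, the degree is taken as the amount by which the difference exceeds the threshold, and the function name and report keys are hypothetical. The claim itself only recites the underrepresented branch; the overrepresented branch here is added by symmetry.

```python
def classify_keyword(observed, expected, threshold):
    """Build a simple keyword report from an observed frequency,
    an expected frequency, and a threshold.

    Assumptions (not specified in the claim): absolute difference,
    and degree = difference - threshold.
    """
    difference = abs(observed - expected)

    # Difference not greater than the threshold: neither over- nor underrepresented.
    if difference <= threshold:
        return {
            "status": "neither",
            "observed": observed,
            "expected": expected,
            "degree": 0.0,
        }

    # Difference exceeds the threshold: the direction decides the label.
    status = "underrepresented" if observed < expected else "overrepresented"
    return {
        "status": status,
        "observed": observed,
        "expected": expected,
        "degree": difference - threshold,
    }


# Hypothetical values for illustration.
report = classify_keyword(observed=1, expected=4.0, threshold=2.0)
```

Returning a plain dictionary keeps the sketch independent of any display layer; the claim's "keyword report" would carry at least the expected frequency, the degree, and the determination.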
Address Charlotte NC US