Independent claim |
1. An apparatus for extracting queries from webpages, the apparatus comprising:
a receiver configured to receive:
a plurality of queries, wherein each query, included in the plurality, is input into a distinct search box located on a distinct public-facing webpage;
content associated with each public-facing webpage;
identifying information associated with each inputter of each distinct query;
a processor configured to:
analyze each query in the plurality of queries based on a first parameter, the first parameter being query length;
discard a query, from the plurality of queries, said query which falls below a predetermined threshold with respect to fulfillment of the first parameter;
analyze each query in the plurality of queries based on a second parameter, the second parameter being the magnitude of predetermined terminology included in each query's language, said predetermined terminology stored on a computer-readable memory;
discard a query, from the plurality of queries, said query which falls below a predetermined threshold with respect to fulfillment of the second parameter;
analyze each query in the plurality of queries based on a third parameter, the third parameter being a grammatical relationship of query terms to one another;
discard a query, from the plurality of queries, said query which falls below a predetermined threshold with respect to fulfillment of the third parameter;
analyze each query in the plurality of queries based on a fourth parameter, the fourth parameter being identifying information of an inputter of each query;
discard a query, from the plurality of queries, said query which falls below a predetermined threshold with respect to fulfillment of the fourth parameter;
analyze each query in the plurality of queries based on a fifth parameter, the fifth parameter being the content of the public-facing webpage associated with each query;
discard a query, from the plurality of queries, said query which falls below a predetermined threshold with respect to fulfillment of the fifth parameter. |
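
The five analyze-and-discard steps recited above can be read as a chain of threshold filters. The following is a minimal illustrative sketch of that reading, not the claimed implementation. Everything in it is an assumption made for illustration: the `Query` fields, the `PREDETERMINED_TERMS` set, each scoring function (in particular the crude adjacent-repetition check standing in for grammatical analysis, and the precomputed `inputter_score`), the threshold values, and the choice to apply the stages sequentially to the surviving queries.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Query:
    text: str              # query as typed into the search box
    page_content: str      # content of the public-facing webpage it came from
    inputter_score: float  # hypothetical score derived from inputter identifying information


# Hypothetical stand-in for the predetermined terminology stored in memory.
PREDETERMINED_TERMS = {"mortgage", "loan", "account"}


def length_score(q: Query) -> float:
    # First parameter: query length, measured here in words (an assumption).
    return float(len(q.text.split()))


def terminology_score(q: Query) -> float:
    # Second parameter: how many query words are predetermined terms.
    return float(sum(1 for w in q.text.lower().split() if w in PREDETERMINED_TERMS))


def grammar_score(q: Query) -> float:
    # Third parameter: crude placeholder for grammatical analysis.
    # Scores 0.0 if any word is immediately repeated, else 1.0.
    words = q.text.lower().split()
    repeated = any(a == b for a, b in zip(words, words[1:]))
    return 0.0 if repeated else 1.0


def inputter_score(q: Query) -> float:
    # Fourth parameter: assumed to be precomputed from identifying information.
    return q.inputter_score


def page_content_score(q: Query) -> float:
    # Fifth parameter: number of distinct query words found in the page content.
    page_words = set(q.page_content.lower().split())
    return float(sum(1 for w in set(q.text.lower().split()) if w in page_words))


# (scoring function, predetermined threshold) pairs; thresholds are illustrative.
STAGES: List[Tuple[Callable[[Query], float], float]] = [
    (length_score, 2.0),
    (terminology_score, 1.0),
    (grammar_score, 1.0),
    (inputter_score, 0.5),
    (page_content_score, 1.0),
]


def filter_queries(queries: List[Query]) -> List[Query]:
    """Discard every query that falls below the threshold of any stage."""
    kept = list(queries)
    for score, threshold in STAGES:
        kept = [q for q in kept if score(q) >= threshold]
    return kept
```

Under these assumptions, calling `filter_queries` on a list of `Query` objects returns only those that meet all five thresholds. Keeping each stage as a (function, threshold) pair makes it straightforward to swap in a real grammatical analyzer or a different inputter-scoring rule.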