Abstract |
An architecture that employs a modeling technique based on language modeling to estimate the probability that a document matches the user need expressed in a query. The modeling technique rests on the data mining result that different portions of a document (e.g., body, title, URL, anchor text, user queries) use different styles of human language. Thus, the language model for each portion can be adapted individually to match the language of the query. Because the approach is based on adaptation, the framework also provides a natural means to progressively revise the model as user data are collected. The different language styles in a document can be recognized and adapted individually. Background language models are also employed; they offer a fallback when a document has incomplete fields of data, and they can utilize a topical or semantic hierarchy of the knowledge domain.
|
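The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the method of the disclosure itself: it assumes unigram models, a fixed linear mixture over document fields, and simple linear interpolation with a per-field background model. The names `FIELD_WEIGHTS`, `BACKGROUND_WEIGHT`, `unigram_model`, `field_probability`, and `query_log_likelihood` are hypothetical, and the weight values are arbitrary placeholders.

```python
import math
from collections import Counter

# Hypothetical mixture weights per document field (assumed values, sum to 1).
FIELD_WEIGHTS = {"title": 0.4, "body": 0.3, "anchor": 0.3}
# Assumed interpolation weight given to the background model.
BACKGROUND_WEIGHT = 0.2


def unigram_model(tokens):
    """Maximum-likelihood unigram probabilities for a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def field_probability(term, field_tokens, background):
    """P(term | field), interpolated with that field's background model.

    An empty or missing field falls back entirely to the background model.
    """
    bg = background.get(term, 0.0)
    if not field_tokens:
        return bg
    ml = unigram_model(field_tokens).get(term, 0.0)
    return (1 - BACKGROUND_WEIGHT) * ml + BACKGROUND_WEIGHT * bg


def query_log_likelihood(query_tokens, document, backgrounds):
    """log P(query | document) under a weighted mixture of per-field models."""
    score = 0.0
    for term in query_tokens:
        p = sum(
            weight * field_probability(term, document.get(field, []), backgrounds[field])
            for field, weight in FIELD_WEIGHTS.items()
        )
        if p == 0.0:
            # Term unseen in every field model and every background model.
            return float("-inf")
        score += math.log(p)
    return score
```

In this sketch a document is a dictionary mapping field names to token lists, and `backgrounds` maps each field name to a unigram model built from a larger collection. A document that lacks a field (for example, no anchor text) still receives a finite score, because that field's contribution comes from the background model alone. Revising the model as user data are collected would correspond to re-estimating the weights or the background models, which this sketch leaves out.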