Huawei proposes topic-based personalized web search ranking that incorporates user interests and semantic matching
Although most web search engines don’t publish usage statistics, the software marketing company HubSpot estimates that industry leader Google now handles 5.6 billion web searches per day. Search results are largely produced by ranking systems built on pre-trained language models that learn the semantic matching between query and document terms, an approach that ignores valuable personalization signals such as user clicks.
In the new paper TPRM: A Topic-based Personalized Ranking Model for Web Search, a research team from Huawei Technologies’ Artificial Intelligence Applications Research Center pushes ranking systems further, proposing a topic-based personalized ranking model (TPRM) that combines pre-trained term representations from language models with user profiles built by a topic model to produce a more relevant final ranking list.
The team summarizes its main contributions as follows:
- Integrate a topic model-based user profile with a pre-trained language model to produce a new personalized ranking system that outperforms state-of-the-art ad hoc ranking models and personalized ranking models on a real-world AOL dataset.
- Demonstrate the interpretability of topic-based user profiles by providing a way to view user preferences when selecting documents for a given query.
- Analyze the effects of user interests and of the semantic matching learned from queries and documents, revealing their positive contributions to TPRM performance.
The proposed TPRM architecture comprises four modules: (1) user interest modeling, which applies a topic model to the documents clicked in the search history to model user interest; (2) user-document interest matching via a kernel pooling approach; (3) query-document semantic matching, which uses the large language model BERT to compute the semantic match between a query and candidate documents; and (4) personalized ranking, which uses the user-document and query-document matching vectors to compute a personalized relevance score.
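To make modules (2) and (4) more concrete, here is a minimal Python sketch, not the authors' implementation. It assumes KNRM-style kernel pooling with Gaussian (RBF) kernels over cosine similarities, and a simple linear layer that combines the user-document and query-document features. The function names, kernel centers (`mus`), kernel width (`sigma`) and weights are all hypothetical choices for illustration; the paper's actual configuration may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def kernel_pooling(user_vecs, doc_vecs, mus=(-0.5, 0.0, 0.5, 1.0), sigma=0.1):
    """KNRM-style kernel pooling (assumed form): one feature per RBF kernel.

    user_vecs: vectors describing the user's interests (hypothetical input).
    doc_vecs:  vectors describing the candidate document (hypothetical input).
    """
    features = []
    for mu in mus:
        total = 0.0
        for u in user_vecs:
            # Soft count of document vectors whose similarity to u is near mu.
            soft_count = sum(
                math.exp(-((cosine(u, d) - mu) ** 2) / (2 * sigma ** 2))
                for d in doc_vecs
            )
            # Clamp before the log so empty kernels do not produce -inf.
            total += math.log(max(soft_count, 1e-10))
        features.append(total)
    return features


def personalized_score(user_doc_features, query_doc_features, weights, bias=0.0):
    """Linear combination of both match vectors (assumed ranking layer)."""
    features = list(user_doc_features) + list(query_doc_features)
    return bias + sum(w * f for w, f in zip(weights, features))
```

In this sketch, the query-document features would come from a BERT-based matcher; they are passed in as a plain list here so the example stays self-contained. In a trained system the weights would be learned rather than set by hand.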
For their empirical study, the team compared TPRM with the BM25 algorithm and with state-of-the-art ad hoc and personalized ranking models such as KNRM, Conv-KNRM, CEDR-KNRM, P-Click and SLTB. Experiments were conducted on the real-world AOL search log, using mean average precision (MAP), mean reciprocal rank (MRR), P@1 (precision at the first position) and A.Clk (average click position) as metrics to assess the quality of the generated ranking lists.
The results show that most personalized ranking models outperform ad hoc ranking models, indicating that user profiles improve the performance of ranking systems. In the experiments, TPRM significantly outperformed the TPRM-semantic model, confirming the benefit of user profiles built with the proposed topic model approach.
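As a rough illustration of how these four metrics are typically computed from click logs, the following Python sketch scores ranked lists of binary click labels (1 = clicked, 0 = not clicked). It assumes that clicks serve as relevance labels and that A.Clk is averaged per query over the positions of clicked documents; the function names are hypothetical and the paper's exact evaluation protocol may differ.

```python
def average_precision(clicks):
    """Average precision for one ranked list of 0/1 click labels."""
    hits, total = 0, 0.0
    for rank, clicked in enumerate(clicks, start=1):
        if clicked:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def reciprocal_rank(clicks):
    """1 / rank of the first clicked document, or 0 if none was clicked."""
    for rank, clicked in enumerate(clicks, start=1):
        if clicked:
            return 1.0 / rank
    return 0.0


def precision_at_1(clicks):
    """1.0 if the top-ranked document was clicked, else 0.0."""
    return 1.0 if clicks and clicks[0] else 0.0


def average_click_position(clicks):
    """Mean rank of clicked documents (lower is better); None if no clicks."""
    positions = [rank for rank, c in enumerate(clicks, start=1) if c]
    return sum(positions) / len(positions) if positions else None


def evaluate(ranked_lists):
    """Average each metric over all queries (one ranked list per query)."""
    n = len(ranked_lists)
    click_positions = [average_click_position(c) for c in ranked_lists]
    click_positions = [p for p in click_positions if p is not None]
    return {
        "MAP": sum(average_precision(c) for c in ranked_lists) / n,
        "MRR": sum(reciprocal_rank(c) for c in ranked_lists) / n,
        "P@1": sum(precision_at_1(c) for c in ranked_lists) / n,
        "A.Clk": (sum(click_positions) / len(click_positions)
                  if click_positions else None),
    }


# Two toy queries: clicks at ranks 2 and 4, then a single click at rank 1.
results = evaluate([[0, 1, 0, 1], [1, 0, 0, 0]])
```

For MAP, MRR and P@1, higher values indicate better rankings, while a lower A.Clk means clicked documents appear nearer the top of the list.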
The paper TPRM: A Topic-based Personalized Ranking Model for Web Search is on arXiv.
Author: Hecate Il | Editor: Michael Sarazen, Zhang Channel