Optimising News Recommenders Beyond Accuracy

This is a neural news recommender designed to go beyond accuracy: it promotes both relevance and diversity in order to reduce filter bubbles. Built as a final MSc project, it integrates recent architectural innovations and evaluation frameworks drawn from both Information Retrieval and NLP research. The work synthesises insights from over 40 highly-cited papers on neural recommenders, beyond-accuracy metrics, and news recommendation systems.

Project Outline

This project develops a bi-encoder neural news recommender that uses multi-head self-attention and is trained on the MIND dataset. The model incorporates differentiable loss terms for Intra-List Diversity (ILD) and Top‑L Surprisal, so that its recommendations are not only personally relevant but also diverse and novel. This addresses well-documented concerns about filter bubbles and echo chambers in personalised news feeds.
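As an illustration of how diversity and novelty terms can enter a loss, the sketch below relaxes both metrics by weighting candidates with a softmax over their relevance scores, so that each term varies smoothly with the scores. This is one common relaxation and is an assumption here, not necessarily the formulation used in the project. The function names, the `popularity` input (the fraction of users who clicked each item), and the `lambda_*` weights are all hypothetical. It is written in plain Python for readability; inside a model the same expressions would be written with tensor operations so that gradients can flow.

```python
import math


def softmax(scores):
    """Convert raw relevance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def soft_ild(scores, embeddings):
    """Score-weighted sum of pairwise distances over distinct candidates."""
    p = softmax(scores)
    n = len(p)
    return sum(
        p[i] * p[j] * cosine_distance(embeddings[i], embeddings[j])
        for i in range(n) for j in range(n) if i != j
    )


def soft_surprisal(scores, popularity):
    """Score-weighted mean of -log2(popularity), with popularity in (0, 1]."""
    p = softmax(scores)
    return sum(w * -math.log2(pop) for w, pop in zip(p, popularity))


def regularised_loss(relevance_loss, scores, embeddings, popularity,
                     lambda_ild=0.1, lambda_surprisal=0.1):
    """Subtract both rewards so that minimising the loss raises them."""
    return (relevance_loss
            - lambda_ild * soft_ild(scores, embeddings)
            - lambda_surprisal * soft_surprisal(scores, popularity))
```

Setting both weights to zero recovers the plain relevance objective, which makes it straightforward to compare an accuracy-only baseline against the regularised variants.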

The approach is rooted in a literature review of the state-of-the-art in:

Neural recommender architectures (Transformers, cross-encoder and bi-encoder designs),
Beyond-accuracy metrics from Information Retrieval (including diversity, novelty, coverage, serendipity), and
Advanced NLP techniques for news understanding and user profile construction.

The project was completed in an intensive 6-week sprint (~60 hrs/week). It has been recognised for its academic quality: the supervising professor and external examiner encouraged its submission as a paper to an academic journal or conference in the field, an invitation extended to only a select few top projects.

Key Features

Attention-based bi-encoder architecture implemented in Keras 3
Regularisation terms optimising for diversity (ILD) and novelty (Surprisal) in the loss objective
Custom evaluation framework, combining standard ranking metrics with beyond-accuracy objectives
Fully reproducible experiments with robust pipeline design
Grounded in current academic research and supported by an extensive reference base (~40 papers)
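To make the idea of a combined evaluation concrete, the following minimal sketch scores a single impression with one ranking metric and two beyond-accuracy metrics. It assumes nDCG@k as the ranking metric, mean pairwise cosine distance as ILD@k, and mean -log2(popularity) as Surprisal@k. The function names and input layout are hypothetical and are not taken from the project's code.

```python
import math
from itertools import combinations


def ndcg_at_k(ranked_labels, k):
    """nDCG@k for click labels already sorted by predicted score."""
    dcg = sum(rel / math.log2(rank + 2)
              for rank, rel in enumerate(ranked_labels[:k]))
    ideal = sorted(ranked_labels, reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def ild_at_k(ranked_embeddings, k):
    """Mean pairwise cosine distance among the top-k items."""
    pairs = list(combinations(ranked_embeddings[:k], 2))
    if not pairs:
        return 0.0  # fewer than two items: no pairwise diversity to measure
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


def surprisal_at_k(ranked_popularity, k):
    """Mean -log2(popularity) among the top-k items."""
    top = ranked_popularity[:k]
    return sum(-math.log2(p) for p in top) / len(top)


def evaluate_impression(scores, labels, embeddings, popularity, k=10):
    """Rank candidates by score, then report all three metrics together."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {
        "ndcg": ndcg_at_k([labels[i] for i in order], k),
        "ild": ild_at_k([embeddings[i] for i in order], k),
        "surprisal": surprisal_at_k([popularity[i] for i in order], k),
    }
```

Reporting the three values side by side for each impression, and averaging them over the test set, makes any trade-off between accuracy and the beyond-accuracy objectives visible in a single table.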

Publication

📄 Read report (ResearchGate)

💻 Read code (GitHub)

Media

A five-minute video presenting the goals, methods, and outcomes of the project: