
2025 publications

Obscured but Not Erased: Evaluating Nationality Bias in LLMs via Name-Based Bias Benchmarks

In this work, we examine how large language models (LLMs) can exhibit latent biases towards specific nationalities even when explicit demographic markers are not present. We introduce a novel name-based benchmarking approach derived from the Bias Benchmark for QA (BBQ) dataset to investigate the impact of substituting explicit nationality labels with culturally indicative names, a scenario more reflective of real-world LLM applications. Our approach examines how this substitution affects both bias magnitude and accuracy across a spectrum of LLMs from industry leaders such as OpenAI, Google, and Anthropic. Our findings highlight the stubborn resilience of biases in LLMs, underscoring their profound implications for the development and deployment of AI systems in diverse, global contexts.
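
As a rough illustration of the substitution idea (a Python sketch, not the benchmark's released code; the template, names, and question below are invented placeholders), the same BBQ-style context can be rendered once with explicit nationality labels and once with culturally indicative first names:

    # Hypothetical name mapping and BBQ-style item, for illustration only.
    NAMES = {"Nigerian": "Chinedu", "German": "Lukas"}

    item = {
        "template": "{a} and {b} were waiting to speak to the branch manager.",
        "question": "Who was rude to the staff?",
        "nationalities": ("Nigerian", "German"),
    }

    def explicit_prompt(item):
        a, b = item["nationalities"]
        context = item["template"].format(a=f"A {a} customer", b=f"a {b} customer")
        return f"{context}\n{item['question']}"

    def name_based_prompt(item, names=NAMES):
        a, b = item["nationalities"]
        context = item["template"].format(a=names[a], b=names[b])
        return f"{context}\n{item['question']}"

    print(explicit_prompt(item))
    print(name_based_prompt(item))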

 

Read the paper here

Evaluating the Sensitivity of LLMs to Prior Context

As large language models (LLMs) are increasingly deployed in multi-turn dialogue and other sustained interactive scenarios, it is essential to understand how extended context affects their performance. Popular benchmarks, focusing primarily on single-turn question answering (QA) tasks, fail to capture the effects of multi-turn exchanges. To address this gap, we introduce a novel set of benchmarks that systematically vary the volume and nature of prior context. We evaluate multiple conventional LLMs, including GPT, Claude, and Gemini, across these benchmarks to measure their sensitivity to contextual variations. Our findings reveal that LLM performance on multiple-choice questions can degrade dramatically in multi-turn interactions, with performance drops as large as 73% for certain models. Even highly capable models such as GPT-4o exhibit up to a 32% decrease in accuracy. Notably, the relative performance of larger versus smaller models is not always predictable. Moreover, the strategic placement of the task description within the context can substantially mitigate performance drops, improving the accuracy by as much as a factor of 3.5. These findings underscore the need for robust strategies to design, evaluate, and mitigate context-related sensitivity in LLMs.
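
The sketch below shows, under our own simplifying assumptions rather than the benchmark's actual format, how a prompt might be assembled with a controllable volume of prior turns and a configurable position for the task description; the strings and the build_prompt helper are illustrative only:

    def build_prompt(question, options, prior_turns, task_first=True):
        # The task description can sit before the prior context or next to the question.
        task = "Answer the following multiple-choice question with a single letter."
        context = "\n".join(prior_turns)  # earlier, possibly unrelated exchanges
        choices = "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(options))
        body = f"{question}\n{choices}"
        if task_first:
            return f"{task}\n\n{context}\n\n{body}"
        return f"{context}\n\n{task}\n{body}"

    # Vary the amount of prior context and the placement of the task description.
    prior = [f"User: message {i}\nAssistant: reply {i}" for i in range(50)]
    print(build_prompt("What is 2 + 2?", ["3", "4", "5"], prior, task_first=False))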

 

Read the paper here

AUTOSUMM: A Comprehensive Framework for LLM-Based Conversation Summarization

We present AUTOSUMM, a large language model (LLM)-based summarization system to generate accurate, privacy-compliant summaries of customer-advisor conversations. The system addresses challenges unique to this domain, including speaker attribution errors, hallucination risks, and short or low-information transcripts. Our architecture integrates dynamic transcript segmentation, thematic coverage tracking, and a domain-specific, multi-layered hallucination detection module that combines syntactic, semantic, and entailment-based checks.
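
As a hedged illustration of the entailment layer only (AUTOSUMM's actual models, prompts, and thresholds are not shown here), a summary sentence could be flagged when no transcript segment entails it; roberta-large-mnli is used purely as an example of a publicly available NLI model:

    from transformers import pipeline

    # A publicly available NLI model, used here only as an example.
    nli = pipeline("text-classification", model="roberta-large-mnli")

    def flag_unsupported(summary_sentences, transcript_segments, threshold=0.5):
        """Flag summary sentences that no transcript segment entails."""
        flagged = []
        for sentence in summary_sentences:
            entail_scores = []
            for segment in transcript_segments:
                result = nli([{"text": segment, "text_pair": sentence}])[0]
                if result["label"] == "ENTAILMENT":
                    entail_scores.append(result["score"])
            if not entail_scores or max(entail_scores) < threshold:
                flagged.append(sentence)
        return flagged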

Accepted to the Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Industry Track)

How Personality Traits Shape LLM Risk-Taking Behaviour

This research investigates the relationship between personality traits and risk-taking behaviour in Large Language Models (LLMs) using Cumulative Prospect Theory and the Big Five personality framework. The study reveals that most examined LLMs function as risk-neutral rational agents while exhibiting higher Conscientiousness and Agreeableness with lower Neuroticism. Interventions targeting Big Five traits, especially Openness, successfully influence risk-propensity in several models. Advanced LLMs demonstrate human-like personality-risk patterns through optimal prompting, while their distilled variants show limitations in cognitive bias transfer. The research identifies Openness as the most significant factor affecting risk-propensity, consistent with human baselines, and highlights both potential and limitations of personality-based interventions in LLM decision-making.
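
For readers unfamiliar with Cumulative Prospect Theory, the snippet below evaluates its standard value and probability-weighting functions using the classic Tversky and Kahneman (1992) human parameter estimates; these are reference values only, not the parameters fitted in the paper, and a single weighting function is used for brevity:

    def value(x, alpha=0.88, beta=0.88, lam=2.25):
        """CPT value function: concave for gains, convex and steeper for losses."""
        return x ** alpha if x >= 0 else -lam * (-x) ** beta

    def weight(p, gamma=0.61):
        """Inverse-S probability weighting function."""
        return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

    # Subjective value of a simple gamble: 50% chance of +100, 50% chance of -100.
    cpt_value = weight(0.5) * value(100) + weight(0.5) * value(-100)
    print(cpt_value)  # negative for a loss-averse agent, despite an expected value of 0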

Accepted to the Findings of the Association for Computational Linguistics: ACL 2025

Application of GraphSAGE in Complex Transaction Networks

We present the practical application of GraphSAGE, an inductive Graph Neural Network framework, to non-bipartite heterogeneous transaction networks within a banking context. We construct a transaction network from anonymised customer and merchant transactions and train a GraphSAGE model to generate node embeddings. Our exploratory work on the embeddings reveals interpretable clusters aligned with geographic and demographic attributes. Additionally, we illustrate their utility in downstream classification tasks by applying them to a money mule detection model, where these embeddings improve the prioritisation of high-risk accounts. Beyond fraud detection, our work highlights GraphSAGE’s adaptability to banking-scale networks, emphasising its inductive capability, scalability, and interpretability. This study provides a blueprint for financial institutions to harness graph machine learning for actionable insights in transactional ecosystems.
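
A minimal GraphSAGE encoder in PyTorch Geometric gives a flavour of how node embeddings are produced; this assumes a simple homogeneous graph with a node feature matrix and an edge index, not the paper's heterogeneous transaction network, features, or training objective:

    import torch
    import torch.nn.functional as F
    from torch_geometric.nn import SAGEConv

    class SAGEEncoder(torch.nn.Module):
        def __init__(self, in_dim, hidden_dim=64, out_dim=32):
            super().__init__()
            self.conv1 = SAGEConv(in_dim, hidden_dim)
            self.conv2 = SAGEConv(hidden_dim, out_dim)

        def forward(self, x, edge_index):
            h = F.relu(self.conv1(x, edge_index))
            return self.conv2(h, edge_index)  # one embedding per node

    # Toy graph: 4 nodes with 8 features each and 3 directed edges.
    x = torch.randn(4, 8)
    edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]], dtype=torch.long)
    embeddings = SAGEEncoder(in_dim=8)(x, edge_index)
    print(embeddings.shape)  # torch.Size([4, 32])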

Accepted to the Proceedings of The Workshop on Graph-based Representations in Pattern Recognition 2025 (GbR 2025)

2024 publications

A Brief Review of Quantum Machine Learning for Financial Services

This review paper examines state-of-the-art algorithms and techniques in quantum machine learning with potential applications in finance. We discuss QML techniques in supervised learning tasks, such as Quantum Variational Classifiers, Quantum Kernel Estimation, and Quantum Neural Networks (QNNs), along with quantum generative AI techniques like Quantum Transformers and Quantum Graph Neural Networks (QGNNs). The financial applications considered include risk management, credit scoring, fraud detection, and stock price prediction. We also provide an overview of the challenges, potential, and limitations of QML, both in these specific areas and more broadly across the field.
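
To make one of these techniques concrete, the following PennyLane snippet sketches a variational quantum classifier circuit: features are angle-encoded, trainable entangling layers act as the model, and a Pauli-Z expectation value provides the decision score. The framework choice, circuit layout, and parameter shapes are illustrative assumptions, not prescriptions from the review:

    import pennylane as qml
    from pennylane import numpy as np

    n_qubits = 2
    dev = qml.device("default.qubit", wires=n_qubits)

    @qml.qnode(dev)
    def classifier(weights, features):
        qml.AngleEmbedding(features, wires=range(n_qubits))           # encode the data
        qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))  # trainable layers
        return qml.expval(qml.PauliZ(0))                              # score in [-1, 1]

    shape = qml.StronglyEntanglingLayers.shape(n_layers=3, n_wires=n_qubits)
    weights = np.random.uniform(size=shape)
    print(classifier(weights, np.array([0.1, 0.4])))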

 

Read the paper here

2023 publications

Agent-based Modelling of Credit Card Promotions

In this work, we develop an agent-based model of the UK credit card market based on the interactions between lenders and customers. We then show how this model can be used as a tool to explore outcomes of zero-interest credit card promotion strategies under different market scenarios.
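
A toy version of such a model (with invented dynamics, not the paper's calibrated agents or market scenarios) might look like the following: lender agents post cards with an APR and a zero-interest promotional window, and customer agents pick the cheapest card for their planned repayment horizon:

    import random

    class Lender:
        def __init__(self, name, apr, promo_months):
            self.name, self.apr, self.promo_months = name, apr, promo_months

        def monthly_rate(self, month):
            return 0.0 if month < self.promo_months else self.apr / 12

    class Customer:
        def __init__(self, balance, horizon):
            self.balance, self.horizon = balance, horizon

        def choose(self, lenders):
            # Pick the card with the lowest interest cost over the planned horizon.
            cost = lambda l: sum(self.balance * l.monthly_rate(m) for m in range(self.horizon))
            return min(lenders, key=cost)

    lenders = [Lender("A", apr=0.25, promo_months=0), Lender("B", apr=0.30, promo_months=12)]
    customers = [Customer(random.uniform(500, 3000), random.randint(3, 24)) for _ in range(1000)]
    share = sum(c.choose(lenders).name == "B" for c in customers) / len(customers)
    print(f"Share choosing the 0%-promotion card: {share:.0%}")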

 

Read the paper here

Conformal Predictions for Longitudinal Data

In this paper, we present our research into uncertainty calculation for multi-dimensional time-series forecasting, setting a new performance benchmark for the field. These uncertainty estimates allow more informative predictions in areas like demand and stock market prediction and boost the trustworthiness of our forecasts.
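
For orientation, the snippet below implements the textbook split-conformal construction, in which residuals on a calibration window yield a quantile that widens point forecasts into prediction intervals; the paper's contribution for longitudinal, multi-dimensional data goes beyond this simple sketch:

    import numpy as np

    def conformal_interval(cal_y, cal_pred, test_pred, alpha=0.1):
        scores = np.abs(cal_y - cal_pred)                     # nonconformity scores
        n = len(scores)
        level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
        q = np.quantile(scores, level, method="higher")
        return test_pred - q, test_pred + q                   # lower and upper bounds

    # Toy usage with synthetic calibration residuals and three new point forecasts.
    rng = np.random.default_rng(0)
    cal_y, cal_pred = rng.normal(size=200), rng.normal(size=200)
    lower, upper = conformal_interval(cal_y, cal_pred, np.array([0.1, 0.5, -0.2]))
    print(lower, upper)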

 

Read the paper here

Modelling customer lifetime-value in the retail banking industry

This research improves the accuracy of customer lifetime value estimation in retail banking through a machine-learning framework. The improved insight from this framework allows accurate identification of high-value customers, informing marketing, relationship management, and business growth strategies.

 

Read the paper here

2022 publications

Offline Deep Reinforcement Learning for Dynamic Pricing of Consumer Credit

In this paper, we train an offline reinforcement learning agent with a static dataset to determine a better loan interest pricing policy. This approach can be applied to other pricing tasks and means risky experimentation in a live environment can be avoided.
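
As a simplified stand-in for the deep offline agent described in the paper, the following fitted Q-iteration sketch learns a Q-function from a fixed dataset of (state, offered rate, reward, next state) tuples over a small grid of candidate rates; the data, features, and regressor here are all synthetic placeholders:

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    def fitted_q_iteration(states, actions, rewards, next_states, action_grid,
                           gamma=0.95, iters=10):
        q = None
        for _ in range(iters):
            if q is None:
                targets = rewards
            else:
                # Bootstrapped target: reward + gamma * max_a' Q(next_state, a').
                next_qs = np.column_stack([
                    q.predict(np.column_stack([next_states, np.full(len(next_states), a)]))
                    for a in action_grid])
                targets = rewards + gamma * next_qs.max(axis=1)
            q = GradientBoostingRegressor().fit(np.column_stack([states, actions]), targets)
        return q

    # Toy static dataset: one-dimensional customer state, four candidate rates.
    rng = np.random.default_rng(1)
    states, next_states = rng.normal(size=(500, 1)), rng.normal(size=(500, 1))
    actions = rng.choice([0.05, 0.10, 0.15, 0.20], size=500)
    rewards = rng.normal(size=500)
    q = fitted_q_iteration(states, actions, rewards, next_states, [0.05, 0.10, 0.15, 0.20])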

 

Read the paper here 

An Introduction to Machine Unlearning

This paper presents a comprehensive review of a wide range of machine unlearning algorithms, which remove the influence of individual observations from models while minimising computational costs. The review standardises definitions and evaluation methods, and addresses implementation challenges for machine unlearning.
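
One family of exact-unlearning methods covered by such reviews is SISA-style sharding, sketched below under our own simplifications (an illustration, not the paper's taxonomy): data is split into shards, one model is trained per shard, and deleting a record only requires retraining the shard that held it:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    class ShardedEnsemble:
        def __init__(self, n_shards=4):
            self.n_shards = n_shards

        def fit(self, X, y):
            self.X, self.y = X, y
            self.shards = [list(s) for s in np.array_split(np.arange(len(X)), self.n_shards)]
            self.models = [LogisticRegression().fit(X[s], y[s]) for s in self.shards]
            return self

        def unlearn(self, index):
            # Retrain only the shard that held the deleted record.
            for i, shard in enumerate(self.shards):
                if index in shard:
                    shard.remove(index)
                    self.models[i] = LogisticRegression().fit(self.X[shard], self.y[shard])
            return self

        def predict(self, X):
            votes = np.stack([m.predict(X) for m in self.models])
            return np.round(votes.mean(axis=0))  # majority vote over shard models

    rng = np.random.default_rng(2)
    X, y = rng.normal(size=(200, 5)), rng.integers(0, 2, size=200)
    model = ShardedEnsemble().fit(X, y).unlearn(index=17)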

 

Read the paper here

The material published on this page is for information purposes only and should not be regarded as providing any specific advice, or used by consumers to make financial decisions. Terms and conditions apply to any products or services mentioned.
