I'm a PhD researcher working on language model interpretability and representation engineering. At the moment I'm interested in methods for extracting representations of safety-related concepts, and in how we can use those representations to control model behaviour. More broadly, I believe interpretability has incredibly diverse applications, from helping us understand and fix known limitations of models to improving explainability and safety.
Outside of my main PhD research, I'm also interested in understanding the limitations of alignment algorithms and designing evals for advanced AI. I was recently part of the team that created the
LINGOLY reasoning benchmark, which we will be presenting as an oral at NeurIPS in Vancouver.
I work in the Oxford Internet Institute's language modelling group, supervised by Dr Adam Mahdi, and am also a member of
OxNLP.