Welcome to the personal website of:
Matthew Farrugia-Roberts
Research Assistant (AI Alignment and Reward Hacking)
David Krueger’s AI Safety Lab
University of Cambridge
Nouns: Matthew, Matt, he/him, they/them (singular)—all fine.
Contact: ‘matthew’ at this domain.
Website perpetually under construction. This page includes my bio, announcements, research interests, publications, teaching, coursework, and affiliations.
I am a student, researcher, and teacher from Melbourne, Australia. I’m now at the University of Cambridge working on understanding goal misgeneralisation with Usman Anwar and David Krueger. I also collaborate on developmental interpretability research with Timaeus and the Melbourne Deep Learning Group, and I help run a virtual AI safety reading group at metauni.
Previously, I completed a Master of Computer Science degree at the University of Melbourne, with a thesis on lossless compression of neural networks, supervised by Daniel Murfet. During the degree, I completed a virtual research internship at the Center for Human-Compatible AI, studying reward learning theory with Adam Gleave and Joar Skalse. I also completed an exchange semester at ETH Zürich.
Before that, I worked as a tutor and lecturer at the University of Melbourne, for classes on programming, algorithmics, artificial intelligence, theoretical computer science, networks, and operating systems. Shortly beforehand, I completed a Bachelor of Science, taking these classes and others myself.
Announcements:
- I’ll be at NeurIPS 2023 in New Orleans. Get in touch if you would like to meet with me at the conference.
- I am currently seeking opportunities for doctoral study in the UK or Europe. So far I have applied to the ETH AI Center, Oxford, and Cambridge, and I am planning to write a standalone post about my experience.
- While in Cambridge, I attended choral evensong at most of the colleges. I am planning to write a standalone blog post about my experiences.
- I attended the 2023 Developmental Interpretability Conference at Wytham Abbey, Oxford.
- I led a virtual workshop on using TPU virtual machines to accelerate machine learning research, for people without experience using a VM. A recording will be available soon.
- The first of two papers based on the results in my Master’s thesis was accepted for poster presentation at NeurIPS 2023: “Functional equivalence and path connectivity of reducible hyperbolic tangent networks”.
Broad research interests:
- Intelligence, learning, and computation (e.g., agent foundations, bounded/computational rationality, artificial intelligence, cognitive science)
- Technology and society (e.g., existential risks from advanced intelligent systems, political philosophy, history and future of humanity)
On these broad topics, I am still very much a student, with much to learn.
While I’m establishing myself as an academic, I have focussed on some narrower topics:
- AI alignment (reward learning theory, goal misgeneralisation, developmental interpretability)
- Deep learning theory (neural network theory, singular learning theory)
- Computer science education (discrete mathematics, theoretical computer science)
Publications by topic:
Neural network theory:
- Matthew Farrugia-Roberts, 2023, “Computational complexity of detecting proximity to losslessly compressible neural network parameters”. Conference paper under review. Preprint on arXiv.
- Matthew Farrugia-Roberts, 2023, “Functional equivalence and path connectivity of reducible hyperbolic tangent networks”. Conference paper to appear at NeurIPS 2023. Preprint on arXiv.
- Matthew Farrugia-Roberts, 2022, Structural Degeneracy in Neural Networks, Master’s thesis, School of Computing and Information Systems, the University of Melbourne. Available online.
Reward learning theory:
- Joar Skalse*, Matthew Farrugia-Roberts*, Alessandro Abate, Stuart Russell, and Adam Gleave (*equal contribution), 2023, “Invariance in policy optimisation and partial identifiability in reward learning”. Conference paper (poster) presented at ICML 2023. Preprint on arXiv.
Computer science education:
- Matthew Farrugia-Roberts, Bryn Jeffries, and Harald Søndergaard, 2022, “Teaching simple constructive proofs with Haskell programs”. Conference paper: extended abstract presented at TFPIE 2022, full paper published in EPTCS.
- Matthew Farrugia-Roberts, Bryn Jeffries, and Harald Søndergaard, 2022, “Programming to learn: Logic and computation from a programming perspective”. Conference paper presented at ACM ITiCSE 2022.
Teaching in 2023:
- COMP90087 The Ethics of Artificial Intelligence (TA)
Teaching in 2021:
- COMP30024 Artificial Intelligence (co-Head TA)
- COMP30026 Models of Computation (co-Head TA)
- COMP90087 The Ethics of Artificial Intelligence (TA)
Teaching since 2016:
- COMP90087 The Ethics of Artificial Intelligence (2021, 2023: TA)
- COMP30026 Models of Computation (2016: TA, 2017–2020: Head TA, 2021: co-Head TA)
- COMP30024 Artificial Intelligence (2017–2019: Head TA, 2020–2021: co-Head TA)
- COMP90059 Introduction to Python Programming (2018: Lecturer and coordinator)
- COMP20007 Design of Algorithms (2016: TA, 2017: Head TA, 2018: Coordinator)
- COMP10001 Foundations of Computing (2017: TA)
- COMP30023 Computer Systems (2017: TA)
- COMP90038 Algorithms and Complexity (2016: TA)
Master of Computer Science, University of Melbourne, part-time 2019–2022
- Coursework in theoretical computer science and machine learning (coursework portfolio)
- Coursework average mark 98.75%
- Minor thesis project on structural degeneracy in neural networks
- Thesis mark 95.5% (top of year)
- Dean’s Honours List (top 5% of marks across the Faculty of Engineering and Information Technology)
Exchange semester, ETH Zürich, 2020
- Coursework on theoretical computer science, statistical learning theory, network modelling, and neuroscience (coursework portfolio)
- Grade-point average 5.92 / 6.00
Bachelor of Science, University of Melbourne, 2014–2016
- Major in Computing and Software Systems (coursework in computer science and software engineering) plus electives in physics, mathematics, and education
- Average mark 93.04%
- Dean’s Honours List (top percentile of marks in the Faculty of Science, all three years)
- AAII Prize in Computer Science (top of class in the AI subject; I achieved top-of-class marks fairly regularly, but this time it came with an industry award)
- ACS Student Award (best marks across third year computer science classes)
Affiliations:
- Research assistant at David Krueger’s AI Safety Lab, University of Cambridge.
- Independent AI safety researcher, supported by Manifund grant “Introductory Resources for Singular Learning Theory”.
- Visiting research associate at the Melbourne Deep Learning Group.
- Research assistant at the School of Computing and Information Systems, the University of Melbourne.
- Teaching assistant at the Centre for AI and Digital Ethics and the School of Computing and Information Systems, the University of Melbourne.
- Master of Computer Science student at the Melbourne Deep Learning Group and the School of Computing and Information Systems, the University of Melbourne.
- Virtual research intern at the Center for Human-Compatible AI, University of California, Berkeley.
- Virtual research intern at the (then-named) Brain, Mind & Markets Laboratory, the University of Melbourne.
- Casual tutor and lecturer at the School of Computing and Information Systems, the University of Melbourne.
Any views expressed on this website are not intended to represent the views of any of my affiliated institutions.