Welcome to the personal website of:
Matthew Farrugia-Roberts
Research Assistant (AI Alignment and Reward Hacking)
David Krueger’s AI Safety Lab
University of Cambridge
Nouns: Matthew, Matt, he/him, they/them (singular)—all fine.
Contact: ‘matthew’ at this domain.
Website perpetually under construction. This page includes my bio, announcements, research interests, publications, teaching, coursework, and affiliations.
I am a student, researcher, and teacher from Melbourne, Australia. I’m now at the University of Cambridge working on understanding goal misgeneralisation with Usman Anwar and David Krueger. I also collaborate on developmental interpretability research with Timaeus and the Melbourne Deep Learning Group, and I help run a virtual AI safety reading group at metauni.
Previously, I completed a Master of Computer Science degree at the University of Melbourne, with a thesis on lossless compression of neural networks, supervised by Daniel Murfet. During the degree, I completed a virtual research internship at the Center for Human-Compatible AI, studying reward learning theory with Adam Gleave and Joar Skalse. I also completed an exchange semester at ETH Zürich.
Before that, I worked as a tutor and lecturer at the University of Melbourne, for classes on programming, algorithmics, artificial intelligence, theoretical computer science, networks, and operating systems. Shortly beforehand, I completed a Bachelor of Science, taking these classes and others myself.
Announcements:
- I’ll be at NeurIPS 2023 in New Orleans. Get in touch if you would like to meet with me at the conference.
- I am currently seeking opportunities for doctoral study in the UK or Europe. So far I have applied to the ETH AI Center, Oxford, and Cambridge, and I am planning to write a standalone post about my experience.
- While in Cambridge, I attended choral evensong at most of the colleges. I am planning to write a standalone blog post about my experiences.
- I attended the 2023 Developmental Interpretability Conference at Wytham Abbey, Oxford.
- I led a virtual workshop on using TPU virtual machines to accelerate machine learning research, for people without experience using a VM. A recording will be available soon.
- The first of two papers based on the results in my Master’s thesis was accepted for poster presentation at NeurIPS 2023: “Functional equivalence and path connectivity of reducible hyperbolic tangent networks”.
Broad research interests:
- Intelligence, learning, and computation (e.g., agent foundations, bounded/computational rationality, artificial intelligence, cognitive science)
- Technology and society (e.g., existential risks from advanced intelligent systems, political philosophy, history and future of humanity)
On these broad topics, I am still very much a student, with much to learn.
While I’m establishing myself as an academic, I have focussed on some narrower topics:
- AI alignment (reward learning theory, goal misgeneralisation, developmental interpretability)
- Deep learning theory (neural network theory, singular learning theory)
- Computer science education (discrete mathematics, theoretical computer science)
Publications by topic:
Neural network theory:
- Matthew Farrugia-Roberts, 2023, “Computational complexity of detecting proximity to losslessly compressible neural network parameters”. Conference paper under review. Preprint on arXiv.
- Matthew Farrugia-Roberts, 2023, “Functional equivalence and path connectivity of reducible hyperbolic tangent networks”. Conference paper to appear at NeurIPS 2023. Preprint on arXiv.
- Matthew Farrugia-Roberts, 2022, Structural Degeneracy in Neural Networks, Master’s thesis, School of Computing and Information Systems, the University of Melbourne. Available online.
Reward learning theory:
- Joar Skalse*, Matthew Farrugia-Roberts*, Alessandro Abate, Stuart Russell, and Adam Gleave (*equal contribution), 2023, “Invariance in policy optimisation and partial identifiability in reward learning”. Conference paper (poster) presented at ICML 2023. Preprint on arXiv.
Computer science education:
- Matthew Farrugia-Roberts, Bryn Jeffries, and Harald Søndergaard, 2022, “Teaching simple constructive proofs with Haskell programs”. Conference paper: extended abstract presented at TFPIE 2022, full paper published in EPTCS.
- Matthew Farrugia-Roberts, Bryn Jeffries, and Harald Søndergaard, 2022, “Programming to learn: Logic and computation from a programming perspective”. Conference paper presented at ACM ITiCSE 2022.
Teaching in 2023:
- COMP90087 The Ethics of Artificial Intelligence (TA)
Teaching in 2021:
- COMP30024 Artificial Intelligence (co-Head TA)
- COMP30026 Models of Computation (co-Head TA)
- COMP90087 The Ethics of Artificial Intelligence (TA)
Teaching since 2016:
- COMP90087 The Ethics of Artificial Intelligence (2021, 2023: TA)
- COMP30026 Models of Computation (2016: TA, 2017–2020: Head TA, 2021: co-Head TA)
- COMP30024 Artificial Intelligence (2017–2019: Head TA, 2020–2021: co-Head TA)
- COMP90059 Introduction to Python Programming (2018: Lecturer and coordinator)
- COMP20007 Design of Algorithms (2016: TA, 2017: Head TA, 2018: Coordinator)
- COMP10001 Foundations of Computing (2017: TA)
- COMP30023 Computer Systems (2017: TA)
- COMP90038 Algorithms and Complexity (2016: TA)
Master of Computer Science, University of Melbourne, part-time 2019–2022
- Coursework in theoretical computer science and machine learning (coursework portfolio)
- Coursework average mark 98.75%
- Minor thesis project on structural degeneracy in neural networks
- Thesis mark 95.5% (top of year)
- Dean’s Honours List (top 5% of marks across the Faculty of Engineering and Information Technology)
Exchange semester, ETH Zürich, 2020
- Coursework on theoretical computer science, statistical learning theory, network modelling, and neuroscience (coursework portfolio)
- Grade-point average 5.92 / 6.00
Bachelor of Science, University of Melbourne, 2014–2016
- Major in Computing and Software Systems (coursework in computer science and software engineering) plus electives in physics, mathematics, and education
- Average mark 93.04%
- Dean’s Honours List (top percentile of marks in the Faculty of Science, all three years)
- AAII Prize in Computer Science (top of class in the AI subject; I achieved top-of-class marks fairly regularly, but this time it came with an industry award)
- ACS Student Award (best marks across third year computer science classes)
Affiliations:
- Research assistant at David Krueger’s AI Safety Lab, University of Cambridge.
- Independent AI safety researcher, supported by Manifund grant “Introductory Resources for Singular Learning Theory”.
- Visiting research associate at the Melbourne Deep Learning Group.
- Research assistant at the School of Computing and Information Systems, the University of Melbourne.
- Teaching assistant at the Centre for AI and Digital Ethics and the School of Computing and Information Systems, the University of Melbourne.
- Master of Computer Science student at the Melbourne Deep Learning Group and the School of Computing and Information Systems, the University of Melbourne.
- Virtual research intern at the Center for Human-Compatible AI, University of California, Berkeley.
- Virtual research intern at the (then-named) Brain, Mind & Markets Laboratory, the University of Melbourne.
- Casual tutor and lecturer at the School of Computing and Information Systems, the University of Melbourne.
Any views expressed on this website are not intended to represent the views of any of my affiliated institutions.