Information for Prospective PhD Students

About our new lab

Our department (Computer Science and Technology) has made some significant investments in ML recently, like hiring 4 new faculty over the last two years (including myself). Thus we are creating a brand new Machine Learning group jointly with two other members of staff, Neil Lawrence and Carl Henrik Ek. We call it ML@CL where CL stands for Computer Lab, the old (and much cooler) name of our department.
We plan to create a friendly and collaborative research environment. Students would have a supervisor (one of the three of us) but generally everyone should feel like they belong to the same large group, and we won't silo people into different topics based on who their supervisor is.
Our website is not quite ready yet so here's a group photo I took today :D

As you can see from the photo above, right now we're a small-ish group but we'll have new students joining this October, and we're planning to further expand in the years beyond that. I personally expect to make up to 4 offers to students to work with me, and ideally I would like to end up with 2-3 students.
Joining a new lab and a new supervisor has pros and cons, which I'm happy to talk about. On the plus side, initially I'll have more time for each student, because I have fewer students and other duties. Likewise, as a new student, you would contribute more to our group and culture. It's going to be a lot of fun - I hope. Also, the team would gradually grow around you, and you would grow with the team: by the time you graduate, you would have mentored several cohorts of new students.
The lab sits within the amazing ML research environment of Cambridge. Over at the Engineering Department there's an established ML group where I did my PhD. Together with them, we formed an ELLIS Unit which means we're part of a growing network of European ML research labs. ML research takes place elsewhere in the University, and you'll find some of the best scientists in almost any discipline to collaborate with. In addition to this, we also have Microsoft Research, Amazon and Samsung in town, who have ML research teams. Plus, we're not far from London.

Our interests:

Here are some keywords about what the whole group's interests look like:

Neil: Gaussian processes, latent variable models, machine learning for systems, Bayesian inference, AI policy and society. Among other things, he is involved in Data Science Africa and [Gaussian Process Summer Schools](http://gpss.cc/), and coordinates the Accelerate Program for Scientific Discovery.
Carl Henrik: Gaussian processes, variational methods, latent variable models, approximate inference. See his Google Scholar page.
Ferenc: not Gaussian processes :D, principled deep learning, meta-learning, data-efficiency, self-supervised learning, optimisation and generalisation in deep learning, variational methods, information theory, causal inference

A bit of background about me

My name is Ferenc, pronounced roughly as "fair-ants" - if you email or address me, please feel free to use my first name.
I did my PhD ~10 years ago at the Engineering Department in Cambridge. I worked on Bayesian inference, variational methods, Gaussian processes, nonparametric Bayesian methods, kernel methods. My PhD was before the deep learning hype, and we didn't really care about neural networks back then.
After my PhD I worked in industry as a data scientist in a startup, then at a venture capital firm - this is where I picked up a lot of practical tools, and I eventually became interested in deep learning.
One of the most exciting periods in my life was working with the Magic Pony Technology (MPT) team around 2016. It was a startup company which developed new deep learning methods for image and video processing, superresolution and lossy image compression. This was a pretty amazing application area as it inspired some super cool work on Generative Adversarial Networks, Variational Autoencoders, and Convolutional Neural Nets.
Twitter acquired MPT about four years ago, and that's where I have spent the time since. I worked on a variety of applied ML projects: first computer vision, then recommender systems; we then started a new team on fairness, accountability and transparency, and most recently I led a research team on data-efficient deep learning. I worked with many awesome colleagues at Twitter.

I also have a research blog inFERENCe that you may know. I'm hoping to contribute content there in the future when I have a bit more time to do so.
I hold an honorary position at the Gatsby Unit, where I collaborate with some great researchers like Arthur Gretton, Maneesh Sahani and their students.
I recently joined the Department of Computer Science and Technology in Cambridge, and I'm inviting PhD applications starting October 2021.

My interests and limitations

My interests are between theoretical and practical work, in particular at the intersection of probabilistic methods and deep learning.

I may not be your ideal supervisor if you:

expect or want to work on very theoretical/Maths-y things such as regret bounds or convergence bounds, like the stuff usually published at COLT or ALT. I understand (some of) and appreciate that research, but we're probably not the right environment to support this work. Similarly, I have my limits when it comes to overly mathematical topics around stochastic processes or kernels. I'm fine with both, but if you really fundamentally need to care about Banach spaces or Polish spaces for your thing, then I'm probably not an ideal supervisor. I expect most of our publications will be at ICLR, ICML and NeurIPS, maybe AISTATS, UAI, and JMLR.
want to work on Gaussian processes (with the exception of NTK-related theory), Bayesian nonparametrics, kernel methods and similar topics. We can still have a chat if these are your interests, but there are many excellent supervisors in Cambridge who focus on these areas both in our group and beyond. I would like to primarily focus on deep learning these days, mostly to upset people around me who don't like deep learning as much :)
are interested in working mainly on applications of deep learning, or innovating in deep learning mostly at the level of architectures or training hacks. I like top-down work: starting from principles and goals, and deriving algorithms from there. I tend to not get as excited about, and am not very good at, bottom-up work: throwing things together or coming up with wacky ideas. I don't expect my group to come up with papers like DCGAN, dropout, batchnorm or mixup, but I do hope we would produce the likes of f-GAN, Wasserstein GAN or VIB, just to mention a few examples.
want to specialise in computer vision or NLP. While I imagine I'd be collaborating with researchers in that space, and have the odd paper that is specific to CV or NLP, this would not be my main interest. I do not expect to publish much at ACL, EMNLP, CVPR, KDD, RecSys or similar conferences (even though you'll find that I published at these conferences when I worked in industry).

Themes

I will try to organise our work around roughly four long-term, durable themes:

goals and principles of representation learning: How can we formulate the goal of unsupervised representation learning? How can we turn those goals into principles and practical loss functions? How can we understand what existing representation learning methods really do? You'll find I've written lots of posts in this genre, like this, this, this or this.
optimization and generalization: I'm intrigued by all these relatively new insights into why deep learning methods are so successful, and how it is that they generalize so well, when in fact our old intuition tells us they should overfit and fail miserably. Again, you'll find I like to write about this topic, too: this, this or this. I'm also very interested in understanding the utility and behaviour of second-order optimization, especially natural gradients.
causal inference: I've recently (well, it's over two years now) got into causal inference; it had been a blind spot for me before. I now recognise how important it is, and I expect to be doing some work on causality. I'm interested both in how deep learning (and associated techniques) can help us solve difficult causal inference problems, and how causal inference can help us make methods more robust. I've written a post on this which turned into a causal inference tutorial series (part 1, part 2, etc), and I'm particularly intrigued by the technique called Invariant Risk Minimization.
probabilistic foundations: This is going back to my roots as a Bayesian inference researcher. I'm less of a religious Bayesian, but I think there are many aspects of Bayesian theory that can be useful for deep learning. I like to think about exchangeability, and have a line of work based on this paper, which in turn is based on an even older post. I'm also very interested in recent work on cold posteriors as well as generalizations of Bayesian inference. I'm also keen to recast various methods for continual learning, meta-learning, or curiosity-based exploration in a probabilistic/information-theoretic light. See e.g. this or this.
I hope that students will be equally excited about at least one of these themes, but I wouldn't be surprised if eventually they would develop an interest in more/all of them, and collaborate with other students on some fantastic papers.
I'm open to being challenged: if you're smart and motivated, and very excited to work on an area not listed here (e.g. something with reinforcement learning or domain adaptation, or privacy-preserving machine learning, fairness or whatever), I'm more than happy to learn more about the topic and support your research if it's within my capabilities to do that.

Expectations for successful applicants

I expect to have many highly qualified people applying, so without discouraging you, I would like to give a bit of background about what I expect successful candidates to have:

solid Python knowledge, and familiarity with open-source frameworks for automatic differentiation, like PyTorch, TF or JAX. Demonstrated experience of completing a project with Python gives us the best signals. If you have something on GitHub we can look at, that's usually very good. We may ask you to give us samples of your code or may ask some questions about coding to gauge this. Why we need this: I expect students to share code, and it's important that we all speak the same language so we can contribute to a common codebase and joint projects. I recommend that you invest the time to learn Python if you can before starting a PhD anyway; there are tons of good resources online.
some past experience training deep learning models. While this is non-essential (it can be learned, of course), it's super useful if you can demonstrate that you have successfully solved problems with deep learning before. If you have something on GitHub or a writeup, that helps.
enough mathematical background (linear algebra, probability) to understand research papers, or lacking that, a strong theoretical background in physics, control theory or some other relevant field that suggests that you will be quick to pick things up. I expect the candidates we make offers to will be familiar with concepts like KL-divergence, Bayes' theorem, matrix eigenvalues and the Hessian. If you can derive something like the evidence lower bound or the EM algorithm you're probably in a good place. When we talk to you, we might ask some questions to gauge your familiarity with these concepts.
papers? No, you don't need them. There's a lot of discussion about candidates having to have published papers before they apply for a PhD. This is not a requirement. There is however a tendency for candidates nowadays to have published or at least submitted papers under their belt when they apply. But it's often an indication that they come from a group where they had access to other researchers. Context matters a lot. A poor paper is worse than no paper. I will try to be as fair as possible: you do not have to have published ML papers or stuff on arXiv to apply. If you do have samples of essays you've written, we may ask to have a look at those.
participation in summer schools, online courses or internships. These are a plus, but not a requirement.

Application process

Here is the department's page explaining the application process and relevant deadlines.
In Cambridge, there is a separate competition for funding. Applicants selected by supervisors are put forward to compete for various scholarships. You should check out the list referenced from the page above, as well as individual colleges who offer scholarships with different qualifying criteria. Unfortunately, this funding process can be a bit confusing, but we hope to be able to fund everyone we would like to admit. I should point out there are relevant studentships from DeepMind targeting women and members of underrepresented groups. Details on how to apply for these are not yet available.
It's a good thing if you reached out - thanks. It's a good idea to have an informal chat with supervisors before you send in a formal application to see if there's a fit, and to start building a relationship and awareness if there is one. This will help the supervisor by giving them more signals to prioritise applicants. Expect us to ask some technical questions in these discussions to gauge your expertise and knowledge in ML. If you've done some project already, it's likely we'll just ask you about that project and adjacent areas.
I will have initial calls with several students who reached out (probably not with all of them as I just don't have the time). I'll try to get back to everyone about whether I feel there is a match, and if I think you will be successful in the application process based on your profile. After this, I might encourage you to send in an application. If there's a mismatch, I'm happy to introduce you to other supervisors or labs who I think would be better.
We haven't yet decided how we're going to organise the full interview process for shortlisted candidates, but I personally would like you to spend some time with other members of the group, so you get a feel for who we are, and if you like the environment.

What's in your application

Recommendation letters and research statements are a crucial part of the process, and are especially important for securing funding. It's a bit unfair, because many countries don't have a culture around academic recommendation letters, so some supervisors might simply not know how to write a good recommendation letter. If you think you are in this situation, or if you worry about writing your application, I'll be happy to help you by pointing you to resources which explain to your supervisor/referee what and how they should write. It may be a good idea to try and look for a mentor to help you with the practical aspects of the application.
If you have papers or relevant essays or project reports you've written, which you would like to show us, these can give us very useful signal as well, and you have an option to attach these to your application.
As a rule of thumb, things don't have to be too long. Imagine how many of these we have to read through. Perhaps the most important thing is to think about how you structure the documents. Clear section headings, perhaps in some cases bullet points are useful. Simple sentences that give signal. Again, if you're worried, I recommend you try to look for mentors.
Note the formal requirement to have language tests unless English is your native language or you have a Degree taught in English. I expect that the COVID-19 situation might have made it a bit harder to do an IELTS test, so just be conscious of this and be aware of how long it would take.
We look for candidates who are technically qualified, and who have opinions about what they want to work on and why. Your material will have to communicate these two things.

I hope this helps. My email is fh277@cam.ac.uk
