My research interests are mainly centered around statistical and computational aspects (and the interplay of the two) of modern-day machine learning techniques. In particular, here are a few topics which I’m currently thinking about:
- (Spurious) local minima of empirical risk landscapes – When performing empirical risk minimization to estimate a parameter of interest, for non-convex problems we could potentially end up at a non-optimal local minimum, which can be a poor estimator (for example with respect to generalization). How can we guarantee that we avoid these sub-optimal minima, either by showing that they don’t exist at all, or by showing that our choice of algorithm avoids them?
- Optimization algorithms as implicit regularization – Particularly for the complex models we now find ourselves attempting to fit, we’re beginning to understand that our choice of optimization algorithm (such as SGD or Adam) acts as implicit regularization. What can we say statistically about the properties of the estimators we obtain, and what effect does changing the algorithm have?
- Adversarial attacks on machine learning algorithms, links to interpretability and generalization – There are now a lot of examples of (both white box and black box) adversarial attacks on various types of neural networks, which fool our machine learning architecture but not us. For example, when using convolutional neural networks for image classification, we can carefully generate noise which, when added to an image, changes the classification result (potentially in a severe way), while to you and me the image retains its original meaning. How can we explain this lack of robustness, and what can we do about it?
From an applied perspective, I’m interested in the applications of machine learning methods to medical and epidemiological problems. The use of these methods to help improve people’s lives – for example, in the early detection of heart disease and cancer, and in epidemic tracking – is only beginning, and there is great potential for us to do more.
Statistics is only useful if we apply it, and applications in turn help give us theoretical insights, so I’m generally interested in hearing about how these techniques are used in any area (with the exception of advertising).
Pre-prints and other manuscripts
- Davison, A. (2016) New advances in causal inference. Part III Essay. (pdf)