My research interests are mainly centered around statistical and computational aspects (and the interplay of the two) of modern-day machine learning techniques. In particular, here are a few topics which I’m currently thinking about:

  • (Spurious) local minima of empirical risk landscapes – When performing empirical risk minimization to estimate a parameter of interest, for non-convex problems we could potentially end up in a non-optimal local minimum, which may be a poor estimator (for example, with respect to generalization). How can we guarantee that we avoid these sub-optimal minima, either by showing that they don’t exist at all, or that our choice of algorithm avoids them?
  • Optimization algorithms as implicit regularization – Classically, we consider regularization to be the act of adding a penalty term to our optimization problem in order to favour some desired structural property of our estimator, or to improve its statistical performance (e.g. shrinkage methods for improved prediction performance). However, particularly for the complex models we now find ourselves attempting to fit, we’re beginning to understand that our choice of optimization algorithm or model (over)parameterization will implicitly affect the optimum we obtain and the speed at which we attain it. What can we generally say, both from a statistical and an optimization perspective, about the properties of the estimators we obtain from this process?

From an applied perspective, I’m interested in (carefully) applying machine learning methods to medical and epidemiological problems – reading about the uses of data and problems in these areas is what got me into statistics in the first place (even though most of my time is spent on the mathematical side of things). The use of these methods to help improve people’s lives – for example, in the early detection of heart disease and cancer, and in epidemic tracking – is only beginning, and there is great potential for us to do more.

Statistics is only useful provided we apply it – moreover, these applications help give us theoretical insights – and so I’m generally interested in hearing about how these techniques are used in any area (with the exception of advertising).


Pre-prints and other manuscripts

  • Davison, A. (2016) New advances in causal inference. Part III Essay. (pdf)