The no free lunch theorem for scientific research
Yesterday David and I went for pizza after work. As typical for our conversations, we spent quite some time discussing applied statistics and machine learning, and reached our usual conclusion that logistic regression is a wonderful model in so many problems.
However, finding logistic regression or other “simple” methods in research papers can be quite hard, as we tend to look for methodological novelty. As Kristin Lennox nicely summarized, “you don’t get a lot of points for doing really good statistics on really important problems, if these statistics were invented in 1950s”. (In particular, Cynthia Rudin investigated how much applied research goes into the most presitigious machine learning conferences).
This is one of the reasons for the phenomenon which everybody in the field knows too well: from time to time you take a paper claiming state-of-the-art performance (“They are 0.02% better on CIFAR-10 than others! Let’s apply it to my problem”), and then find out that the method requires heavy hyperparameter tuning and hours of playing with the brittleness that makes the method impossible to use in practice. And, what’s even worse, the performance isn’t that different from a simple baseline.
Similarly, there are voices from many statisticians raising the issue that several of the grandiose results, which often involve solving important problems with the state-of-the-art methodology, may be simply invalid.
To summarize, a perfect paper should use novel methodology, aim at solving an important problem, and be correct (which should go without saying). The no free lunch of scientific research says that you can pick two out of three, at most.
This “theorem” is not very serious and is, of course, not universal – there exist great papers, but achieving all three goals in one manuscript is very hard to execute and they are exceptions, not the standard. Additionally, I don’t want to dichotomise here: methodological novelty, problem importance and correctness have many facets and subtleties (and may also be hard to assess upfront!), so it’s better to think about the level of each of these traits desired in a study.
As a first-order approximation, I found myself usually doing research on either novel methodology (illustrated on toy problems, without direct applications) or working on problems, which I find practical and important, but which require standard and well-trusted tools (at least as a starting point and a baseline).
On a more positive note, some novel (as of today) methods will have become standard and well-trusted tools in the coming years, with a lot practical impact to come. And practical problems often lead to improvement on existing methods or asking fundamental questions (see Pasteur’s quadrant). And they usually are much harder to solve, than it seems at the beginning! Let’s finish with a quote from Andrew Gelman: “Whenever I have an applied project, I’m always trying to do the stupidest possible thing, that will allow me to get out of the project alive. Unfortunately, the stupidest possible thing, that could possibly work, always seems to be a little more complicated, than the most complicated thing I already know how to do.”