One of my favorite papers on the scientific method is “Strong Inference” by John R. Platt.[1] In this perspective article, Platt argues that certain fields of science progress more rapidly than others because of their systematic application of the scientific method to tease out complexities and gain knowledge. The method, which he calls “Strong Inference”, is a slight modification of the age-old Baconian inductive inference.

Most of the method is the same - observe a particular phenomenon, develop falsifiable hypotheses, perform experiments, and, based on the results, keep the unfalsified hypotheses and repeat the cycle with newer sub-hypotheses. This follows Baconian inference and Popperian falsification of hypotheses, with a key difference. Platt argues that the reason most fields do not progress rapidly despite applying Baconian inference is that scientists in these fields typically stop after developing one hypothesis. Because they come up with only one hypothesis, which becomes their pet-hypothesis, falsifying it becomes painful as they inevitably get attached to it. This leads to stagnation, both because the hypothesis that should be falsified is not, and because of ego battles with other scientists who manage to falsify it.

Platt’s solution, borrowed from T. C. Chamberlin, is to always develop multiple hypotheses before testing them with experiments. This not only avoids the pet-hypothesis problem, but also fosters collaboration, as multiple groups can go about falsifying any of these hypotheses. The clear language, the lucid argument, and the attractiveness of the method quickly made this one of my favorite papers. I took it as the optimal way to do science, no questions asked. Recently, two articles have made me rethink my stance.

The first one,[2] evocatively titled “The abuses of Popper”, argues that true falsification of a hypothesis is hard. To falsify a particular hypothesis, a specific experiment has to be designed that explicitly tests it. But in reality, it is rarely possible to come up with such a precise experiment - there could always be other reasons why the result didn’t turn out as the hypothesis predicted. As scientists, we try our best to control for the unpredictable aspects of the experiment, and then implicitly assume that the result is not due to problems with the experiment. That is, it is our good-faith assumption that the experiment works as we intended that allows us to falsify our hypothesis at all.

It actually took me a while to truly get this - I guess the blind belief in the scientific method was so ingrained that I didn’t see the implicit assumption used in interpreting our results. For example, if I measure the speed of light and find that it differs when the source is moving at different speeds, does this mean I have falsified the special theory of relativity? Or was it just experimental error?

I am beginning to see the reason for discord among scientists and why they still hold onto their pet-hypotheses - the same good-faith assumptions they grant their own experiments turn into bad-faith assumptions about everyone else’s. I can definitely see the advantage of the Popperian framework: it allows for consistent and methodical progress. But its Achilles heel is the implicit assumption required for falsification. The “strong inference” method fares much better, as generating multiple hypotheses and performing an experiment that falsifies all but one of them weakens the reliance on this assumption.

The article also talks about a darker consequence of the falsification mindset - the lack of moral accountability for the science (since all I am doing is falsifying things) and the use of falsification to deny or cast doubt on climate change. I won’t go into this here, so make sure you read the article. It definitely began to crack my perception of the infallibility of strong inference.

The second article[3] widens these cracks further. It makes several salient points about the importance of constant conversation, debate, and evaluation of the methods used in science. Theoretical physics was one of the fields that wholeheartedly embraced the Popperian framework. Physicists applied it to generate a whole slew of hypotheses to explain phenomena. Because each of these hypotheses was consistent with theory and had not been falsified experimentally, the field treated them all as equally probable. And because experiments are costly in both time and money, only a few hypotheses could be tested, so a pool of untested hypotheses always remained. The author argues that this ruthless application of the Popperian falsifiability criterion has caused a stagnation in theoretical physics, a problem that could have been avoided had there been a continuous conversation about the philosophy of the scientific method being used.

I felt this article shed light on a different problem with strong inference - the cost of doing an experiment. Falsifying multiple hypotheses often means multiple experiments, each of which takes time and money. Often, one has to ask how plausible each hypothesis is: is it worth doing an experiment to falsify it? Popperians and Kuhnians would argue that this is not a question a scientist can answer. The author argues that while one can never give a definitive answer, one can be reasonably certain based on previous knowledge. As with the speed-of-light example, the probability of the experiment falsifying the special theory of relativity is low - not zero, but unlikely given what is known.

In his paper, Platt implicitly acknowledges these problems. Multiple hypotheses and collaboration are proposed as solutions to the single-hypothesis and good-faith-assumption problems. The pitfalls of multiple hypotheses and the cost of an experiment are also hinted at, as in the quote below.

Problems of this complexity, if they can be solved at all, can be solved only by men generating and excluding possibilities with maximum effectiveness, to obtain a high degree of information per unit time - men willing to work a little bit at thinking.

This makes me like the paper even more. Yet, I feel the fog of certainty around “strong inference” slowly lifting. It is definitely a really good tool in our toolbox, but it has significant problems one needs to be wary of. And this awareness might allow us to develop new ways of discovering knowledge.

Footnotes

  1. Download it from here or here

  2. https://aeon.co/essays/how-popperian-falsification-enabled-the-rise-of-neoliberalism

  3. https://blogs.scientificamerican.com/observations/physics-needs-philosophy-philosophy-needs-physics/