Is the Scientific Method Becoming Less . . . Scientific?

In my ongoing search to better understand how we reconcile the creative tension between subjective and objective measures of the world, including our ongoing (and thus far elusive) search for a better way of tracking how people learn, I took note of a recent New Yorker article that cast light on some emerging problems with the ostensible foundation of all objective research: the scientific method.

In the article, author Jonah Lehrer highlights a score of multiyear studies — ranging from the pharmaceutical to the psychological — in which core data changed dramatically over time. Drugs that were once hailed as breakthroughs demonstrated a dramatic decrease in effectiveness. Groundbreaking insights about memory and language ended up not being so replicable after all. And the emergence of a new truth in modern science — the “decline effect” — cast doubt on the purely objective foundation of modern science itself.

Without recounting the article in its entirety, there are several insights that have great relevance to those of us seeking to find a better way of helping children learn:

  • In the scientific community, publication bias has been revealed as a very real danger (in one study, 97% of psychology studies were proving their hypotheses, meaning either they were extraordinarily lucky or only publishing outcomes of successful experiments). The lesson seems clear: if we’re not careful, our well-intentioned search for the answers we seek may lead us to overvalue the data that tell us what we want to hear. In the education community, how does this insight impact our own efforts, which place great emphasis on greater accountability and measurement, and yet do so by glossing over a core issue — the individual learning process — that is notoriously mercurial, nonlinear, and discrete?
  • In the scientific community, a growing chorus of voices is worried about the current obsession with “replicability”, which, as one scientist put it, “distracts from the real problem, which is faulty design.” In the education community, are we doing something similar — is our obsession with replicability leading us to embrace “miracle cures” long before we have even fully diagnosed the problem we are trying to address?
  • In the scientific community, Lehrer writes, the “decline effect” is so gnawing “because it reminds us how difficult it is to prove anything.” If these sorts of challenges are confronting the scientific community, how will we in the education community respond? To what extent are we willing to acknowledge that weights and measures are both important — and insufficient? And to what extent are we willing to admit that when the reports are finished and the PowerPoint presentations conclude, we still have to choose what we believe?
Categories: Assessment, Learning



  1. Posted December 19, 2010 at 9:09 pm | Permalink

    I think that the question Sam raises at the end of the post is an excellent one. Given that science is not a stable source of information, what role should the results of scientific studies play in decision-making about education? I think that one answer is that the opinions of expert practitioners, working together under conditions in which there is real back-and-forth about ideas, should be among the primary institutions in policy design and implementation. Additionally, how well an intervention under consideration exemplifies important values should be considered.

    There is a similar controversy in medicine. Physicians using “evidence based methods” are supposed to be looking at the results of the latest literature when making treatment decisions. As well, the literature they are meant to look at consists of double-blind, randomized clinical trials, or combinations of them. The problem is that these kinds of studies rarely apply to individual cases. First, the most popular statistical methods used to determine whether the results of a randomized clinical trial bear out the hypothesis under test do not apply to individual cases [1]. Second, almost no individual case resembles the conditions under which a trial is conducted. So sometimes, with the information directly before her, the physician is warranted in making a decision that does not accord with what the current literature says to do.

    One useful purpose that evidence-based medicine has served has been to correct for individual biases physicians might have that don’t have to do with what’s before them. For example, a few years ago I wanted to see if I ought to be worried about my weight. Everyone said I looked fine, although the risk statistics indicated that, at my weight, I was highly likely to get some forms of cancer, heart disease, and diabetes. Rotund doctors dismissed the statistics, while those less so did not. I think that in this case the statistics are quite good. If my larger-waisted physicians had been using evidence-based medicine, they would not have been so influenced by their own judgements about themselves (I assume that this is what made the difference).

    Another point I think is important to make, and one that the New Yorker author does not seem to understand, is that even those who believe that science will eventually get us to the truth, or that we are always generally progressing toward it, would not claim that the current results of scientific experiments ought to be believed. “The default position in science is doubt, not certainty” [2]. Certain claims have withstood many, many critical experiments. Certain notions of Einstein’s, for instance, and the claim that all adaptation arises from natural selection, at least in part, are among these. The devil is in the details, however, and any one particular point is likely to be unclear.


    [1] Steven Goodman writes about the adequacy and applicability of the most-used statistical models. “Toward evidence-based medical statistics. 1: The P value fallacy.” Ann Intern Med, 130(12):995–1004, 1999 Jun 15.

    [2] Ian Tattersall, “What’s so special about science.” This is a readable, insightful account of what science is and how it works.

  2. Posted December 20, 2010 at 12:48 pm | Permalink

    Adam! Thanks for sharing, and for providing new insights into this issue. Your experience with rotund/non-rotund doctors is very illustrative, and it will be interesting to see what the next year holds as far as education conversations and our ongoing search for better information that can be used to make decisions. Can we resist the Siren song of Certainty, and stay open to multiple ways of knowing? Time will tell . . .

  3. Posted December 22, 2010 at 12:10 am | Permalink

    Sam, I have shared this passion for the “elusive search” of how people learn, and the concern for its implications for democracies. The clues of the problem are found in the names we give to our concepts. The “scientific method of reasoning” is shortened to the “scientific method,” and finally just to “science.” This is how we have lost sight of what it is that we are even doing. It takes going back to books written by people like Immanuel Kant to understand that what we are seeking is “Epistemology as First Philosophy.” Then we understand that “science” is just a method of reasoning, and “reasoning” is only one of the 8 types of reasons humans provide in all their explanations.

    The Explanation Age is a book (now on Amazon) that describes these 8 reasons, and compares them to the 6 levels of Bloom’s Taxonomy, to expose the cracks in this construct that underlies our educational system of teaching and testing. My quest is to find thinkers who understand that the fundamental challenge to educational reform is replacing Bloom’s Taxonomy, and then work together to make this happen. This is more than a book plug; it is an answer to the “elusive search,” which has taken over 20 years of research. This is about replacing Industrial-Age models which inhibit understanding, innovation, transparency, and organizational success. Keep up the good fight, and I look forward to the possibility of joining forces.

  4. Posted December 30, 2010 at 3:25 pm | Permalink

    Hello Sam.

    Your search made me ask myself what the bar or standard might be. The scientific method might suggest there is data that supports goals and objectives. For example, if a goal or objective is to increase math scores, is there a technique that appears to work better than others? Or, is the goal or objective to improve math scores of North American children above those of Chinese children, specifically?

    It occurs to me that we first need to have a very clear view of the objective. Then we can collaborate and define what we feel is a desired result. The “best” method of learning will likely vary depending on the target group it’s directed at. One size won’t fit all. For example, some people learn better visually whereas others learn better aurally. For some clues, consider how PowerPoint presentations have evolved to suit the ever-growing array of CEOs. Some need pictures, some prefer audio files. Others just want bullet points. But, it’s the same information – just presented and then processed differently, eh.

    The notion around the “decline effect” strikes me as relative to continued change in the broad environment children find themselves in. Sort of like when we train our muscles for certain physical effort. We need to do different exercises to keep muscle fiber breaking and re-bonding to increase strength and stamina. Are kids competing against themselves, or other groups? What is consistent in their environments relative to others?

    Look at standards for obesity. We continue to reduce the criteria for fitness in this country and the result is fatter kids. Along with that come test scores and related numbers that make them appear less intelligent as well.

    Kids are certainly less fit today than prior generations. But, are they less bright? Or, are expectations different?

    So, if we can define what we want, and be specific about that – e.g., better math scores in calculus than those demonstrated in China at the age of eleven – then we’ve established a bar that can rise with results, and we can point to definable results.

    I read the article. I’ve pondered your question(s). I considered the other comments. But, I did not see anyone clearly defining the standard or objective. Once you have that I think you can then sort out an ever-changing process of improvement that is likely heterodoxal in nature (my definition: appreciate the tradition, but keep asking ever more penetrating questions to get to the “truth of the day”).

    More later. Maybe.

    Brian Patrick Cork

  5. Posted January 4, 2011 at 4:00 pm | Permalink

    Thanks for your thoughtful reply, Brian. I think you get to the crux of the matter quickly — people aren’t clear on what the standard should be, and, in the absence of clarity, choose the lowest common denominator — test scores — because it’s the clearest and least complicated. But of course learning and teaching is as complicated as it gets — it’s nonlinear, specific, highly individual, and deeply relational. So there needs to be some acknowledgment of that, coupled with greater clarity about what sorts of visible and invisible standards are worth aspiring for.

  6. Posted January 4, 2011 at 4:02 pm | Permalink

    John! Thanks for sharing the news of your book, which I will order immediately. Sounds like we’re wrestling with similar questions. Here’s to many more conversations in the weeks and months ahead!
