
Peer review is everybody’s favorite punching bag in science these days, and for good reason: As we and others have written, it’s secretive, susceptible to bias, and often appears to fail at keeping scientific publishing rigorous and honest.

But peer review is essential for the smooth operation of the scientific publishing apparatus. Without the imprimatur, however imperfect, of independent scholars, research papers would all in effect be titled “Trust us …”

The problem is, we have scant research into how peer review functions at its job of keeping out bad science. Journals don’t devote sufficient attention to studying the quality of their peer review systems, nor do they make those data available to outside scholars.


That could be changing.

A pair of scholars is calling for a modest moonshot to improve the system, which they call (rightly, we think) a “black box.” Writing in this week’s issue of Science, Carole Lee, a philosopher at the University of Washington, and David Moher, of the Ottawa Hospital in Ontario, Canada, argue that publishers should become much more transparent about their peer review practices.


“Though the vast majority of journals endorse peer review as an approach to ensure trust in the literature, few make their peer review data available to evaluate effectiveness toward achieving concrete measures of quality,” they write. Measures such as consistency in reviews, for example, would be helpful — did most papers get approved with mixed reviews or with flying colors?

“There is too little sound research on journal peer review; this creates a paradox whereby science journals do not apply the rigorous standards they employ in the evaluation of manuscripts to their own peer review practices.”

Lee and Moher propose that publishers spend 1 percent of their budgets on research into the effectiveness of their peer review systems — a number based on what the Human Genome Project spent to investigate the ethical, legal, and social implications of its efforts.

The obvious question here, of course, is: Why would publishers spend money that, at least so far, they haven’t felt the need to shell out? As the saying goes, why buy the cow when the milk’s free?

But Lee and Moher offer a few reasons that publishers ought to find compelling. The first involves fighting off incursions from predatory outfits that promise quality peer review on par with legitimate journals but rarely deliver. Being able to point to data showing superior reviewing would be a boon for non-predatory outfits. One attempt, PRE — or Peer Review Evaluation — has been around for a few years now. (Disclosure: One of us, I.O., was an advisor to PRE before it was acquired by the American Association for the Advancement of Science.)

Similarly, journals are gradually starting to look beyond impact factor as the most important signal of quality. Strong peer review could join emerging metrics like reproducibility and the willingness to share data as indicators that one journal is more reliable than another. We’ve even suggested a Transparency Index.

Until journals and publishers start taking a closer look at their own peer review processes, Lee and Moher write, “inadequately reported research will continue to waste time and resources invested by authors, reviewers, journals, academic institutions, funders, study participants, and readers — and limit the credibility and integrity of science.”

Fortunately, there have been some attempts to pry open the black box of peer review. In a baby step, a group of editors at the British Journal of Surgery created an online forum that allowed manuscripts to be peer-reviewed in the open. In a paper last month describing the experiment, published in PLOS ONE, the editors say the results were mixed. “Open online peer review is feasible in this setting,” they concluded, “but it attracts few reviews, of lower quality than conventional peer reviews.” (However, the comparison may not have been quite fair, as Richard Smith, former editor of the British Medical Journal, wrote in response.)

Still, it’s perfect timing to keep this discussion alive and well: Later this summer the world’s small band of scholars who study peer review will gather in Chicago for the Peer Review Congress, which is held every four years.

Scientists will present their studies on what is and isn’t working about peer review, and novel ways to fix it. But just think how much more they’d have to go off of if journals pried open their black boxes and let some data out.

  • This “black box” was the main reason why I lost interest in publishing my research. You put a lot of work into doing everything you can, and then you enter the Impermeable Fog shrouding the Forest of Reviews, which is often just a game of Russian roulette. For many, but not all. Some, obviously, can publish whatever they want. What demotivated me most was not the basically negative attitude of many of our reviewers’ reports, but seeing that papers with less data, odder conclusions, or a generally worse conception passed the “peer” review process in the very same journal with apparently very little resistance. Or knowing that you pointed out the errors in a paper and provided guidelines on how to fix them, but the authors simply ignored everything you wrote because the other reviewer (for whatever reason) said “publish as is” and the editor went with the latter, so the published paper remains a mess.

    But there is a quick fix to the most pressing problems with the peer review currently applied by most journals: just replace “confidential” with “transparent”. An example: anyone in Sweden can inquire how scientists spend their research money, but when you ask a Swedish editor about his dubious decision to publish an obviously badly done paper, he refers you to “peer review confidentiality”. How can this be? Shouldn’t science hold itself to higher standards than economics or politics when it comes to transparency?
    Even as a reviewer, if you think a decision is fishy, you can’t do anything about it, being bound by “peer review confidentiality”. Of course, you can write a formal comment, but the Impermeable Fog puts the barrier higher for a “reply-to” than for the original paper (for natural reasons).

    As a scientist working between different fields (at the crossroads of molecular phylogenetics and palaeontology, on various organisms: extratropical trees, marine unicellular foraminifers, once a moss), c. 75% of our peer reviewers’ reports in journals enforcing peer review “confidentiality” ranged from pointless (both positively and negatively) to utterly biased, and a few were bluntly unethical. PS: The journals’ impact factors played little role; we got exhaustive, brilliant, critical reviews for papers we submitted to specialised journals with IF ~1, and unbelievably idiotic ones in journals with IF > 5.
    For those submitted to journals employing peer review transparency, it was the complete opposite: one review was toothless (because the reviewer apparently didn’t dare to criticise us), but all the others were what you would expect of properly done reviews (compared to COPE’s guidelines for reviews).
    In the case of our PeerJ papers (e.g., my first there, my last), it is the authors’ call to publish the review process (reports, authors’ response, editor’s comments). It was 100% best peer review practice, and there was no reason not to credit the work of the peers (anonymous or signed) and editors.

    Even though a lot of scientists seem to be unhappy with the status quo, little can be done as long as journals with non-transparent peer review keep getting submissions. And all the high-flying journals (career scientists need impact, and branding is important) stick with peer review confidentiality. I think the best way would be to pressure all journals to make their review processes transparent. Naturally, it’s Don Quixote against the windmills (be the 25th to sign up).

    Fact is, when you are in the science business, you often cannot afford to fight the Fog. And once you are free (out), you don’t bother about it anymore. But one probably should, as difficult as it is.

    Cheers, Guido

  • The following comments about deficiencies of peer review may be more germane to prior posts of the authors regarding the subject. But I post them on the authors’ most recent post on the subject.

    1. Essentially all peer reviewed literature regarding differences in outcome rates suffers from a failure to recognize patterns by which measures of such differences tend to be affected by the prevalence of an outcome, including, but not limited to, the pattern whereby the rarer an outcome the greater tends to be the relative difference in experiencing it and the smaller tends to be the relative difference in avoiding it. See references at bottom.

    2. Commonly peer-reviewed literature will reflect the view that reducing some adverse health outcome should reduce relative demographic differences in rates of experiencing it, as reflected in the many statements over several decades along the lines of “despite declining mortality, relative differences in mortality increased [or persisted].” Exactly the opposite is the case. Reducing the prevalence of an outcome, which generally involves restricting it to those most susceptible to it, tends to increase relative differences in rates of experiencing the outcome, while reducing relative differences in rates of avoiding the outcome.

    3. Virtually no peer-reviewed literature recognizes that it is even possible for the relative difference in a favorable outcome and the relative difference in the corresponding adverse outcome to change in opposite directions as the prevalence of an outcome changes, even though the National Center for Health Statistics recognized more than a decade ago that this would tend to occur systematically. See references 1-3.

    4. Commonly peer-reviewed literature, especially that involving racial/ethnic and socioeconomic differences in cancer outcomes, will discuss relative differences in survival and relative differences in mortality interchangeably (often stating that the research is analyzing the former when in fact it analyzes the latter). Invariably, such analyses fail to recognize that the two relative differences tend to change in opposite directions over time, or, for example, that relative differences in mortality will almost always be greater among the young than the old, while relative differences in survival will almost always be greater among the old than the young. See especially reference 4 (Section A) and reference 5.

    5. Commonly peer-reviewed literature on subgroup effects/interaction/reporting heterogeneity will be premised on the expectation that, absent a subgroup effect, a factor that affects an outcome rate will cause equal proportionate changes in different baseline rates for the outcome. Invariably, such literature fails to recognize (a) the reasons to expect that a factor that affects an outcome rate will tend to cause a larger proportionate change in the outcome for the group with the lower baseline rate for the outcome, while causing a larger proportionate change in the opposite outcome rate for the other group; (b) that if a factor causes equal proportionate changes in different baseline rates for an outcome, it will necessarily cause different proportionate changes in the rates for the opposite outcome. See especially reference 2 at 41-43.

    6. To my knowledge, no peer-reviewed literature discussing explanations for reasons why a factor caused different proportionate changes in different baseline rates for an outcome has shown an awareness of the possibility that the factor would show an opposite pattern of the comparative size of effects on the opposite outcome, much less has discussed the fact that this tends usually to occur or in fact occurred in the particular situation examined. See especially reference 1 at 339-341.

    7. Frequently peer-reviewed literature will discuss changes in a particular measure of difference between outcome rates, without any mention that a different measure will yield an opposite conclusion. That occurs even when the measure that would yield an opposite conclusion is one more commonly used in the circumstances.

    8. Some peer-reviewed literature has discussed that a relative difference and the absolute difference between the rates at which two groups experience an outcome can or did change in opposite directions over time. But no such literature has ever recognized that anytime that happens, the relative difference in the opposite outcome will necessarily have changed in the opposite direction of the first relative difference and the same direction as the absolute difference. See reference 1 at 335-336 and 2 at 14 note 26.

    9. Commonly peer-reviewed literature will use “%” or “percent” when it means percentage points. Failure to distinguish between percent changes and percentage point changes has even led to the situation where observers read studies with essentially the same findings as reaching opposite conclusions. See references 6 and 7.

    10. Substantially more than half the time, when peer-reviewed literature attempts to explain that one rate is X times as high as another rate, it will state that the first rate is X times “higher” than the other. The New England Journal of Medicine is a notable exception. One hopes there are others. See reference 8.

    11. The above points apply only to literature that survived peer review. But if an understanding of the issues addressed above were commonly recognized among peer reviewers, one would expect the issues eventually to be reflected in peer-reviewed literature in a way that so far is not in evidence.

    12. Peer reviewers commonly deal with issues much more complex than those discussed above. But peer reviewers’ handling of these relatively simple issues provides little basis for belief in the soundness of peer review regarding more complex issues.

    13. All statements about peer reviewers apply to journal statistical editors/consultants.
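    The prevalence pattern described in point 2 can be sketched numerically. The following is an illustrative sketch only — the groups, distributions, and cutoffs are assumptions chosen for demonstration, not data from the references. Susceptibility is modeled as normally distributed with a fixed gap between two groups, and the adverse outcome is scoring below a cutoff; lowering the cutoff makes the outcome rarer.

```python
from statistics import NormalDist

# Hypothetical groups (assumptions for illustration only):
# susceptibility scores are normal with equal spread and a fixed gap.
group_a = NormalDist(mu=0.0, sigma=1.0)   # less susceptible group
group_b = NormalDist(mu=-0.5, sigma=1.0)  # more susceptible group

def relative_differences(cutoff):
    """Adverse outcome = score below cutoff.
    Returns (ratio of adverse rates, ratio of favorable rates)."""
    adverse_a = group_a.cdf(cutoff)
    adverse_b = group_b.cdf(cutoff)
    return adverse_b / adverse_a, (1 - adverse_a) / (1 - adverse_b)

for cutoff in (0.0, -1.0, -2.0):  # lower cutoff = rarer adverse outcome
    adv_ratio, fav_ratio = relative_differences(cutoff)
    print(f"cutoff {cutoff:+.1f}: adverse ratio {adv_ratio:.2f}, "
          f"favorable ratio {fav_ratio:.2f}")
```

    Lowering the cutoff from 0 to −2 (making the adverse outcome rarer) raises the relative difference in experiencing it from about 1.38 to about 2.94, while the relative difference in avoiding it falls from about 1.62 to about 1.05 — the two measures move in opposite directions as prevalence changes. (The distinction in point 9 is similarly mechanical: a rate rising from 10% to 15% is a 5 percentage point increase but a 50 percent increase.)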

    1. “Race and Mortality Revisited,” Society (July/Aug. 2014)

    2. Comments of J. Scanlan for Commission on Evidence-Based Policymaking (Nov. 14, 2016),_2016_.pdf

    3. “The Mismeasure of Health Disparities,” Journal of Public Health Management and Practice (July/Aug. 2016)

    4. Comments of J. Scanlan for the Commission on Evidence-Based Policymaking (Nov. 28, 2016),_2016_.pdf

    5. Mortality and Survival Page of

    6. Percentage Points subpage of the Vignettes page of

    7. Spurious Contradictions subpage of the Measuring Health Disparities page of

    8. Times Higher subpage of the Vignettes page of
