
If your doctor diagnoses you with chronic fatigue syndrome, you’ll probably get two pieces of advice: Go to a psychotherapist and get some exercise. Your doctor might tell you that either of those treatments will give you a 60 percent chance of getting better and a 20 percent chance of recovering outright. After all, that’s what researchers concluded in a 2011 study published in the prestigious medical journal the Lancet, along with later analyses.
Problem is, the study was bad science.
And we’re now finding out exactly how bad.
Under court order, the study’s authors released their raw data for the first time earlier this month. Patients and independent scientists collaborated to analyze it and posted their findings Wednesday on Virology Blog, a site hosted by Columbia microbiology professor Vincent Racaniello.
The analysis shows that if you’re already getting standard medical care, your chances of being helped by the treatments are, at best, 10 percent. And your chances of recovery? Nearly nil.
The new findings are the result of a five-year battle that chronic fatigue syndrome patients — me among them — have waged to review the actual data underlying that $8 million study. It was a battle that, until a year ago, seemed nearly hopeless.
When the Lancet study, nicknamed the PACE trial, first came out, its inflated claims made headlines around the world. “Got ME? Just get out and exercise, say scientists,” wrote the Independent, using the acronym for the international name of the disease, myalgic encephalomyelitis. (Federal agencies now call it ME/CFS.) The findings went on to influence treatment recommendations from the CDC, the Mayo Clinic, Kaiser, the British National Institute for Health and Care Excellence, and more.
But patients like me were immediately skeptical, because the results contradicted the fundamental experience of our illness: The hallmark of ME/CFS is that even mild exertion can increase all the other symptoms of the disease, including not just profound fatigue but also cognitive deficits, difficulties with blood pressure regulation, unrestorative sleep, and neurological and immune dysfunction, among others.
Soon after I was diagnosed in 2006, I figured out that I had to rest the moment I thought, “I’m a little tired.” Otherwise, I would likely be semi-paralyzed and barely able to walk the next day.
The researchers argued that patients like me, who felt sicker after exercise, simply hadn’t built their activity up carefully enough. Start low, build slowly but steadily, and get professional guidance, they advised. But I’d seen how swimming for five minutes could sometimes leave me bedbound, even if I’d swum for 10 minutes without difficulty the day before. Instead of trying to continually increase my exercise, I’d learned to focus on staying within my ever-changing limits — an approach the researchers said was all wrong.
A disease ‘all in my head’?
The psychotherapy claim also made me skeptical. Talking with my therapist had helped keep me from losing my mind, but it hadn’t kept me from losing my health. Furthermore, the researchers weren’t recommending ordinary psychotherapy — they were recommending a form of cognitive behavior therapy that challenges patients’ beliefs that they have a physiological illness limiting their ability to exercise. Instead, the therapist advises, patients need only to become more active and ignore their symptoms to fully recover.
In other words, while the illness might have been triggered by a virus or other physiological stressor, the problem was pretty much all in our heads.
By contrast, in the American research community, no serious researchers were expressing doubts about the organic basis for the illness. Immunologists found clear patterns in the immune system, and exercise physiologists were seeing highly unusual physiological changes in ME/CFS patients after exercise.
I knew that the right forms of psychotherapy and careful exercise could help patients cope, and I would have been thrilled if they could have cured me. The problem was that, so far as I could tell, it just wasn’t true.
A deeply flawed study
Still, I’m a science writer. I respect and value science. So the PACE trial left me befuddled: It seemed like a great study — big, controlled, peer-reviewed — but I couldn’t reconcile the results with my own experience.
So I and many other patients dug into the science. And almost immediately we saw enormous problems.
Before the trial of 641 patients began, the researchers had announced their standards for success — that is, what “improvement” and “recovery” meant in statistically measurable terms. To be considered recovered, participants had to meet established thresholds on self-assessments of fatigue and physical function, and they had to say they felt much better overall.
But after the unblinded trial started, the researchers weakened all these standards, by a lot. Their revised definition of “recovery” was so loose that patients could get worse over the course of the trial on both fatigue and physical function and still be considered “recovered.” The threshold for physical function was so low that an average 80-year-old would exceed it.
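The overlap between entry and "recovery" thresholds can be made concrete with a minimal sketch. The specific numbers below (entry allowed an SF-36 physical function score up to 65; the revised "normal range" began at 60) come from the trial's published criteria as widely reported, not from this article, so treat them as an illustration:

```python
ENTRY_MAX_SF36 = 65      # trial eligibility: disabling fatigue (SF-36 physical function <= 65)
RECOVERY_MIN_SF36 = 60   # revised "normal range" threshold (score >= 60)

# A participant entering at 65 who drops to 60 has gotten WORSE over the trial,
# yet still clears the revised physical-function threshold for "recovery."
entry_score, final_score = 65, 60
got_worse = final_score < entry_score
counts_as_recovered_on_this_measure = final_score >= RECOVERY_MIN_SF36
print(got_worse, counts_as_recovered_on_this_measure)  # True True
```

Because the revised threshold sits below the ceiling for trial entry, worsening and "recovery" are not mutually exclusive on this measure.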
In addition, the only evidence the researchers had that patients felt better was that patients said so. They found no significant improvement on any of their objective measures, such as how many patients got back to work, how many got off welfare, or their level of fitness.
But the subjective reports from patients seemed suspect to me. I imagined myself as a participant: I come in and I’m asked to rate my symptoms. Then, I’m repeatedly told over a year of treatment that I need to pay less attention to my symptoms. Then I’m asked to rate my symptoms again. Mightn’t I say they’re a bit better — even if I still feel terrible — in order to do what I’m told, please my therapist, and convince myself I haven’t wasted a year’s effort?
Many patients worked to bring these flaws to light: They wrote blogs; they contacted the press; they successfully submitted carefully argued letters and commentaries to leading medical journals. They even published papers in peer-reviewed scientific journals.
They also filed Freedom of Information Act requests to gain access to the trial data from Queen Mary University of London, the university where the lead researcher worked. The university denied most of these, some on the grounds that they were “vexatious.”
Critics painted as unhinged
The study’s defenders painted critics as unhinged crusaders who were impeding progress for the estimated 30 million ME/CFS patients around the world. For example, Richard Horton, the editor of the Lancet, described the trial’s critics as “a fairly small, but highly organised, very vocal and very damaging group of individuals who have, I would say, actually hijacked this agenda and distorted the debate so that it actually harms the overwhelming majority of patients.”
Press reports also alleged that ME/CFS researchers had received death threats, and they lumped the PACE critics in with the purported crazies.
While grieving for my fellow patients, I seethed at both the scientists and the journalists who refused to examine the trial closely. I could only hope that, eventually, PACE would drown under a slowly rising tide of good science, even if the scientific community never recognized its enormous problems.
But with the National Institutes of Health funding only $5 million a year of research into chronic fatigue syndrome, it seemed like that could take a very long time.
Then last October, David Tuller, a lecturer in public health and journalism at the University of California, Berkeley, published on Virology Blog a devastating exposé of the trial’s scientific flaws. Tuller described all the problems I had seen, along with several more. The project was a remarkable act of public service: He isn’t a patient, yet he spent a year investigating the trial without institutional support, legal backing, or remuneration.
And, at last, the criticisms gained traction.
Racaniello and 41 other scientists and clinicians published an open letter to the Lancet calling for an independent investigation into the trial and saying “such flaws have no place in published research.” Rebecca Goldin, the director of Stats.org, an organization that works to improve the use of statistics in journalism, eviscerated the trial’s design in a 7,000-word critique.
In the meantime, a Freedom of Information Act request from Australian patient Alem Matthees was making its way through the legal system.
Matthees had asked for the anonymized data necessary to analyze the study using its original standards for success, but Queen Mary University of London had refused the request, arguing that malicious patients would break the anonymization and publish the participants’ names to discredit the trial. It again cited the death threats.
The court rejected these claims a month ago, calling them “wild speculations” and pointing out that the researchers themselves acknowledged in court that neither they nor PACE participants had received death threats.
Startling results from a re-analysis
Just before releasing the data, Queen Mary University of London did its own re-analysis on the question of how many patients got better, at least a little bit. The re-analysis showed that, using the study’s original standards, only 20 percent of patients improved with cognitive behavior therapy or exercise in addition to medical care, not 60 percent as claimed in the Lancet.
And even the 20 percent figure might be misleading, because the re-analysis also found that 10 percent of participants improved after receiving only standard medical care. That suggests that 10 percent in each of the treatment groups would likely have improved even without the exercise or therapy, leaving only 10 percent who were significantly helped by those interventions.
As for the claim that 22 percent of patients who received either treatment made an actual recovery? That went up in smoke when Matthees analyzed the raw data with the help of his colleagues and statisticians Philip Stark of the University of California, Berkeley, and Bruce Levin of Columbia University.
Their analysis showed that had the researchers stuck to their original standards, only 4.4 percent of the exercise patients and 6.8 percent of the cognitive behavior therapy patients would have qualified as having recovered, along with 3.1 percent of patients in a trial arm that received neither therapy.
Importantly, there was no statistically significant difference between these recovery rates.
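To see why differences of a few percentage points between modestly sized trial arms fail to reach significance, here is a minimal two-proportion z-test sketch. The counts are hypothetical, chosen only to roughly match the reported percentages under an assumed arm size of about 160; the actual counts are in the Matthees re-analysis:

```python
from math import sqrt

def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 vs. x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                     # pooled proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))    # pooled standard error
    return (p1 - p2) / se

# Illustrative counts: CBT ~6.8% recovered (11/160) vs. medical care alone ~3.1% (5/160)
z = two_prop_z(11, 160, 5, 160)
print(round(z, 2))  # 1.54, below the ~1.96 needed for significance at p < 0.05
```

With arms of this size, even a doubling from 3 percent to 7 percent recovery falls short of the conventional significance threshold, which is consistent with the null result reported in the re-analysis.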
The PACE researchers, the editor of the Lancet, and the editors of Psychological Medicine (which published the follow-up study on recovery) all declined to comment for this article.
Simon Wessely, president of the UK Royal College of Psychiatrists, defended the trial in an email exchange with me. He argued that some patients did improve with the help of cognitive behavior therapy or exercise, and noted that the improvement data, unlike the recovery data, was statistically significant. “The message remains unchanged,” he wrote, calling both treatments “modestly effective.”
Wessely declined to comment on the lack of recovery. He summarized his overall reaction to the new analysis this way: “OK folks, nothing to see here, move along please.”
‘A classic bad study’
But it doesn’t appear that outside researchers are ready to “move along.”
After reviewing the new analysis, Jonathan Edwards, a professor emeritus of medicine at University College London, said he was unconvinced that these small subjective improvements indicated the patients genuinely felt better. “They’ve set this trial up to give the strongest possible chance of there being a placebo effect that you can imagine,” he said.
“This is a classic bad study,” said Ron Davis, director of the Stanford Genome Technology Center and director of the Science Advisory Board of the End ME/CFS Project. He emphasized an additional problem: The study used such a broad definition of the disease that it likely included many patients who didn’t truly have ME/CFS at all.
“The study needs to be retracted,” Davis said. “I would like to use it as a teaching tool, to have medical students read it and ask them, ‘How many things can you find wrong with this study?’”
Retractions are rare, however, and erasing the impact of this flawed research will take much work for years to come.
After a sustained effort by ME/CFS advocates, the federal Agency for Healthcare Research and Quality just changed its recommendation to say that there is insufficient evidence to justify cognitive behavior therapy or graded exercise. But many other public health agencies continue to point patients toward them.
And efforts to propagate this approach continue: A trial of graded exercise in children with ME/CFS has recently begun, and patients are protesting it.
Watching the PACE trial saga has left me both more wary of science and more in love with it. Its misuse has inflicted damage on millions of ME/CFS patients around the world, by promoting ineffectual and possibly harmful treatments and by feeding the idea that the illness is largely psychological. At the same time, science has been the essential tool to repair the problem.
But we shouldn’t take solace in the comforting notion that science is self-correcting. Many people, including many very sick people, had to invest immense effort and withstand vitriol to use science to correct these mistakes. And even that might not have been enough without Tuller’s rather heroic investigation. We do not currently have a sustainable, reliable method of overturning flawed research.
And rectifying PACE will take more than exposing its flaws. The lingering doubt it has cast on the illness will only be fully dispersed when we’ve finally figured out what’s really going on with the disease.
For that, we need to invest in some serious, good science. The kind I continue to love.
Julie Rehmeyer is a math and science writer. Her memoir “Through the Shadowlands,” describing the science and politics of chronic fatigue syndrome and other poorly understood illnesses, will be published by Rodale in May.
I am interested in Simon Wessely’s assertion that the revised PACE trial recovery criteria were consistent with previous studies. Can anyone find any evidence of that? I failed to find any.
The protocol-specified recovery criteria for the PACE trial used a threshold for normal CFQ fatigue that was derived from a validation study that lists Wessely as a co-author (Chalder et al., 1993, PMID 8463991). Furthermore, one of the CBT studies Wessely was involved in used a normal SF-36 physical function threshold of 83 points or more (Deale et al., 1997, PMID 9054791). A previous GET study (Fulcher & White, 1997, PMID 9180065) stated that 69 points for SF-36 physical function score is abnormal, and that the “normal or usual scores are 14 for Chalder questionnaire” (Likert scoring). Another study involving White (lead PACE trial investigator) used 80 points or more as a threshold for normal SF-36 physical function (Knoop et al., 2007, PMID 17426416). Based on these studies, the protocol-specified PACE trial recovery criteria are more consistent with previous studies than the revised version.
I have come across several other CFS related papers in general referring to 80 points or more as a threshold for normal SF-36 physical function. The substantially weakened, post-hoc thresholds for normal fatigue and physical function first appear in the 2011 Lancet paper on the PACE trial results, not before it. Subsequent papers on CBT and GET have used these revised thresholds too. I also noticed that prior to the 2011 Lancet paper, CBT/GET proponents Knoop & Bleijenberg regarded 60-70 points for SF-36 physical function as severe impairment, but when the PACE trial was published, these scores were suddenly regarded as strict criteria for recovery based on healthy scores (they were neither strict nor based on healthy people).
The above is inconsistent with Wessely’s recollection of events and seems more consistent with the commonly expressed view that the thresholds for normal were substantially lowered after the worse-than-predicted PACE trial results were in.
As for the other assertions made:
1) The issues may be “covered extensively” on the QMUL PACE FAQ, but they are not all convincingly addressed, and other important issues remain unaddressed.
2) The recovery criteria were not an “overly harsh set of criteria that didn’t match what most people would consider recovery”. In my opinion it is the opposite: the revised criteria are not what most people and patients I know would consider recovery from CFS. How can the revised criteria be “reasonable and meaningful” when there was overlap with the trial eligibility criteria for severe disabling fatigue? How many people of middle age would accept an SF-36 physical function score of 60 points as healthy, when it entails many impairments? As for whether scores of 60-75 points are common or normal for healthy individuals of working age, people can judge for themselves by looking at normative data from population samples:
https://sites.google.com/site/pacefoir/ONS-1992_WAP-NLTD.png
3) Regarding the claim that the recovery criteria were changed “before a single piece of data had been looked at of course”: this is rather doubtful, given that the statistical analysis plan, finalized shortly before the unmasking of trial data, mentions nothing whatsoever about the changes to the recovery criteria. Moreover, two components of the revised recovery criteria were first introduced in the 2011 Lancet paper (two years before the recovery paper was published in Psychological Medicine), were described therein as a post-hoc analysis, and were given in response to a reviewer’s request, i.e. several months after the unmasking of trial data.
4) Wessely is right to point out that the PACE trial has strengths of a good quality study (e.g. allocation concealment, large sample size, standardized manuals, good therapeutic alliance, and low losses to follow-up); but it also has weaknesses of a poor quality study that should be acknowledged, e.g. the lack of masking (common and largely unavoidable in such research), insufficient placebo controls, uncertain generalisability to the wider patient community (80% of CFS candidates were excluded, and the CFS criteria used are widely regarded as flawed and outdated), questionable post-trial revisions to the protocol, and a version of pacing that does not match what pacing means to most patients. The lack of meaningful improvements on any objective outcome, and the lack of significant difference between groups at the 2.5-year follow-up for the self-reported measures, suggest that CBT and GET are not particularly effective.
Alem, I emailed Simon Wessely to ask him some of these questions, but he refused to answer.
The exchange is here https://t.co/GGcb3xGovk
Posted on behalf of Margaret Williams:
Simon Wessely is at pains to distance himself from involvement with the PACE trial, but once again he seems to have overlooked the facts.
The Trial Identifier is clear:
“Section 4. TRIAL MANAGEMENT
4.1 WHAT ARE THE ARRANGEMENTS FOR THE DAY TO DAY MANAGEMENT OF THE TRIAL?
The trial will be run by the trial co-ordinator who will be based at Barts and the London, with the principal investigator (PI), and alongside two of the six clinical centres. He/she will liaise regularly with staff at the Clinical Trials Unit (CTU) who themselves will be primarily responsible for randomisation and database design and management (overseen by the centre statistician Dr Tony Johnson), directed by Professor Simon Wessely, in collaboration with Professor Janet Darbyshire at the MRC CTU.
4.4. WHAT WILL BE THE RESPONSIBILITIES OF THE NAMED COLLABORATORS?
Prof Simon Wessely will oversee the CTU”
It also needs to be recalled that the post of Statistician Clinical Trials Unit Division of Psychological Medicine Ref No: 06/A09 was described as the “Johnson_Wessely_Job” (07/07/2006) at The Institute of Psychiatry where: “The team works under the direction of Professor Simon Wessely, the Unit Director. The team is supported by the regular input of a Unit Management Group from within the Institute of Psychiatry. The statisticians within the Unit also have regular supervision meetings with Dr Tony Johnson from the MRC Clinical Trials Unit. The post holder will be directly responsible to the CTU Manager (Caroline Murphy), supervised by the CTU Statistician (Rebecca Walwyn) and will be under the overall direction of the Head of Department, Professor Simon Wessely”.
So please, Professor Wessely, stop dissembling, as you are fooling no-one but yourself.
Despite the fact that the post-hoc changes produced reported results up to five times better than those derived from the original protocol, you continue to defend what has been described by many as “fraud” in the PACE trial.
Can it be said that the President of The Royal College of Psychiatrists condones what is widely considered to be scientific fraud?
Margaret Williams
http://www.margaretwilliams.me
Breaking news here in the UK is that Professor Peter White, one of the main co-authors of the PACE study, has resigned from medical practice. No doubt to avoid disciplinary action by the General Medical Council for misleading the public and misappropriation of public funds (£5,000,000)
Are there links for this news?
All of us use self-reflection as a way of critically appraising our own work. PACE researchers might try to justify their harmful work to other academics, researchers and patients, but can they truly look inward and justify it to themselves?
I just want to thank Eros Dervishi, the artist who created the picture at the beginning of this essay. Usually we get a pretty model leaning on her hand, looking slightly tired. This picture portrays the load of having this disease better than anything I’ve ever seen used with an article. Thank you.
Very true!
Question for the afflicted: Did identification of the affliction help in any way? I know that many doctors dismiss CFS/ME as hypochondria. Have you ever been asked about the many symptoms which, together, strongly suggest CFS? I read a checklist some years ago which suggested that if you ‘scored’ more than 16 of the 34(?) symptoms, you probably had it. My score was 23, I think. If knowing, in itself, helps, then I suggest that the search for markers, then a cure, would be useful.
At the moment, researchers at Columbia U, Cornell U, Nova U in the Miami area, Utah, San Diego, Stanford U, and Griffith U in Australia are working on precisely that: a biomarker. Most of the proposed biomarkers are based upon immune or metabolic abnormalities or the presence of viruses. This research is light-years ahead of the PACE belief that talking therapy and graded exercise can actually cure these patients – some of whom are bedridden with feeding tubes.
Jen,
This is exactly what I mean about graded exercise therapy being misunderstood! It does not work as a separate activity, because the variation in exertion over the rest of the day makes any single exercise session meaningless in isolation. Pacing is just spreading out activities into manageable chunks; there is nothing graded about it. To do graded exercise therapy effectively you have to manage your activity. That is part of it, but not all of it, because with activity management alone there is, again, nothing graded about it.
The way I employ GET is by establishing a baseline: the TOTAL number of steps I can manage in a day for 5 days out of 7. Two days in my week are rest days, on which the TOTAL number of steps I do is less than my baseline. When my symptoms have been no worse than usual, I increase my target number of steps per day to 20% above my baseline, and I do this no more than once a fortnight. This is a GET regime because it works from a baseline of what I can manage for 5 days out of 7, and activity increases do not exceed 20% in a fortnight.
I reiterate my main point, GET cannot work at even a theoretical let alone a practical level unless total exertion is measured not just one isolated exercise task in a whole day. For example, Wednesday I walked to my car, drove across town and walked from my car again to my friend’s house. Thursday I did not leave the house. By classic GET Wednesday would be an exercise day and Thursday would be a rest day. However, looking at my tracker data I actually did more steps so more exertion on Thursday cleaning out my pets. Exertion measures need to be accurate for effective grading, we have not had available technology to do that before but we now do so let’s use it!
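For what it’s worth, the fortnightly rule described above can be captured in a few lines. This is only an illustrative sketch of the commenter’s personal scheme; the function name and the 5,000-step baseline are made up:

```python
def next_step_target(baseline_steps, symptoms_stable, days_since_increase):
    """Commenter's rule: raise the daily step target 20% above baseline,
    at most once per fortnight, and only if symptoms were no worse than usual."""
    if symptoms_stable and days_since_increase >= 14:
        return round(baseline_steps * 1.2)
    return baseline_steps

print(next_step_target(5000, True, 14))   # 6000: a fortnight has passed, increase allowed
print(next_step_target(5000, False, 30))  # 5000: symptoms flared, hold the baseline
```

The key design choice, per the comment, is that the target governs total daily exertion from a tracker, not any single exercise session.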
The PACE trial protocol set out to measure activity exactly as you recommend. They bought actometers and measured activity at entry to the trial, but mysteriously didn’t make the same measurement at the end, 52 weeks later. A review of several previous Dutch CFS trials that used actometers shows no increase in activity.* Those trials took place before or during PACE and may well have influenced the PACE authors to drop yet another objective measure.
*http://www.ncbi.nlm.nih.gov/pubmed/20047707
Layla, we do all know this already. It still doesn’t work for most of us. And that’s leaving aside the issue that I could exhaust myself doing bench presses or singing Wagner without clocking a single step; we still don’t have accurate enough energy-expenditure monitors.
Look up http://www.recoveryfromcfs.org, a description of pacing used as a method for gradually increasing activity. The philosophical difference between pacing and GET isn’t whether you gradually increase your overall activity or not but whether you do it regardless of whether you feel well enough to or not.
As someone with ME/CFS whose health was severely damaged by graded exercise, thank you for publishing this very important story.
It is long past time that patients, doctors and the public were informed about this fraud perpetrated by a privileged group of pseudo-scientists and charlatans. Finally people are starting to realise that the Emperor has no clothes.
Libertarians oppose any use of political authority or governmental force to impose “solutions” upon people that those people fear or find harmful, regardless of what real or imaginary social benefit might theoretically follow if the proponents are 100% correct. We realize that on occasion social progress may be a wee bit slower for having allowed people the liberty of making up their own minds. However, in the majority of cases, it is the sheer ignorance of narcissistic intellectuals, who refuse to allow independent inquiry into the basis of their theories, that leads those same intellectuals to propose forcing the rest of us to perform an action that, to most of us, seems more harmful than good. The narcissist derives a self-satisfaction that substitutes for the genuine love and appreciation felt by normal people when we thank someone for doing a kind deed or providing a helpful idea. It is evident that the self-delusional behavior seen among the researchers at Queen Mary University, complete with the last-minute revision of their findings (amending them to show a 20 percent improvement instead of a 60 percent improvement) and the effort to hide behind paranoid ravings about unproven threats from real or imaginary threat-makers, was an effort to maintain the self-important illusion that their research work was meaningful. Just as serial rapists locked in institutions for the criminally insane may imagine that they showed their rape victims some sexual pleasure (missing the fact that those victims were too busy fearing for their lives to notice any bodily sensations that might otherwise have been pleasing in the course of voluntary intercourse), these junk-science practitioners imagine that they have helped someone, by their efforts to justify the use of force to inflict pain and suffering upon sick people, made sicker by the treatments they proposed.
Just as the Stanley Milgram experiment left a lot of frightened people in its wake, realizing how easily they could be tricked into harming others, there will be conscientious health professionals who realize that relying upon this bit of junk science caused serious therapeutic setbacks for patients whom they had sworn to help, and who are going to feel emotional distress knowing that they were taken in by the junk science.
That’s a sound argument for revealing at once, for scientific peer review, any data that come under question. The courts reached the only possible right conclusion in this case. Sadly, while that played out, the junk science continued to be believed, and to do harm. Just like the sadistic rapists spending decades in mental institutions for using force against their victims, these researchers advocated using force against patients. Both the rapists and the researchers imagine that someone benefited from being so forced. And both refuse to listen to any of their victims’ pleas to stop forcing harm upon us.
What hasn’t been mentioned so far is that CBT and graded exercise are nice little earners, particularly when the health insurance companies and the NHS are looking for experts to tell them of a nice, cheap and effective treatment for ME. Why worry about factual analysis when there is gold to be earned? Not, of course, for one minute do I believe that money and status would be a factor here. But what else could it be? An inability to understand the scientific process? A divine belief in these therapies that has no need of proof? It’s time we challenged the motives of those people who continue to deny the facts.