
The U.S. health care system is often inefficient, ineffective, and inequitable. Compared to other high-income countries, the U.S. pays more for health care and has worse outcomes. One potentially fruitful avenue for controlling costs and improving care is changing how care is paid for.

The federal Center for Medicare & Medicaid Innovation (CMMI) is tasked with developing and testing innovative payment models. One example is the comprehensive care for joint replacement program, a bundled payment model in which hospitals receive a predetermined target price for an entire episode of hip or knee replacement care, including the surgery, hospital stay, and post-acute care like rehabilitation. Bundled payments encourage hospitals to control costs because the hospitals benefit financially if the cost of care comes in under the target, as long as they maintain quality, but remain on the hook for any spending over the target.

Randomized evaluations make it possible to rigorously evaluate the effects of these programs, much as randomized controlled trials are the standard for evaluating new drugs. Randomly assigning hospitals or doctors to participate in either a new program or the status quo ensures that there are no systematic differences between the groups at the beginning of a study. Differences in outcomes observed at the end of the study can then be attributed to the program and not to underlying factors.


The comprehensive care for joint replacement program began in 2016 as a randomized evaluation, with hospitals in 67 cities randomly selected and required to participate in the program. Researchers would then compare Medicare spending, care utilization, and quality of care among patients in the program hospitals with those of patients treated in hospitals not in the program. In the face of political opposition, however, after two years the program was made voluntary for many hospitals. Such opposition, unfortunately, is common. In fact, despite CMMI’s mandate to evaluate, it often employs voluntary models, allowing doctors or hospitals to choose whether to participate in a new program.

Voluntary models can create two problems. First, providers can select the program that is most lucrative for them, not necessarily the one that is best for patients or for Medicare or Medicaid spending. Second, voluntary participation makes potential reforms hard to evaluate, because those who participate may differ from those who don’t. Of the more than 50 payment models that CMMI launched between 2011 and March 2022, only five have been implemented as randomized evaluations.


These concerns aren’t new. But it is often difficult to demonstrate their real-world importance. Helpfully, a new analysis that leverages the shift of the comprehensive care for joint replacement program from mandatory to voluntary participation demonstrates exactly why randomized evaluations are so important.

The researchers — from Stanford University, MIT, Harvard University, and the National Bureau of Economic Research — compared hospitals that participated during the first two years of the program, when participation was mandatory, to hospitals that continued to participate after they were given the opportunity to opt out. The report offers a rare glimpse of selection bias in action, because the researchers were able to observe the program’s impact for all hospitals while the program was mandatory and then among those that chose to continue. The research was supported by J-PAL North America, which I work for.

When the program was mandatory, research (by the same team and others) demonstrated modest reductions in spending among the hospitals required to participate, driven by a decline in discharges to post-acute care like nursing facilities, with no effect on the quality of care or the number or composition of patients. Research also demonstrated similar effects for patients outside of the study treated by the same hospitals, a finding that would have been difficult to estimate credibly outside of a randomized evaluation.

When the program was made voluntary, the researchers observed that the hospitals that chose to stay in the program were the ones that would benefit financially from the bundled payment program even if they didn’t change their behavior, because they had already been billing under the target price for joint replacement before the program was instituted. Indeed, before the start of the study, these hospitals spent about $1,600 less per joint replacement than hospitals that opted out.

So when the program was made voluntary, it produced much smaller declines in Medicare spending than would have occurred if the mandatory program had continued.
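The selection mechanism can be made concrete with a small worked example. The sketch below is purely illustrative: the target price, the four hospitals, their costs, and the opt-in rule are all invented assumptions, not figures or methods from the study. It gives every hospital the same potential savings, so any drop in total savings comes only from who chooses to stay.

```python
from dataclasses import dataclass

# All figures below are invented for illustration; none come from the study.
TARGET_PRICE = 25_000.0  # hypothetical per-episode target price


@dataclass(frozen=True)
class Hospital:
    name: str
    baseline_cost: float    # hypothetical per-episode spending before the program
    program_savings: float  # hypothetical per-episode reduction if enrolled


HOSPITALS = [
    Hospital("A", 23_000.0, 500.0),
    Hospital("B", 24_000.0, 500.0),
    Hospital("C", 26_000.0, 500.0),
    Hospital("D", 27_000.0, 500.0),
]


def windfall(h: Hospital) -> float:
    """Gain from the bundle with no change in behavior (negative means a loss)."""
    return TARGET_PRICE - h.baseline_cost


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Assumed opt-in rule: stay only if the bundle pays off without changing anything.
stayers = [h for h in HOSPITALS if windfall(h) > 0]
leavers = [h for h in HOSPITALS if windfall(h) <= 0]

# Average per-episode savings across ALL hospitals under each regime
# (each hospital weighted equally, another simplifying assumption).
mandatory_savings = mean(h.program_savings for h in HOSPITALS)
voluntary_savings = sum(h.program_savings for h in stayers) / len(HOSPITALS)

# Baseline spending gap between hospitals that opt out and those that stay.
baseline_gap = mean(h.baseline_cost for h in leavers) - mean(
    h.baseline_cost for h in stayers
)

print([h.name for h in stayers], mandatory_savings, voluntary_savings, baseline_gap)
```

With these made-up inputs, only the two hospitals already spending under the target stay in, and average savings across all hospitals fall by half even though no hospital became any less responsive to the program.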

That won’t always be the case, because voluntary programs could also attract those hospitals and health care providers that are most able to change behavior.

In any case, evaluating a program in a self-selected setting is likely to produce biased estimates of what a scaled-up version will look like. The experience with the comprehensive care for joint replacement program adds to a growing body of evidence showing that programs evaluated under randomized and non-randomized designs can yield disparate results.

Critics, such as the politicians who opposed requiring some hospitals to participate in the joint replacement program, argue that the government is experimenting on patients. But voluntary models give providers — not patients — important choices, and patients may be shifted toward options that provide the greatest reimbursement rather than the best care.

Fortunately, CMMI seems increasingly likely to adopt mandatory models going forward, allowing for rigorous evaluation. Despite a low percentage of mandatory models overall, five new mandatory models have been implemented since 2016. The center’s director, Elizabeth Fowler, has signaled that it will continue to focus on mandatory models under President Biden.

The Center for Medicare & Medicaid Innovation is in the rare position of being empowered to develop innovative payment models and prospectively evaluate them. Mandatory, national randomized evaluations ensure that the already significant effort to develop the models will yield the rigorous evidence needed to support decisions on whether to scale and adopt them broadly. Given the ineffective and inequitable status quo, policymakers should support and encourage this type of experimentation, not deprive the country of the evidence needed to innovate and improve health care while simultaneously lowering costs.

Jesse Gubb is a research manager at J-PAL North America, a regional office of the Abdul Latif Jameel Poverty Action Lab, a research center at the Massachusetts Institute of Technology that aims to reduce poverty by ensuring that policy is informed by scientific evidence.
