That’s why a federal office known as the Agency for Health Care Policy and Research (AHCPR) impaneled a group of twenty-three experts in 1993 to draft guidelines to help doctors figure out how best to treat low back pain. The AHCPR was created in 1989 during the first Bush administration. Its mandate was to produce evidence-based, clinical-practice guidelines that would help physicians sort through the conflicting data that existed not just for low back pain, but for many other common treatments and tests. Then, as now, the nation’s medical bill was rising at an alarming rate, in part due to widespread, inappropriate use of unnecessary or useless treatments. Democrats and Republicans alike hoped that the AHCPR’s research would help rein in costs by giving doctors better direction, and offering payers, especially Medicare, the ammunition they needed to make evidence-based coverage decisions. More significantly, the agency promised to improve the quality of health care by helping to ensure that doctors would give patients the treatments they really needed, and refrain from giving them care that could harm them.
But when the AHCPR’s panel concluded that there was little evidence to support surgery as a first-line treatment for low back pain, and that doctors and patients would be wise to try nonsurgical interventions first, back surgeons went wild. They knew that once the AHCPR’s guidelines were published, Medicare might limit reimbursement for various back surgeries to patients who were enrolled in a controlled clinical trial designed to test the efficacy of the procedure. If the study showed that a surgery was no better than nonsurgical remedies, or only about as good, there was a chance that Medicare would stop reimbursing for it. If Medicare made a back surgery provisional, private insurers were likely to follow.
Sensing a threat to their livelihoods, many surgeons bombarded Congress with letters contending that the agency’s panel was biased. One doctor, Neil Kahanovitz, founded the Center for Patient Advocacy, a nonprofit that orchestrated a sustained lobbying campaign against the entire agency. A company that manufactures pedicle screws (devices that are sometimes used during spinal fusion) sought a court injunction to prevent publication of the guidelines. The North American Spine Society, the main professional group for back surgeons, launched an assault on the methods used by the AHCPR experts, charging that the agency had wasted taxpayer dollars on the study.
Their arguments found a sympathetic ear in Newt Gingrich’s newly elected Republican majority in the House. The back surgeons’ anger at the AHCPR’s efforts to discipline medical practice resonated with the Republican fervor for reducing government, and with the party’s ideological antipathy for federal interference in what they imagined as a free market. The agency’s name soon appeared on a House Budget Committee “hit list” of 140 federal programs targeted for elimination. (The list also included the congressional Office of Technology Assessment, which evaluated the effectiveness of medical technology.) The Republicans saw the AHCPR as a wasteful government agency, and in 1995 the House voted to eliminate its funding, calling it the “Agency for High Cost Publications and Research.”
Eventually, the agency was rescued with the help of a handful of Republican supporters in the Senate, but it suffered a 21 percent cut of its already meager $159 million budget. Sensing the agency was still vulnerable, its director worked with moderate Senate Republicans to protect the agency by downshifting its mission. Now, the AHCPR would merely be a “clearinghouse” for data, which meant it could no longer offer Medicare explicit guidance when it came time to determine which tests, treatments, and procedures to cover. The word “policy,” which smacked of the failed Clinton health care plan, was expunged from its name, and the AHCPR became the Agency for Healthcare Research and Quality (AHRQ).
The back panel’s guidelines were published in 1994, but they were ignored by many surgeons who were perhaps emboldened by the Republican smack-down of the AHCPR. Last year, we spent more than $16 billion on back surgeries, and, in the past decade, surgeons have been performing spinal fusions at a furious rate, even though there still has never been a rigorous, independently funded clinical trial showing that going under the knife is superior to cheaper, less invasive remedies. At the same time, the nation’s total health care bill continues to skyrocket, propelled in no small measure by procedures that are as questionable as spinal fusion. In 2000, America spent $1.3 trillion, a figure that nearly doubled to an estimated $2.1 trillion by 2006. In the view of Peter Orszag, head of the Congressional Budget Office, this has put the U.S. on “an unsustainable fiscal path.”
Of course, some of our money is going toward new treatments and tests that help Americans live longer and healthier lives. However, as much as 30 cents on every health care dollar is spent on unnecessary care, or “overtreatment,” in medicalspeak. That may sound odd after all we’ve heard from people like Michael Moore about how everybody from your hospital to your insurer is getting rich by denying you care you need. Yet both problems exist simultaneously. All too often, patients don’t get necessary medical treatment. At the same time, we risk being given stuff that not only doesn’t improve our health but which may actually harm us. One estimate suggests that as many as 30,000 Medicare recipients die prematurely each year from unnecessary care.
This overtreatment is due in part to an excess supply of medical resources (hospital beds, intensive care units, specialists, CT scanners) in many parts of the country. But it is also the result of our national failure to fund the research that could show what works in medicine, what doesn’t, and for which patients, and then to train doctors to understand that research and use it. Our current fee-for-service payment system, which pays hospitals and doctors for each hospitalization, office visit, procedure, test, and surgery performed, simply gives providers an incentive to adopt anything that’s well reimbursed, regardless of whether it actually helps patients. Medicare pays for practically anything that physicians deem “medically necessary,” much of which, from spinal fusion to a fancy new imaging scan for Alzheimer’s, remains unproven by anything resembling good scientific evidence. We’ve been set back a decade in reforming this system, thanks in part to the handiwork of Gingrich’s House.
With the Democrats back in power in Congress, there’s bipartisan talk once again of resurrecting an agency that would do much of what the AHCPR was supposed to accomplish before its brush with death. The three leading Democratic presidential candidates have put evidence-based medicine at the center of their health care proposals, along with the need for what Mark McClellan, former administrator of the Centers for Medicare and Medicaid Services, calls a “fee-for-value” payment system, rather than the current fee-for-service system. The House recently approved a bill to expand the State Children’s Health Insurance Program (SCHIP) to cover uninsured children that includes a provision for funding comparative effectiveness research, a bill that President Bush has vowed to veto. Meanwhile, Americans are paying an estimated $400 billion to $700 billion a year for medical care that isn’t doing them any good.
Well, not exactly. For starters, there is surprisingly little government oversight of medical practice. The Food and Drug Administration, which many people imagine oversees it, in fact only regulates the marketing of drugs and devices. Before a drug can get on the market, its manufacturer must generally demonstrate that it is relatively safe and more effective than a placebo to alleviate a narrow set of conditions, or “indication.”
Once a drug is approved by the agency, however, physicians can prescribe it for any condition or ailment they deem appropriate. This is called “off-label” use. For many drugs, off-label prescriptions for conditions that were not included in the studies that led to FDA approval make up the majority of sales. Most of the time, this is perfectly fine. Few antibiotics have been approved for use in children, but nobody would fault a pediatrician who prescribed one off-label for a child with bronchitis. The FDA has no say in this aspect of medical practice except to restrain drug companies from marketing off-label uses of their products. That means that Merck couldn’t advertise Vioxx, a painkiller originally approved to treat rheumatoid arthritis, to treat tennis elbow, but doctors were free to prescribe it to anybody they pleased. By the time Vioxx was pulled from the market, it was estimated to have killed more Americans than were killed in the Korean War.
When it comes to medical procedures, the FDA has zero authority to make sure they actually work. If your surgeon wants to try removing your appendix through your back, that’s between you and your surgeon and the hospital. And if other surgeons think that sounds like a promising idea, nobody oversees what they do except their fellow surgeons. This might sound weird, but it’s the way medicine has worked for millennia. To develop new treatments, pioneering doctors and scientists often have to take risks. For instance, in the 1960s, Christiaan Barnard thought it might be possible to take the heart from a recently deceased patient and let that heart give new life to another, a radical idea at the time. He was right, and the practice has continued. Of course, a lot of patients die along the path toward perfecting a new procedure, but more significantly, there’s no systematic effort made to ensure a new technique is actually an improvement over the old ones. Medical devices like pedicle screws and implantable hips and pacemakers lie somewhere between the regulation afforded by the FDA over drugs and the lack of oversight of medical practice.
What this means for both doctors and patients is that there is little reliable information about most things doctors do. The FDA does not require that a new drug be an improvement over other medicines that are already on the market, and the drug industry does not routinely conduct valid (translation: likely to be true) trials that compare one drug to another. When it does fund such comparative effectiveness trials, they are often so woefully biased that the results are meaningless; the drug manufactured by the funder of the study generally comes out on top. And the drug industry rarely, if ever, funds studies examining whether its products are superior to nonpharmaceutical forms of treatment: antidepressants versus therapy, for instance.
You can’t blame them. Good clinical trials can cost millions of dollars to conduct and take years to complete, and the pharmaceutical industry is in the business of selling drugs, not increasing medical knowledge for the betterment of society. The job of ensuring that medical practices are actually helping patients should fall to a scientifically credible, disinterested body, an institution that might look a lot like the AHCPR would now if not for its near-death experience in 1996.
The National Institutes of Health and the Veterans Health Administration can generally be relied upon to fund well-designed trials, but the trick is getting them to do them. Though the VHA conducts some of the best clinical research around, the agency is woefully underfunded and can’t be expected to accomplish very much. The NIH, on the other hand, with its nearly $30 billion budget, is in a position to systematically work through the research that’s necessary to improve the practice of medicine. But the institutes devote only a small fraction of their budgets to comparative effectiveness studies. That’s largely an effect of NIH institutional culture, which is focused on the basic science of disease and searching for sexy new cures, and less inclined toward the more humdrum science of comparative effectiveness: as the NIH director recently remarked, “We don’t do Coke versus Pepsi.” Consequently, getting the NIH to fund a comparative effectiveness study can be an arduous task for concerned physicians and patient advocacy groups, who often must lean on the institutes, sometimes for years, before trials are mounted.
Among the few comparative effectiveness trials NIH has funded, several have shown the surgical procedure or drug or test in question to be less miraculous than doctors and patients believed. For example, surgeons long assumed that a radical mastectomy for breast cancer, removing not just the breast but the underlying chest muscle and the lymph nodes under the arm, was the only way to get every last cancer cell. Then a massive, multimillion-dollar clinical trial launched by the NIH in the 1990s found that lumpectomy with radiation was just as effective, not to mention less traumatic for many women. Many patients and doctors also fervently believed that high-dose chemotherapy was a woman’s best hope when she had advanced breast cancer. The brutal regimen was used for twenty years before clinical trials finally demonstrated that it was no more effective than standard, far less punishing doses of chemo. During those twenty years, an estimated 9,000 women were killed not by their cancer, but by the high-dose treatment. And the most recent example of a dramatic upset: a VA study, released earlier this year, which found that angioplasty and cardiac stents are no better at preventing a heart attack than so-called “medical management,” that is, getting patients with risk factors for heart disease to lose weight, exercise, and take medicines to lower their high blood pressure and high cholesterol.
There are two conclusions to be drawn from all of this. One, doctors are making a lot of decisions about how to treat their patients without the benefit of data. One day medical historians will look back at many current medical practices and see twenty-first-century equivalents of bloodletting and leeches. According to the Institute of Medicine, perhaps half of medical practice is based on valid evidence. Dr. David Eddy, an expert in the field of medical evidence, thinks as little as 15 percent of what doctors do is based in good science. Or, as he recently told BusinessWeek, “The problem is, we don’t know what we’re doing.”
Two, it seems completely crazy for a country that spends so much on health care to spend so little on systematically filling the gaps in medical knowledge. Of our more than $2 trillion national health care bill, we devote less than one-tenth of 1 percent to answering the myriad questions about what actually works in medicine. What’s the best way to get people to lose weight and exercise in order to prevent heart disease and diabetes? Nobody knows. Is a cesarean section necessary if a woman’s previous child was delivered by cesarean? Can a million-dollar da Vinci surgical robot, touted by many hospitals that have purchased the device, really improve outcomes, or is it just a fancy way to spend money? If a man has prostate cancer, which remedy is best? There are four different surgeries, several types of implantable radioactive seeds, and multiple external radiation regimens to choose from. Macular degeneration, a disease that causes blindness in 200,000 Americans each year, can be treated with one of two drugs, Lucentis or Avastin, but there’s no head-to-head evidence to show which one is better, or which one is best for a particular patient.
In use since the 1950s, spinal fusion surgery can offer tremendous relief to patients suffering from spinal fractures and tumors of the spine, where vertebrae have been displaced and threaten to crush the spinal cord. But such cases make up only a tiny fraction of the more than 300,000 spinal fusions that are performed each year. Most are done on patients with “simple low back pain,” in medical parlance, even though there’s nothing simple about either the diagnosis or the treatment. About 85 percent of patients with low back pain can’t be given a precise diagnosis, says Dr. Richard Deyo, a back expert at the University of Washington in Seattle and coauthor of Hope or Hype: The Obsession With Medical Advances and the High Cost of False Promises. But that doesn’t stop doctors who have trained in different disciplines from viewing back pain through their own diagnostic lens. In a paper titled “Who You See Is What You Get,” Deyo, who was one of the members of the AHCPR panel that looked at back treatment in the 1990s, and his coauthors found that rheumatologists tended to give patients with low back pain blood tests, to look for rare immunological disorders. Neurologists performed tests of how well nerves conducted impulses along the spine. Surgeons ordered MRIs and CT scans, which show the anatomy of the bones and soft tissue in the back.
When an MRI or a CT scan shows a degenerated or worn-out disk, fusion surgery is often the recommended treatment. The logic seems pretty obvious. You can think of a disk in your back as resembling a doughnut, which gets compressed and squeezed between the vertebrae over time. A worn-out disk may cause abnormal movement between vertebrae that generates pain. To fix the problem, the surgeon removes the offending disk, then fills the gap left between the two vertebrae with bone chips harvested from cadavers or the patient’s own hip. The chips are intended to “fuse” the vertebrae so they don’t collapse together and squeeze the spinal cord. Sometimes surgeons use pedicle screws, gizmos that are attached to the two vertebrae above and below the space emptied by the removed disk, in order to hold them in place.
But it turns out that a worn-out disk may not necessarily be the source of the person’s pain. Studies of CT scans have shown that 20 percent of people under the age of forty have a degenerated disk but don’t have any back pain. A study using MRIs found that 36 percent of people over sixty had herniated disks, and more than 80 percent showed disk degeneration (bulges and narrow bits), but had no significant back pain. Research has found that the vast majority of people with back pain will recover within a few weeks if they take anti-inflammatory pain medication like ibuprofen, rest for a short period, and get physical therapy.
Even for patients who don’t get better on their own, spinal fusion may not be the best remedy. In the past, not one of the dozens of studies that looked at the procedure definitively showed which patients are likely to be helped, or whether surgery is at least as good as, if not superior to, nonsurgical treatment like physical therapy and pain medication to keep the patient comfortable while the back heals on its own. This is because the right kind of trial comparing surgery to other remedies had never been performed.
Never, that is, until earlier this year, when Dr. James Weinstein, the chair of orthopedic surgery at Dartmouth Medical School, released results from the Spine Patient Outcomes Research Trial, or SPORT, which was sponsored by the NIH. The best-designed and best-executed study of back surgery ever conducted, SPORT still can’t say for sure that surgery is significantly better than nonsurgical remedies for all low back pain. In part, this is because Weinstein’s study only looked at one condition that causes back pain, something called spondylolisthesis. The study didn’t attempt to address the most common reason surgeons do spinal fusions, degenerated disks. For spondylolisthesis, SPORT suggests that spinal fusion might be a little better than nonsurgical remedies. Studies of spinal fusion that have looked at degenerated disks, many of them from Europe, can’t say much. “Even if you take all of the research at face value, you have to conclude that spinal fusion is only modestly effective,” says Rick Deyo. “It’s not a slam dunk.” Back pain patients would probably do well to think about that, because the surgery they’re about to undergo poses real risks, including infection, continuing pain, pseudoarthrosis, a condition called “failed back syndrome,” and occasionally even death.
Between 1997, three years after the AHCPR’s guidelines were published, and 2006, the number of spinal fusions went up 127 percent, from a little more than 100,000 a year to 303,000 annually. But if surgeons don’t know which patients are most likely to benefit, or even that surgery is going to be much better than what Weinstein calls “the tincture of time,” why have they been performing more spinal fusions, only a small fraction of which are on patients with conditions that have been shown to be likely to benefit? Well, for starters, the surgery is one of the most lucrative procedures in medicine, for both hospitals and physicians: according to the AHRQ, median hospital charges are $42,000 per case. The surgeon’s reimbursement from Medicare is about $4,000, and private insurers may pay more. Back surgeons can easily earn $1 million to $2 million a year. We spend an additional $2.5 billion annually on fusion hardware like pedicle screws, which can add thousands of dollars to the price of a surgery. Yet there’s practically no evidence to show all those screws improve outcomes either.
But money is only one of the factors driving the large number of fusion surgeries. Patients want surgery, often because a friend found relief from it, or because they think their excruciating pain can’t possibly be cured without doing something drastic. And many surgeons believe in their heart of hearts that they are doing God’s work relieving pain, secure in their genuine conviction that research supports them.
Stuart spent twenty-five years as a family practice physician at Group Health Cooperative, in Seattle, Washington, before he quit to launch a consulting company consisting of himself and Sheri Ann Strite, a former research coordinator at Group Health with extensive experience critiquing clinical research studies. Calling themselves “clinical usefulness detectives,” the pair teach doctors how to think more analytically about medical science. In 2006, they spent three days with a group of back surgeons in Boise, Idaho, the spinal fusion capital of the United States. In 2003, surgeons in Boise performed 8.2 back surgeries per 1,000 Medicare enrollees in the surrounding population, over twice the national average. “I think the whole mountain region has a lot of orthopedists and spine surgeons,” says Douglas Dammrose, senior vice president and medical director at BlueCross Idaho. “They’re outdoorsy types. They like to ski.”
Stuart and Strite led a group of spine surgeons and physiatrists (doctors who specialize in physical therapy) through a course intended to teach them how to pick apart medical research. They started the group out easy, with studies that were clearly biased or that did not include enough patients to come to a statistically significant result. The surgeons learned quickly, and by the third day were eager to examine the studies they themselves chose as the best evidence available for spinal fusion surgery.
Not one of the studies, it turns out, was scientifically rigorous enough to be considered valid evidence that spinal fusion is superior to nonsurgical remedies. (The SPORT trial had not yet been reported.) “It was quite an eye-opening experience for them,” says Stuart. In one of the studies the surgeons chose, a reanalysis of the data showed that fifty-four out of 109 patients who got nonsurgical treatment felt better after one year, as compared to fifty-five out of 109 patients who felt better after receiving surgery, not exactly a ringing endorsement.
After dissecting all of their studies, the surgeons sat in stunned silence. “Somebody said, ‘Wow, we’ve gone through everything we’ve relied on,’” Stuart recalls. One doctor threw up his hands. It seemed to Stuart and Strite as if the physicians, in recognizing that the evidence for their craft was lacking, were going through the stages of grief. One surgeon seemed sad. Another said the studies had to be wrong; he just knew he was relieving his patients’ pain. Later, a doctor confronted Stuart in the hall, saying angrily that he didn’t have time to read the journals, and it was unreasonable to expect physicians to spend their weekends critically evaluating the literature.
All of which points to the need for a national strategy for improving the evidence base of medicine. We need an independent agency that would fund systematic reviews of the medical literature, as well as clinical trials to test the comparative effectiveness of everything from drugs to treatments. An agency that could help Medicare and other payers know what to cover, and what’s still experimental. An agency, in short, that would look a lot like the AHCPR probably would today if it hadn’t been derailed in 1996.
It doesn’t much matter who does it, as long as the job gets done. It could be a new institute, as Senators Barack Obama and Hillary Clinton have called for. The NIH could take it on, provided the director could be persuaded that testing existing treatments is as important as finding new cures. Or, it could be a beefed-up version of the AHRQ. For argument’s sake, let’s give it a new name: the Agency for Clinical Effectiveness, or ACE.
The first thing such an agency would need is independence to render sound medical judgment without political interference. Just this August, the Washington Post reported that the Department of Health and Human Services, under heavy pressure from the infant formula industry, had buried the AHRQ’s comprehensive finding that breast-feeding leads to better health in babies. To prevent this kind of thing from happening, the ACE would need the kind of political insulation afforded the Federal Reserve Board, which is able to set sound monetary policy partly because it is largely immune from congressional pressure. One way to do that would be to create an advisory board of three to six members whose presidential appointments lasted six years and ended in a staggered fashion, so no one president could clean house. No more than a third of the board should come from industry. The board would help the agency set research priorities for performing appraisals of existing data and for funding new clinical studies.
Such studies wouldn’t come cheap. But there is already a plan on the table to pay for them. A bipartisan House bill proposed in June would raise $3 billion in federal and private money to fund comparative effectiveness trials over the next five years. This would amount to less than three-hundredths of 1 percent of what we are expected to spend on health care in that same interval.
The agency would have an immediate effect on the quality of health care simply by serving as a source of valid clinical data, and making its research findings readily accessible to payers, doctors, and the public. In order to start bringing down costs, however, the ACE would also need the power to make recommendations to Medicare about what to reimburse. On the basis of current evidence for spinal fusion, for example, the ACE might advise Medicare to pay for surgeries that are clearly needed, as in cases of spinal tumors, and to reimburse provisionally only after a patient has failed to get relief from nonsurgical treatment.
Such research won’t be greeted with open arms. Specialists who make money off expensive procedures may not welcome research that could hurt their incomes; many people on both the right and the left view any effort by government to oversee medical practice as nothing more than a poorly disguised attempt to save money by denying care.
But the most sustained resistance to establishing the ACE is likely to come from device and drug makers. Already, medical manufacturers are quietly trying to undermine provisions in the SCHIP bill to fund comparative effectiveness research. Their argument goes something like this: the high profits we earn today are the engine that drives the cures of tomorrow, and anything that threatens those profits could kill medical innovation.
But is this true? Other industries, like high tech, manage to remain innovative on far lower margins than the 20 percent and 30 percent enjoyed by, respectively, the pharmaceutical industry and the implantable device makers. It’s hard to imagine that better information would seriously impair either industry, though it will undoubtedly hurt some manufacturers and might even put some small device makers out of business, freeing up capital for the creation of new companies and better products.
The most likely result of technology assessment and comparative effectiveness research is that it will bring about a more efficient market in medicine. Right now, most devices and drugs command exorbitant prices in the U.S., regardless of their actual benefit to our health. And many procedures and tests are being reimbursed without evidence that they are doing patients any good. If an agency such as the ACE were empowered to produce meaningful research, we would no longer wind up paying top dollar for medical technologies and treatments that aren’t significantly better than cheaper alternatives, or, to put it more simply, doing the equivalent of paying Mercedes-Benz prices for a Chevy. That’s an argument that even Newt Gingrich ought to agree with.