Miscounting Disease Management Outcomes
by Scott MacStravic
There are literally hundreds of published reports on the results of disease management investments available to interested readers. It is not unusual for them to cite ROI ratios of 2:1, 3:1, 5:1 and even higher, making it seem odd that the federal government keeps coming up with mixed, inconsistent or inadequate evidence in its frequent studies of DM’s cost effectiveness. One severe handicap limiting the federal evaluations is that they look for results too early.
Their time myopia, reflected in reports that often address results in only one or two years, is somewhat understandable, considering the high cost of many of the federal demonstration projects. Congress and its constituents alike want to know if they are getting anything for their money, and when they are getting it. But given the necessity of achieving significant changes in the behavior of patients suffering from the diseases being managed, it should not be a surprise that results take a while to be fully realized.
The disease and health management vendor Gordian Health Solutions has reported that on average, its clients’ ROI ratios have been 1.69:1 in the first, then 2.00:1 in the second, and 2.46:1 in the third year. [“When It Makes Cents to Back into the 80/20 Rule” Gordian Health Solutions Mar 2004] If results in general tend to be almost half again as great in the third year as they are in the first, then judging results after only one or two years can clearly produce erroneous judgments.
GlaxoSmithKline followed over 6,000 of its employees who participated in its Contract for Wellness program for four years. The savings it reported over this period averaged only $233 in the first, $375 in the second, then jumped to $944 in the third and $950 in the fourth year. [G. Stave “Quantifiable Impact of the Contract for Health and Wellness” JOEM 45:2 Feb 2003 109-117] This report is one of the few examples of a cohort analysis, whereas most results reported come from a mix of people in terms of how long they have individually been participating in a DM initiative.
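The effect of the evaluation window on the GlaxoSmithKline figures can be checked with simple arithmetic. The sketch below uses only the four yearly savings figures quoted above; the function name average_through is a hypothetical helper introduced for illustration, not something from the JOEM report.

```python
# Yearly savings figures quoted above for the GlaxoSmithKline cohort (dollars).
SAVINGS_BY_YEAR = [233, 375, 944, 950]


def average_through(year: int) -> float:
    """Average yearly savings seen by an evaluation that stops after `year` years.

    Hypothetical helper for illustration only.
    """
    if not 1 <= year <= len(SAVINGS_BY_YEAR):
        raise ValueError("year must be between 1 and 4")
    return sum(SAVINGS_BY_YEAR[:year]) / year


for year in range(1, len(SAVINGS_BY_YEAR) + 1):
    print(f"evaluation after {year} year(s): average savings ${average_through(year):.2f}")

# A two-year evaluation sees an average of 304.00;
# the full four-year window averages 625.50, roughly twice as much.
```

On these numbers, an evaluation that stopped after two years would have reported less than half of the average yearly savings visible over the full four years.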
When faced with such myopia with respect to time, it is perhaps understandable that there has long been a tendency for DM providers, and even their customers, to fall into one or more evaluation “traps” when looking for early results. The most common is based on “regression to the mean”, the tendency for outlier medical care expenses to fall back toward the mean in the year following their occurrence, merely because people do not usually persist in very high levels of costs year after year.
When targets for DM efforts are selected specifically because of their high levels of expense in a given year, the chances that their expense levels will fall in the next year are quite high, regardless of whether or not they are being “managed”, so a substantial portion of the apparent “savings” in that next year would be a natural statistical phenomenon, rather than the result of any intervention. If evaluations do not control for this natural tendency, they are very likely to overstate the portion of the difference between the first and the baseline year that is truly caused by DM interventions.
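Regression to the mean is easy to reproduce in a small simulation. The sketch below is illustrative only: the cost distribution, the noise level, and the 10 percent selection threshold are all assumptions chosen for the example, not figures from any DM study. No intervention of any kind is modeled, yet the group selected for high costs in year one still shows lower average costs in year two.

```python
import random
import statistics


def simulate_regression_to_mean(n_people=20000, top_fraction=0.10, seed=0):
    """Return (year-1 mean, year-2 mean) for the highest-cost group in year 1.

    All parameters are hypothetical. Nobody in this simulation is "managed".
    """
    rng = random.Random(seed)
    # Each person has a stable underlying cost level (assumed log-normal)...
    underlying = [rng.lognormvariate(8.0, 0.5) for _ in range(n_people)]
    # ...and each year's actual cost is that level times independent random noise.
    year1 = [u * rng.lognormvariate(0.0, 0.8) for u in underlying]
    year2 = [u * rng.lognormvariate(0.0, 0.8) for u in underlying]

    # Select the DM "targets": the top fraction of spenders in year 1.
    n_selected = int(n_people * top_fraction)
    cutoff = sorted(year1, reverse=True)[n_selected - 1]
    selected = [i for i in range(n_people) if year1[i] >= cutoff]

    mean_year1 = statistics.fmean(year1[i] for i in selected)
    mean_year2 = statistics.fmean(year2[i] for i in selected)
    return mean_year1, mean_year2


mean_year1, mean_year2 = simulate_regression_to_mean()
print(f"selected group, year 1 average cost: {mean_year1:,.0f}")
print(f"selected group, year 2 average cost: {mean_year2:,.0f}")
print(f"apparent 'savings' with no intervention: {mean_year1 - mean_year2:,.0f}")
```

A before-and-after comparison on this group would credit the whole drop to the program, which is why an evaluation needs a comparison group selected the same way or a predicted cost that already accounts for the expected decline.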
The second most common trap reflected in DM results reports is what is called “self-selection bias”. Whenever enrollment in a DM intervention is open to all and purely voluntary, it is likely that those who volunteer will be significantly more receptive to DM support, more concerned about their health, and more likely to be already adhering to the medications and lifestyle regimens prescribed by their physicians than the overall average for all those with the same disease. If evaluations simply compare the medical costs for participants to the costs for non-participants, the differences noted will include many differences that pre-existed the DM intervention and are not caused by it.
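Self-selection bias can be sketched the same way. In the hypothetical model below, each patient has an "engagement" trait that both lowers costs and raises the chance of volunteering; the dollar amounts and the trait itself are assumptions made up for illustration. The program is given zero true effect, yet the naive participant vs. non-participant comparison still shows participants costing less.

```python
import random
import statistics


def naive_comparison(n_people=20000, seed=1):
    """Return (mean participant cost, mean non-participant cost).

    Hypothetical model: the program changes nobody's costs at all.
    """
    rng = random.Random(seed)
    participant_costs = []
    nonparticipant_costs = []
    for _ in range(n_people):
        # Assumed trait: 0 = disengaged, 1 = highly engaged with their own care.
        engagement = rng.random()
        # Engaged patients cost less on average, with or without any program.
        cost = 10000 - 4000 * engagement + rng.gauss(0, 1500)
        # Engaged patients are also more likely to volunteer.
        volunteers = rng.random() < engagement
        if volunteers:
            participant_costs.append(cost)  # zero true program effect
        else:
            nonparticipant_costs.append(cost)
    return statistics.fmean(participant_costs), statistics.fmean(nonparticipant_costs)


participants, nonparticipants = naive_comparison()
print(f"participants:     {participants:,.0f}")
print(f"non-participants: {nonparticipants:,.0f}")
print(f"apparent 'savings' from a program with no effect: {nonparticipants - participants:,.0f}")
```

The gap comes entirely from who chose to enroll, which is the pre-existing difference described above.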
Those who invest in DM efforts are as much at risk as those who provide such efforts. The decision makers are at risk for having made a mistake, with all the psychological, social, and career damage such an error could cause. Unless they are willing to insist on rigorous evaluation, they may latch on to any evaluation that saves them from looking bad, regardless of any weaknesses and errors in calculations. And vendors are at just as much risk, with many having guaranteed positive results, so they suffer from the same tendency.
This makes it understandable that the federal studies employ far more rigorous evaluation approaches than do most health plans or employers. They often insist on “controlled studies” where comparisons are not between self-selected cohorts, but between randomly assigned participants and non-participants in the DM effort. Or they will create a predicted cost for years after the baseline year, rather than rely on before vs. after comparisons.
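Why random assignment helps can be shown with a short simulation. The sketch below is a self-contained illustration under invented assumptions: a patient "engagement" trait that lowers costs, a cost formula, and a true program saving of $500 per person, none of which come from any federal study. Because a coin flip rather than the patient decides who participates, engaged and disengaged patients end up in both groups in similar proportions, and the difference in average cost lands close to the true saving.

```python
import random
import statistics


def randomized_comparison(true_savings=500.0, n_people=20000, seed=2):
    """Estimate program savings as (mean control cost - mean treated cost).

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    treated_costs = []
    control_costs = []
    for _ in range(n_people):
        # Assumed trait that lowers costs regardless of the program.
        engagement = rng.random()
        cost = 10000 - 4000 * engagement + rng.gauss(0, 1500)
        # Random assignment: participation does not depend on engagement.
        if rng.random() < 0.5:
            treated_costs.append(cost - true_savings)
        else:
            control_costs.append(cost)
    return statistics.fmean(control_costs) - statistics.fmean(treated_costs)


estimate = randomized_comparison()
print(f"true savings built into the model: 500")
print(f"savings estimated from the randomized comparison: {estimate:,.0f}")
```

The estimate will not match the true figure exactly because of sampling noise, but it is not systematically inflated by differences between the kinds of people who enroll and those who do not.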
Rigorous evaluations take longer and cost significantly more than the simple approaches too often used, adding to the risk that simple approaches will be used, and that regression to the mean and self-selection bias will significantly distort findings. Even so, it is usually best to hire an independent and expert source for such evaluations, since the wrong judgments can either deprive sponsors of worthy investments, or lead them to persist in unworthy ones.





