Research Article | Volume 20 Issue 1 (Jan-Dec, 2015) | Pages 1 - 5
Is the Devil in the Detail? Autopsies of Neutral Heart Failure Trials
Author affiliations: Department of Medical Sciences, Cardiology, Uppsala University, Uppsala, Sweden; Uppsala Clinical Research Center, Uppsala University, Uppsala, Sweden. Address for correspondence: Ola Vedin, Uppsala Clinical Research Center/MTC, Dag Hammarskjölds väg 148, 75237 Uppsala, Sweden
Under a Creative Commons license
Open Access
Jan. 1, 2017
May 5, 2017
July 8, 2017
Dec. 31, 2017

The 14th Global CardioVascular Clinical Trialists Forum was attended by representatives from academia, industry, and regulatory authorities, as well as by patients, to discuss contemporary cardiovascular outcome trials. In a session titled "Positive signals from recent neutral heart failure trials: time for an autopsy," chaired by Mona Fiuzat (US) and Christopher O'Connor (US), investigators from several of the most discussed heart failure (HF) trials in recent years gave their viewpoints on why the trials did not render a positive result. The lessons learned encompassed most aspects of a clinical trial, and the speakers emphasized the absolute necessity of complete control and oversight of the trial to avoid disappointments further down the line.


The predefined inclusion and exclusion criteria of a clinical trial serve to ensure that the included patients are those presumed to benefit the most from the studied therapy and to establish adequate safety. In RELAX-AHF2, patients with acute HF were randomized to treatment with the vasodilator serelaxin or placebo. The primary end point, a composite of cardiovascular death at 6 months and worsening HF within 5 days, was not met. Michael Felker (US), one of the investigators, stated that, while the intended population as defined in the protocol was the one actually enrolled, safety considerations regarding potential study treatment effects on blood pressure had prompted an inclusion criterion of a blood pressure ≥125 mm Hg before randomization. It is well known that, in acute HF, patients with higher blood pressure fare considerably better than those who are hypotensive, which is, in part, a reflection of different degrees of hemodynamic impairment and disease severity. Thus, the relatively high blood pressure cut-off resulted in a study population with a relatively good prognosis, in whom a treatment effect is more difficult to establish. This interpretation was supported by the relatively low event rates in the study (8.7% vs 8.9% cardiovascular mortality at 180 days in the serelaxin vs placebo group, respectively).
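Why a low event rate makes a treatment effect harder to establish can be illustrated with a standard sample-size approximation for comparing two proportions. The sketch below is illustrative only: the function name `patients_per_arm`, the 20% relative risk reduction, the 18% comparison event rate, and the 5% two-sided alpha with 80% power are assumptions chosen for the example, not the design parameters of RELAX-AHF2. Only the roughly 9% control event rate is taken from the figures reported above.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(control_rate, relative_risk_reduction, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to compare two event proportions.

    Uses the usual normal-approximation formula for a two-sided test.
    All inputs are illustrative assumptions, not values from any trial protocol.
    """
    p_control = control_rate
    p_treated = control_rate * (1 - relative_risk_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Same hypothetical 20% relative risk reduction, two different control event rates.
low_risk_population = patients_per_arm(control_rate=0.09, relative_risk_reduction=0.20)
high_risk_population = patients_per_arm(control_rate=0.18, relative_risk_reduction=0.20)

print("Per arm at 9% control event rate: ", low_risk_population)
print("Per arm at 18% control event rate:", high_risk_population)
```

Under these assumptions, halving the control event rate more than doubles the number of patients needed to detect the same relative effect, which is the mechanism by which a lower-risk population can leave a trial underpowered.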


In TOPCAT, 3 445 patients with HFPEF were randomized to treatment with spironolactone or placebo. In the overall population, the primary end point, a composite of cardiovascular death, HF hospitalization, and aborted cardiac arrest, was not met. However, in a post hoc analysis of regional differences, the investigators found a 4-fold difference in event rates between patients in the Americas (the US, Canada, Argentina, and Brazil) and patients in Russia/Georgia. Patients in Russia/Georgia had low mortality rates, close to those of the general population and much lower than those usually observed in HF, whereas patients in the Americas had high event rates, as expected in the enrolled HF population. This, in combination with the lower levels of canrenone, a metabolite of spironolactone, observed at the 12-month study visit in patients from Russia compared with those from the US and Canada, led the investigators to believe that a significant proportion of patients from Russia/Georgia did not actually have HF and, further, may not actually have taken the study drug.


The subgroup findings from TOPCAT have been known for quite some time, but the presentation by the primary investigator Bertram Pitt (US) nevertheless sparked a lot of discussion and further explanation of the observations. For instance, while the regional differences were indeed considerable, on closer inspection, the underlying problem appeared to be related more to individual sites than to strict geographic disparities: many sites in Russia/Georgia displayed patient characteristics and outcomes similar to those in the rest of the world, whereas a few sites with very high recruitment produced the differences. Pitt stated that, while we are careful to ascertain and adjudicate HF-related outcomes in our trials, perhaps we should take more care to adjudicate the baseline diagnosis itself.


In TOPCAT, enrolled patients were stratified based on the inclusion criterion of either a previous HF hospitalization or elevated levels of natriuretic peptides. The majority of those enrolled in Russia/Georgia were included based on the criterion of a prior HF hospitalization, often without information about natriuretic peptide levels, and consequently with a presumably less reliable HF diagnosis. Further, in an additional remark, Nancy Geller (US), director of biostatistics at NHLBI, highlighted the importance of the different strata carrying equal costs. In the example of TOPCAT, a higher cost associated with the measurement of NT-proBNP could have been a factor in the low proportion of patients included based on this criterion in Russia/Georgia, which, in turn, could have contributed to the unfortunate discrepancy in enrollment.


Just as important for the success of a trial as selecting and enrolling the right study population are the ensuing study conduct and the appropriate delivery of the intervention. The possible consequences of diverging from the specific instructions for the study intervention were reflected upon by Milton Packer (US), primary investigator in TRUE-AHF. In the study, patients admitted with acute HF were randomized to a 48-hour infusion of ularitide or placebo, but the study was neutral with respect to the coprimary end points, which consisted of cardiovascular death throughout the trial and a complex measure of clinical status in the first 48 hours after the intervention. In an effort to avoid ambiguity as to the effects of the study intervention vis-à-vis other aspects of acute HF management, investigators were specifically instructed to refrain from making any changes to other treatments in the 2 hours before and 4 hours after the study drug administration. However, in 17% of patients, these predefined stability criteria were not met. In a subsequent analysis, ularitide was found to be superior among the remaining 83% in whom the stability criteria were satisfied, although the overall trial result was neutral. Packer pointed out that a significant proportion of the study population (772 out of 2 157 patients) was recruited from a few high-recruitment sites and that it was among these sites that the protocol deviation was most common.


In SOCRATES-PRESERVED, a safety and dose-finding study that evaluated vericiguat (1.25 mg, 2.5 mg, 5 mg, and 10 mg daily) vs placebo, the primary end point of change in NT-proBNP and left atrial volume at 12 weeks was not met. Javed Butler (US) offered one explanation that could have contributed to the neutral result by reducing the study's statistical power. An erroneous software update in the drug dispensation system during the study resulted in 48 patients, who were randomized to the two highest doses, receiving lower doses than intended, and they were consequently excluded from the final analysis. As a result, the potential effects of the higher doses of vericiguat could have been more difficult to detect.


Many neutral outcomes in trials, however, do not arise from flawed patient selection, enrollment, or protocol adherence, but are the fundamental result of an incorrect hypothesis. Unfounded assumptions, unrealistic effect size projections, failures of logic, and insufficient pathophysiological understanding are not uncommon in contemporary clinical trials and were discussed by the investigators.


BLAST-AHF was designed to determine the optimal dose and safety of an IV infusion of TRV027, a selective angiotensin II type 1 receptor-biased ligand, in acute HF. Although the treatment appeared safe, there was no effect on the primary end point. Peter Pang (US), primary investigator in BLAST-AHF, questioned whether the hypothesis that provided the rationale for the study, namely that neurohormonal activation is as viable a treatment target in acute HF as it has proven to be in chronic HF, was adequate. Another example came from SERVE-HF, in which the concept of adaptive servo ventilation for central sleep apnea in HFREF was tested based on previous observations of associations between central sleep apnea and an adverse prognosis. The primary end point was not met and there was even an increase in cardiovascular and all-cause mortality in the treatment arm, which has since resulted in extensive discussion and further research. Faiez Zannad (FR) highlighted the need for more detailed pathophysiological knowledge and better-founded hypotheses in general when embarking on new large-scale clinical trials; mere associations rarely suffice. Even if studied treatments do have potential and a plausible rationale, beneficial effects could go undetected if the primary end point is disproportionate to what the treatment can be expected to achieve. In the case of RELAX-AHF2 and TRUE-AHF, both testing a 48-hour infusion, there were evident hemodynamic effects, which, however, diminished after the infusion was stopped. Both studies used coprimary end points that included cardiovascular death at 6 months or throughout the trial, which may have been far too ambitious. In other cases, new pathophysiological knowledge may come to light during or after the trial that, had it been known earlier, could possibly have altered the course of the trial. Michael Bristow (US) recounted an example from the somewhat older BEST trial, which tested the β-blocker bucindolol vs placebo in advanced HF. No overall effect on the primary end point was observed, but there was an apparent interaction between race and treatment, reflecting a lack of benefit in black patients. Later research suggested that genetic polymorphisms in black patients could explain the interaction and, had this been known at the time of the trial, a different outcome might have been achieved.


In some trials, the design itself, or, more specifically, the handling of the intervention and the comparator, may prove most challenging. Michael Felker (US) emphasized the difficulty of running a strategy trial, in which the intervention is a disease-management strategy rather than a pharmacological or device intervention. In GUIDE-IT, HF patients were randomized to either NT-proBNP-guided therapy or usual care with a primary composite end point of cardiovascular mortality or HF hospitalization. Felker stated that, in the trial, both the intervention and comparator groups received excellent care and, although the NT-proBNP arm received a few more pharmacological interventions, this did not amount to any differences in outcomes. Patients in a clinical trial are generally well taken care of with respect to treatment and follow-up, which places extraordinary demands on the potency of the intervention if any benefit is to be demonstrated. Another example of a challenging strategy trial was given by Dave Whellan (US), primary investigator in HF-ACTION. The trial set out to evaluate the efficacy and safety of aerobic training among 2 331 stable HF outpatients. The intervention was both complex and ambitious, consisting of an initial 36 supervised sessions followed by home-based exercise. In the primary protocol-specified analysis, the intervention did not achieve a statistically significant reduction in the primary composite end point of all-cause mortality and hospitalization. In a supplementary prespecified analysis adjusting for highly prognostic baseline characteristics, the intervention did achieve a modest risk reduction. Whellan proposed that one explanation for the modest result could have been the difficulty of transitioning patients from the supervised to the home-based exercise phase. In the home-based phase, adherence to the training program decreased considerably, which could have lessened the impact of the intervention on outcomes, ultimately signaling that the trial and, more specifically, the intervention, may have been too ambitious.


During the session, only one of the trialists on the panel conceded defeat head-on. Karl Swedberg (SE), the primary investigator in RED-HF, a trial testing darbepoetin alfa vs placebo in anemic HF patients, said that the trial was well designed and that there was no apparent reason for its neutral outcome other than the treatment simply being ineffective. As disappointing as a neutral result can be, Swedberg pointed out that such a result is equally important and that burying one concept will allow us to move on to others that show promise. Finally, in the discussion between the panelists and the audience, a few key learning points recurred several times over the course of the session. Both Pitt and Packer emphasized the importance of a close relationship between the trial steering committee, the involved contract research organizations, the data safety monitoring committee, and the sponsor. In the cases of TOPCAT and TRUE-AHF, both investigators concluded that more efficient communication between the trial stakeholders could perhaps have allowed for a swift intervention correcting ongoing mistakes in trial inclusion and conduct. In parallel, several investigators stressed effective and close monitoring of sites to ensure the quality of the data, as rapid enrollment can clearly compromise quality. Felker proposed a solution to the problem and called for a system change with more focus on sites providing high-quality data rather than only incentivizing patient enrollment.

Copyright © Dialogues in Cardiovascular Medicine unless otherwise stated