Background. Dehydration from viral gastroenteritis is a significant pediatric health problem. Oral rehydration therapy (ORT) is recommended as first-line therapy for both mildly and moderately dehydrated children; however, three quarters of pediatric emergency medicine physicians who are very familiar with the American Academy of Pediatrics recommendations for ORT still use intravenous fluid therapy (IVF) for moderately dehydrated children.
Objective. To test the hypothesis that the failure rate of ORT would not be >5% greater than the failure rate of IVF. Secondary hypotheses were that patients in the ORT group will (1) require less time initiating therapy, (2) show more improvement after 2 hours of therapy, (3) have fewer hospitalizations, and (4) prefer ORT for future episodes of dehydration.
Methods. A randomized, controlled clinical trial (noninferiority study design) was performed in the emergency department of an urban children’s hospital from December 2001 to April 2003. Children 8 weeks to 3 years old were eligible if they were moderately dehydrated, based on a validated 10-point score, from viral gastroenteritis. Patients were randomized to receive either ORT or IVF during the 4-hour study. Treating physicians were masked and assessed all patients before randomization and at 2 and 4 hours of therapy. Successful rehydration at 4 hours was defined as resolution of moderate dehydration, production of urine, weight gain, and the absence of severe emesis (≥5 mL/kg).
Results. Seventy-three patients were enrolled in the study: 36 were randomized to ORT and 37 were randomized to IVF. Baseline dehydration scores and the number of prior episodes of emesis and diarrhea were similar in the 2 groups. ORT demonstrated noninferiority for the main outcome measure and was found to be favorable with secondary outcomes. Half of both the ORT and IVF groups were rehydrated successfully at 4 hours (difference: −1.2%; 95% confidence interval [CI]: −24.0% to 21.6%). The time required to initiate therapy was less in the ORT group at 19.9 minutes from randomization, compared with 41.2 minutes for the IVF group (difference: −21.2 minutes; 95% CI: −10.3 to −32.1 minutes). There was no difference in the improvement of the dehydration score at 2 hours between the 2 groups (78.8% ORT vs 80% IVF; difference: −1.2%; 95% CI: −20.5% to 18%). Less than one third of the ORT group required hospitalization, whereas almost half of the IVF group was hospitalized (30.6% vs 48.7%, respectively; difference: −18.1%; 95% CI: −40.1% to 4.0%). Patients who received ORT were as likely as those who received IVF to prefer the same therapy for the next episode of gastroenteritis (61.3% vs 51.4%, respectively; difference: 9.9%; 95% CI: −14% to 33.7%).
Conclusions. This trial demonstrated that ORT is as effective as IVF for rehydration of moderately dehydrated children due to gastroenteritis in the emergency department. ORT demonstrated noninferiority for successful rehydration at 4 hours and hospitalization rate. Additionally, therapy was initiated more quickly for ORT patients. ORT seems to be a preferred treatment option for patients with moderate dehydration from gastroenteritis.
To the editor,
The article by Spandorfer et al. (1) about rehydration was reviewed during a residents’ journal club. Following an interesting discussion, it was felt that a few methodological issues may need to be addressed.
Attention to details of design is crucial for non-inferiority trials. Two aspects of this study deserve further attention: the “role assignment” of the different treatments and the sample size calculation in the context of a non-inferiority trial.
The first issue relates to the role assignment chosen by the authors for ORT and IVF in their trial. In the context of a non-inferiority trial, the experimental procedure/treatment must be tested against the most current gold standard therapy (known as the active control) (2). Traditionally, the inherent therapeutic value of the gold standard is documented with placebo-controlled trials. In this case, it would be reasonable to assume that both IVF and ORT are superior to placebo, but which one is the gold standard and, deservedly, the active control in the trial? Recent evidence from a small randomized controlled trial mentioned by Spandorfer and colleagues suggests that ORT is superior to IVF (3). Furthermore, according to the authors, “ORT is recommended by the AAP and the WHO as first-line therapy for mild to moderate dehydration.” (1) In this context, it is very interesting to note that the role of active control was paradoxically assigned to IVF. Unfortunately, no reference to placebo-controlled studies is provided to support the choice of IVF as the active control. In theory, this constitutes not only an effective demotion of ORT within the hierarchy of possible therapeutic avenues, which does not appear to be supported by a priori evidence, but it may also insidiously promote the primacy of physicians’ pragmatic preferences over available evidence.
This leads to two important questions. First, does this constitute evidence of the authors’ occult biases, crystallized for us in the methodology of their study? Contrary to their hypothesis that “if ORT was shown to be as effective as IVF, practitioners might be more likely to adopt [ORT] in their practice”, we contend that a demonstration of non-inferiority is just as likely to be interpreted quite differently by physicians: why should I change my practice if it makes no difference? A superiority study with a head-to-head comparison of ORT versus IVF (as in the study by Atherly-John et al. (3)) would have been much more informative, and might have effectively dismantled the current status quo. Because the authors chose a non-inferiority trial design and opted for IVF as the active control for no clear reason, one may argue that the fate of ORT versus IVF was essentially sealed from inception.
Second, and more importantly, could the conclusions of the study have been different if the role assignment had been inverted? Indeed, from a purely logical standpoint, since ORT is established as first-line therapy, it seems rather obvious that it should at least be non-inferior to IVF; otherwise IVF would be first-line therapy (provided that the available evidence is solid, which it is not). Note that the opposite is not necessarily true. Let us illustrate this concept with an example: if treatment A is established as the gold standard for disease X, asking whether A is non-inferior to a new, unproven, experimental treatment B may be seen as a rather fruitless demonstration. Indeed, if treatment B has no activity whatsoever against disease X, treatment A will inevitably be found non-inferior to B; non-inferiority trials are not designed to demonstrate superiority, nor equivalence for that matter (2). What clinicians want to know is whether or not B is non-inferior to A; the inverted proposition is of questionable usefulness from a clinical standpoint.
The second issue pertains to sample size calculation, which is particularly important to maximize the reliability of the results obtained in non-inferiority trials (4). Typically, sample sizes in non-inferiority trials are much larger than those of placebo-controlled trials, owing to the stringency of the delta value used. In this case, an a priori expected sample size of 50 per arm appeared, at first glance, rather small (the actual total number of patients included in the final analysis is 73) (1). Unfortunately, the methodological approach used to determine sample size is neither mentioned nor referenced by the authors. We therefore performed the sample size calculation according to the guidelines provided by Jones et al. for determining the optimal sample size of a one-sided equivalence trial (i.e., a non-inferiority trial) (4). It is noteworthy that this article is widely cited to support sample size calculations for such trials (Web of Science reports more than 280 citations). The formula is as follows:
$$N = \frac{2p(100-p)}{\delta^{2}} \left[ z_{1-\alpha} + z_{1-\beta} \right]^{2}$$
where N is the sample size per arm, p the overall percentage of successes to be expected if the treatments are equivalent, delta the margin of non-inferiority, alpha the type I error probability (significance) and beta the type II error probability (1-beta = power). Since a failure rate of 20% for ORT is mentioned in the article, it will be assumed that the expected rate of success is 80% for both arms. The margin of non-inferiority was set at 5%. The one-sided alpha was 0.05, and the study was powered at 80% (i.e., equivalent to Z-scores of 1.65 and 0.84, respectively). Assuming that all these values were correctly extracted from the article (and they may not have been, since they were not all specifically and clearly identified), and substituting them into the equation above, we obtain
$$N = \frac{2 \times 80 \times (100-80)}{5^{2}} \times \left[ 1.65 + 0.84 \right]^{2}$$
N = 793 patients per arm
N total = 2N = 1586 patients
According to this calculation, the optimal sample size needed to draw reliable conclusions from a non-inferiority trial using the pre-specified parameters delineated by the authors would be 1586 patients. Considering the short time course of this study (4 hours), it would be reasonable to argue that adding 10-25% more patients to the calculated N, to account for “dropouts”, would be unnecessary; the reasons underlying this decision should nonetheless be stated. It is noteworthy that raising the margin of non-inferiority to 10% would have reduced N by a factor of 4. On the other hand, raising the power to 90% (Z-score of 1.28) would have increased N by roughly 40%.
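The arithmetic in this calculation can be checked with a few lines of Python. This is only a minimal sketch of the formula as quoted in this letter, using the same rounded Z-scores; the function name `noninferiority_n_per_arm` and the variable names are illustrative assumptions and do not come from Jones et al. or from the trial report.

```python
def noninferiority_n_per_arm(p, delta, z_alpha, z_beta):
    """Per-arm sample size from the formula quoted in the letter.

    p       expected success rate in both arms, in percent
    delta   non-inferiority margin, in percentage points
    z_alpha Z-score for the one-sided significance level
    z_beta  Z-score for the desired power
    """
    return 2 * p * (100 - p) / delta**2 * (z_alpha + z_beta) ** 2


# Parameters as extracted in the letter: p = 80%, delta = 5%,
# one-sided alpha = 0.05 (Z = 1.65), power = 80% (Z = 0.84).
base = noninferiority_n_per_arm(80, 5, 1.65, 0.84)

# Sensitivity checks discussed in the letter.
wider_margin = noninferiority_n_per_arm(80, 10, 1.65, 0.84)  # delta = 10%
higher_power = noninferiority_n_per_arm(80, 5, 1.65, 1.28)   # power = 90%

print(f"per arm: {base:.1f}")
print(f"reduction factor with a 10% margin: {base / wider_margin:.2f}")
print(f"relative increase at 90% power: {higher_power / base - 1:.1%}")
```

With these inputs the per-arm value evaluates to about 793.6, which is reported here as 793. Doubling the margin divides N by exactly 4, and moving from 80% to 90% power increases N by a little under 40%.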
Assuming that the above calculations are sound, it appears that the analysis presented was carried out on a sample amounting to a mere 5% of the minimal number of patients that should have been expected. Jones et al. warned that “the finding of equivalence [or non-inferiority] may arise either from true equivalence [or non-inferiority] or from a trial with poor discriminatory power – a trial which was too small.” (4) If this study is truly underpowered to reject the null hypothesis, as the calculation above suggests, must we conclude that the validity of the conclusions drawn is questionable? In the event that another validated method for sample size calculation was used, the equation(s) used as well as the source reference would be greatly appreciated.
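To give a sense of scale for this shortfall, the same formula can be solved for the power term at the enrolled sample size. The sketch below is a rough back-calculation under the letter's assumed parameters (p = 80%, margin 5%, one-sided Z of 1.65) and a normal approximation; it is not a result reported by the trial or by Jones et al., and the helper name `implied_power` is hypothetical.

```python
import math
from statistics import NormalDist


def implied_power(n_per_arm, p, delta, z_alpha):
    """Invert the quoted sample-size formula to get the power for a given N.

    Rearranging N = 2p(100-p)/delta^2 * (z_alpha + z_beta)^2 gives
    z_beta = sqrt(N * delta^2 / (2p(100-p))) - z_alpha.
    """
    z_beta = math.sqrt(n_per_arm * delta**2 / (2 * p * (100 - p))) - z_alpha
    return NormalDist().cdf(z_beta)


enrolled_total = 73    # patients analysed in the trial
required_total = 1586  # total derived in this letter

fraction = enrolled_total / required_total
# Assumption: the 73 patients are treated as two equal arms of 36.5.
power = implied_power(enrolled_total / 2, 80, 5, 1.65)

print(f"enrolled / required: {fraction:.1%}")
print(f"implied power: {power:.1%}")
```

Under these assumptions the enrolled sample is a little under 5% of the calculated requirement, and the implied power is on the order of 13% rather than the intended 80%.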
Despite the pragmatic attractiveness of the results presented by Spandorfer et al. for any pediatric emergency department, it seems reasonable to expect the authors to address the issues raised herein. It would be particularly important for the authors to clarify further why they decided to perform a non-inferiority trial, and not a superiority trial.
Competing interests: none declared.
1. Spandorfer PR et al. (2005) Pediatrics 115:295-301.
2. Gomberg-Maitland M et al. (2003) Am Heart J 146:398-403.
3. Atherly-John YC et al. (2002) Arch Pediatr Adolesc Med 156:1240-1243.
4. Jones B et al. (1996) BMJ 313:36-39.