# Appendix A: Estimation of Sample Size Requirements for Randomized Controlled Clinical Trials

The formulas used to estimate sample size requirements are provided in this appendix. Also provided are illustrative calculations relative to the Diabetes Control and Complications Trial described in Chapter 7: Clinical Trials.

Prior to undertaking this study, the investigators specified an alpha level (0.05, or 5%), statistical power (90%, and thus a beta level of 10%), and the outcome difference that should be detected by the trial (a reduction in the proportion of patients diagnosed with diabetic retinopathy from 20% to 10%). The baseline proportion of subjects who would develop retinopathy was derived from previous literature. The amount of reduction in retinopathy was based on clinical judgment; the following question was posed: “What would be a clinically important difference in the proportion of patients who would suffer this complication?”

The required sample size for comparing two proportions can be estimated from equation (1):

$$
n = \left[ \frac{z_\alpha \sqrt{2\pi_c(1-\pi_c)} - z_\beta \sqrt{\pi_t(1-\pi_t) + \pi_c(1-\pi_c)}}{\pi_t - \pi_c} \right]^2 \tag{1}
$$

where $n$ is the number of subjects for each treatment group, $\pi_c$ and $\pi_t$ are the proportions of patients who develop retinopathy within 5 years in the control group (standard therapy) and treatment group (intensive therapy), respectively, and $z_\alpha$ and $z_\beta$ are the values that include alpha in the two tails and beta in the lower tail of the standard normal distribution. These values can be determined from tables available in most statistical texts (see Dawson and Trapp, 2004; complete publication data can be found at the end of Chapter 7: Clinical Trials). The $z_\alpha$ value for a type I error of 5% is 1.96, and the $z_\beta$ value for a type II error of 10% is $-1.28$. As the acceptable level of error decreases, the absolute values of $z_\alpha$ and $z_\beta$ increase.
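As an alternative to printed tables, the two critical values can be computed directly. The following is a minimal sketch in Python using the standard library's `statistics.NormalDist`; the helper name `z_values` is illustrative and not part of the original text. It assumes, as stated above, that alpha is split across both tails and that beta lies in the lower tail.

```python
from statistics import NormalDist


def z_values(alpha, beta):
    """Return (z_alpha, z_beta) for a two-tailed alpha and a lower-tail beta."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # alpha divided between the two tails
    z_beta = std_normal.inv_cdf(beta)            # beta in the lower tail, hence negative
    return z_alpha, z_beta


z_alpha, z_beta = z_values(alpha=0.05, beta=0.10)
print(f"z_alpha = {z_alpha:.2f}, z_beta = {z_beta:.2f}")
```

Rounded to two decimal places, these agree with the tabulated values of 1.96 and -1.28.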

Note that in equation (1), the larger the absolute values of $z_\alpha$ and $z_\beta$ (that is, the smaller the acceptable type I and type II errors), the larger the sample size required; also, the smaller the difference between $\pi_c$ and $\pi_t$, the larger the sample size required. What may be less intuitively obvious is the relation of sample size to the distance of $\pi_c$ from 0.5. The term $\pi_c(1 - \pi_c)$ is maximized when $\pi_c = 0.5$, so the numerator is greatest there. For a given difference between $\pi_c$ and $\pi_t$, movement of $\pi_c$ away from 0.5 therefore reduces the required sample size.
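The effect of the control proportion can be checked numerically. The sketch below is in Python and rests on an assumption: that equation (1) has the form n = [(z_alpha·sqrt(2·pi_c·(1 − pi_c)) − z_beta·sqrt(pi_t·(1 − pi_t) + pi_c·(1 − pi_c))) / (pi_t − pi_c)]². The function name `n_per_group` and the three control proportions are chosen only for illustration, and the detectable difference is held at 0.10 throughout.

```python
from math import sqrt


def n_per_group(p_control, p_treatment, z_alpha=1.96, z_beta=-1.28):
    """Per-group sample size under the assumed form of equation (1)."""
    numerator = (
        z_alpha * sqrt(2 * p_control * (1 - p_control))
        - z_beta * sqrt(p_treatment * (1 - p_treatment) + p_control * (1 - p_control))
    )
    return (numerator / (p_treatment - p_control)) ** 2


# Keep the difference fixed at 0.10 and vary only the control proportion.
sizes = {p_c: n_per_group(p_c, p_c - 0.10) for p_c in (0.2, 0.5, 0.8)}
for p_c, n in sizes.items():
    print(f"pi_c = {p_c:.1f}: n = {n:.1f} per group")
```

Of the three settings, the control proportion of 0.5 calls for the most subjects per group.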

If we expected the proportion of patients on standard insulin therapy for diabetes that would develop retinopathy by year 5 to be 0.20, and we wanted this trial to be able to detect a reduction in retinopathy at 5 years from 0.20 to 0.10, then the sample size would be calculated as follows:

$$
n = \left[ \frac{1.96\sqrt{2(0.20)(0.80)} - (-1.28)\sqrt{(0.10)(0.90) + (0.20)(0.80)}}{0.10 - 0.20} \right]^2 = \left[ \frac{1.109 + 0.640}{-0.10} \right]^2 \approx 305.8 \tag{2}
$$

and, truncating to a whole number, $n$ = 305 (rounding up instead would give 306). Therefore, a total of 610 subjects equally divided between groups would be required to answer the following question: “Is there a reduction in the rate of retinopathy at 5 years from 20% to 10% using intensive rather than standard insulin therapy?” This can be restated as follows: If the true rates of retinopathy at 5 years are 10% with intensive therapy and 20% with standard therapy, then the probability that the researchers will fail to find a difference between the proportions of subjects developing retinopathy during the first 5 years of therapy, with an equally divided sample of 610, would be only 10%.
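The arithmetic behind this figure can be retraced term by term. The Python sketch below again assumes that equation (1) has the form n = [(z_alpha·sqrt(2·pi_c·(1 − pi_c)) − z_beta·sqrt(pi_t·(1 − pi_t) + pi_c·(1 − pi_c))) / (pi_t − pi_c)]²; the variable names are illustrative, and the final truncation is an assumption made to match the per-group figure of 305 quoted in the text.

```python
from math import sqrt

# Design values stated in the text.
z_alpha, z_beta = 1.96, -1.28
p_control, p_treatment = 0.20, 0.10

# Assumed form of equation (1), evaluated one term at a time.
null_term = z_alpha * sqrt(2 * p_control * (1 - p_control))
alt_term = z_beta * sqrt(
    p_treatment * (1 - p_treatment) + p_control * (1 - p_control)
)
n_exact = ((null_term - alt_term) / (p_treatment - p_control)) ** 2

# Truncate to a whole number of subjects, as the quoted figure does.
n_per_group = int(n_exact)
print(f"exact n = {n_exact:.1f}, per group = {n_per_group}, total = {2 * n_per_group}")
```

Because the exact value is slightly below 306, a design that rounds up rather than truncates would enroll 306 per group, or 612 in total.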

Because the diabetes trial had over 305 subjects in each group, the likelihood of not finding a true difference of this magnitude was actually less than 10%.

If average glucose levels 5 years after beginning therapy had been the measure chosen to compare the two treatment groups, the required sample size could have been determined using the following equation:

$$
n = 2\left[ \frac{(z_\alpha - z_\beta)\,\sigma}{\mu_1 - \mu_2} \right]^2 \tag{3}
$$

where $n$ is the number of subjects for each treatment group, $\mu_1 - \mu_2$ is the detectable difference between the means of the two groups, $\sigma$ is the common standard deviation of each group, and $z_\alpha$ and $z_\beta$ have the same meaning as in equation (1).

Again, without memorizing this formula, we can intuitively understand how its various components contribute to sample size. The greater $\sigma$ and the absolute values of $z_\alpha$ and $z_\beta$, and the smaller the difference in the means, $\mu_1 - \mu_2$, the larger the sample size, $n$, required (see Table 7–3). This makes sense, as smaller differences in means between groups would be harder to detect, and greater variability within the groups would tend to blur intergroup differences. As in all sample size calculations, the smaller the acceptable type I and type II errors (that is, the larger the absolute values of $z_\alpha$ and $z_\beta$), the larger the sample size required.

Suppose the investigators estimated that the mean glucose levels at 5 years after the start of therapy would be 200 mg/dL for patients on standard insulin therapy and 175 mg/dL for those on intensive therapy, and the pooled standard deviation would be 45 mg/dL. The sample size for each group would be calculated using equation (3):

$$
n = 2\left[ \frac{(1.96 - (-1.28))(45)}{200 - 175} \right]^2 = 2(5.832)^2 \approx 68.0 \tag{4}
$$

and $n$ = 68. This end point would have required far fewer patients to be enrolled in the trial. At the conclusion of the trial, however, treating physicians might not have considered a reduction in glucose levels to be a sufficiently important outcome to warrant a change to the more intensive therapy; by contrast, if the trial found that intensive therapy reduced the onset of retinopathy, intensive therapy would be judged to be a superior treatment regimen.
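The glucose calculation can be reproduced in a few lines. The Python sketch below assumes that equation (3) has the form n = 2·[(z_alpha − z_beta)·sigma / (mu_1 − mu_2)]²; the function name `n_two_means` is illustrative. A second call halves the detectable difference (a hypothetical scenario, not one taken from the trial) to show how sharply the requirement grows as the difference shrinks.

```python
def n_two_means(mean_1, mean_2, sigma, z_alpha=1.96, z_beta=-1.28):
    """Per-group sample size under the assumed form of equation (3)."""
    return 2 * ((z_alpha - z_beta) * sigma / (mean_1 - mean_2)) ** 2


# Values from the text: 200 versus 175 mg/dL, pooled standard deviation 45 mg/dL.
n_glucose = n_two_means(mean_1=200, mean_2=175, sigma=45)

# Hypothetical comparison: the same design with half the detectable difference.
n_half_difference = n_two_means(mean_1=200, mean_2=187.5, sigma=45)

print(f"n per group = {n_glucose:.1f}")
print(f"n per group with half the difference = {n_half_difference:.1f}")
```

Because the difference in means enters the formula as a squared denominator, halving it quadruples the required sample size.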