++
The descriptive studies and the studies of familial aggregation
considered so far can provide useful, but indirect, information
about possible genetic and environmental causes of disease. If these studies
suggest that the disease has, at least in part, a hereditary basis,
further research can focus on the risks associated with specific
genetic factors such as enzymes, receptors, or structural proteins.
The following sections present a direct and an indirect approach
to studying the association of disease with specific genetic factors.
++
To understand both the direct and indirect approaches, as well
as the more complex genetic linkage analyses, we must first understand
genetic linkage. Linkage between one
gene and another arises because alleles that are closely situated
on the same chromosome tend to be passed together as a group to
offspring.
++

Thus, if a subject inherits a particular allele, he or she will
tend to also inherit other alleles present nearby on the same chromosome,
a situation termed
cosegregation.
++
At the population level, recombination or crossing over between
genes tends to eliminate or reduce imbalances related to genetic
linkage. This recombination tends to create an equilibrium in which
the frequency with which two alleles (at different genes) occur
together is simply the product of the frequencies with which each
allele occurs in the population. Nevertheless, for some gene pairs
an excess or deficiency of certain combinations of alleles occurs,
a situation termed linkage disequilibrium. Such
linkage disequilibrium can be present if a mutation (ie, a new allele)
has recently arisen in a population, or if certain combinations
of alleles at the two loci confer a survival advantage over other
combinations.
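++
The equilibrium condition described above can be checked with simple arithmetic. The following minimal sketch compares the observed frequency of a two-allele combination with the product of the individual allele frequencies. The symbol D for the difference is conventional notation that this text does not itself use, and all frequencies are hypothetical values chosen only for illustration.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> float:
    """Return D: the observed frequency of the A-B combination minus the
    product of the allele frequencies expected at equilibrium."""
    return p_ab - p_a * p_b


# Hypothetical frequencies (assumptions, not data from any study).
p_a = 0.30   # frequency of allele A at the first gene
p_b = 0.20   # frequency of allele B at the second gene
p_ab = 0.10  # observed frequency of chromosomes carrying both A and B

expected = p_a * p_b                          # frequency expected at equilibrium
d = linkage_disequilibrium(p_ab, p_a, p_b)    # excess (+) or deficiency (-)

print(f"expected at equilibrium: {expected:.2f}")
print(f"observed: {p_ab:.2f}, D = {d:.2f}")
```

With these assumed values the combination occurs more often (0.10) than the product of the allele frequencies predicts (0.06), so D is positive; a value of zero would indicate equilibrium.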
++
With the direct approach, the researcher directly studies the
possible increase in risk associated with a specific factor, such
as an alteration in the DNA sequence or variations in enzyme activity. Using
a cohort study, the researcher compares the risk among those with
a specific variation in the DNA sequence or enzyme activity with
the risk among a similar group without that variant (Figure 11–5).
Following the principles for cohort studies outlined in Chapter 8: Cohort Studies
and defining the “exposure” as carrying a
particular genotype, the researcher follows two groups of subjects—one
with the genetic variant and the other without the genetic variant—and
then determines subsequent occurrence of disease. The investigator
then calculates risks for each group, as well as the risk ratio
that measures the increase (or decrease) in risk associated with
the genetic variant. As in other cohort studies, subjects should
be free of disease at the start of follow-up and should be comparable
except for the genetic variant of interest.
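++
The risk and risk ratio calculations for such a cohort study can be sketched as follows. The function names and all counts are hypothetical and serve only to illustrate the arithmetic.

```python
def risk(new_cases: int, subjects: int) -> float:
    """Proportion of initially disease-free subjects who develop disease."""
    return new_cases / subjects


def risk_ratio(cases_variant: int, n_variant: int,
               cases_no_variant: int, n_no_variant: int) -> float:
    """Risk among subjects with the variant divided by risk among those without."""
    return risk(cases_variant, n_variant) / risk(cases_no_variant, n_no_variant)


# Hypothetical cohort: 1000 subjects with the genetic variant, 1000 without.
risk_variant = risk(30, 1000)      # 30 new cases among carriers
risk_no_variant = risk(10, 1000)   # 10 new cases among non-carriers
rr = risk_ratio(30, 1000, 10, 1000)

print(f"risk with variant: {risk_variant:.3f}")
print(f"risk without variant: {risk_no_variant:.3f}")
print(f"risk ratio: {rr:.2f}")
```

In this made-up cohort the risk ratio is 3, meaning that carriers developed disease at three times the rate of non-carriers. A risk ratio below 1 would indicate a decrease in risk associated with the variant.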
++
Alternatively, the researcher can use a case–control design
to study the association between risk of disease and a specific
genetic factor. In this type of case–control study, the “exposure” is
the genetic factor of interest (Figure 11–6). As in other
types of case–control studies (Chapter 9: Case–Control Studies), cases consist
of people with the disease of interest, and investigators select
controls from the source population, that is, the population that
gave rise to the cases. After assembling the case and control groups,
the researcher compares the frequency of the genetic factor of interest
among the cases to the corresponding frequency among the controls.
The odds ratio—the odds of the genetic factor among cases
divided by the odds of the genetic factor among controls—provides
an approximation to the risk ratio, as described in Chapter 9: Case–Control Studies. If
cases have an elevated frequency of the genetic factor, the researcher
must consider the possibility that this excess could be secondary to
the disease itself or its treatment, rather than a cause of the
disease. This possibility could occur, for example, in a study of
the association between a particular mutation and risk of developing
leukemia if the mutation was measured in blood obtained from cases
after chemotherapy, since certain chemotherapeutic agents are known
to be mutagens.
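++
The odds ratio described above can be computed directly from the four cells of a two-by-two table. The sketch below uses hypothetical counts and a hypothetical function name; it illustrates the definition only and does not address matching or adjustment.

```python
def odds_ratio(cases_with: int, cases_without: int,
               controls_with: int, controls_without: int) -> float:
    """Odds of the genetic factor among cases divided by the odds among controls."""
    odds_cases = cases_with / cases_without
    odds_controls = controls_with / controls_without
    return odds_cases / odds_controls


# Hypothetical study: 100 cases and 100 controls.
# 40 cases and 20 controls carry the genetic factor of interest.
or_estimate = odds_ratio(cases_with=40, cases_without=60,
                         controls_with=20, controls_without=80)

print(f"odds ratio: {or_estimate:.2f}")
```

Here the odds of carrying the factor are 40/60 among cases and 20/80 among controls, giving an odds ratio of about 2.67.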
++
As with other types of epidemiologic studies, researchers must
exercise care in interpreting results of genetic epidemiologic investigations.
Selection bias, confounding, and misclassification can distort genetic
epidemiologic findings just as they can distort other types of research.
Selection bias is a particular concern in case–control
studies if controls are not selected from the source population
that gave rise to the cases. Confounding is a particular concern for
factors such as race and ethnicity, because these factors relate
to risks for many types of disease and also
can be associated with the frequency of genetic markers.
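++
A small numeric sketch can show how this kind of confounding arises. It assumes two hypothetical ethnic groups that differ both in marker frequency and in the proportion of subjects who are cases. Within each group the marker is unrelated to disease, yet pooling the groups produces an apparent association. All counts are invented for illustration.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """a, b = cases with/without the marker; c, d = controls with/without it."""
    return (a * d) / (b * c)


# Hypothetical counts per group: (cases with, cases without,
#                                 controls with, controls without)
strata = {
    "group A": (40, 40, 10, 10),  # marker common, many cases
    "group B": (2, 18, 8, 72),    # marker rare, few cases
}

# Odds ratio within each group (confounder held fixed).
stratum_or = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Crude table: add the corresponding cells across groups, ignoring ethnicity.
crude_cells = tuple(sum(column) for column in zip(*strata.values()))
crude_or = odds_ratio(*crude_cells)

for name, value in stratum_or.items():
    print(f"{name}: odds ratio = {value:.2f}")
print(f"crude (groups pooled): odds ratio = {crude_or:.2f}")
```

Each group-specific odds ratio equals 1, but the crude odds ratio is about 3.3. Comparing stratum-specific with crude estimates in this way is one simple check for confounding.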
++
Another important consideration arises in interpreting studies
of the association between risk of disease and a specific DNA alteration,
as measured by a marker allele. This kind of study is typified by
studies of the association between insulin-dependent (type I) diabetes
mellitus and certain human leukocyte antigens (HLAs). These studies
have documented an increased risk of diabetes mellitus in association
with the presence of the HLA DR3 allele. It is possible, however,
that it is not the HLA DR3 allele per se, but rather some companion
alteration at a nearby genetic locus, that is responsible for increasing
the risk of developing diabetes mellitus. This phenomenon can arise
if noncausal marker alleles at one genetic locus are in linkage disequilibrium with alleles
at the causal genetic locus. If linkage disequilibrium is present,
an association between risk of disease and the marker may reflect
linkage disequilibrium between the marker and the gene that confers
disease susceptibility—and not a causal association between
the marker itself and risk of disease. Nevertheless, direct evidence
of increased risk in association with a marker allele, if valid,
suggests that the marker allele or another allele linked to it relates
to susceptibility to disease.
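++
The following sketch illustrates how a noncausal marker can appear to raise risk. It makes several simplifying assumptions that are not part of the text: each subject is represented by a single chromosome, the marker allele M has no effect of its own, only the allele C at a nearby locus changes risk, and all frequencies and risks are hypothetical.

```python
def marker_risk_ratio(hap_freq: dict, risk_causal: float, risk_other: float) -> float:
    """Risk ratio for marker allele M versus m when only the nearby
    allele C (not the marker) influences risk of disease.

    hap_freq maps (marker allele, susceptibility allele) -> frequency.
    """
    def risk_given(marker: str) -> float:
        total = hap_freq[(marker, "C")] + hap_freq[(marker, "c")]
        p_causal = hap_freq[(marker, "C")] / total  # P(C | marker allele)
        return p_causal * risk_causal + (1 - p_causal) * risk_other

    return risk_given("M") / risk_given("m")


# Hypothetical frequencies with linkage disequilibrium: M and C occur
# together more often than the product of their frequencies (0.2 * 0.1).
in_disequilibrium = {("M", "C"): 0.08, ("M", "c"): 0.12,
                     ("m", "C"): 0.02, ("m", "c"): 0.78}

# Same allele frequencies, but combinations occur at equilibrium.
at_equilibrium = {("M", "C"): 0.02, ("M", "c"): 0.18,
                  ("m", "C"): 0.08, ("m", "c"): 0.72}

rr_ld = marker_risk_ratio(in_disequilibrium, risk_causal=0.10, risk_other=0.02)
rr_eq = marker_risk_ratio(at_equilibrium, risk_causal=0.10, risk_other=0.02)

print(f"marker risk ratio with disequilibrium: {rr_ld:.2f}")
print(f"marker risk ratio at equilibrium: {rr_eq:.2f}")
```

Under these assumptions the marker shows a risk ratio of about 2.4 when it is in linkage disequilibrium with the susceptibility allele and a risk ratio of 1 when the two loci are at equilibrium, even though the marker itself never affects risk.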
++
With increasing frequency, researchers are using single nucleotide
polymorphisms (SNPs) to measure alterations in the DNA sequence.
A SNP reflects a change in a single base pair at a particular point
in the DNA sequence. An advantage of SNPs is that they can be measured
relatively easily in the laboratory; a disadvantage is that
many SNPs may be required to detect a relevant mutation in a particular
gene, since each SNP measures change only at a single point.
++
Finally, researchers should consider possible gene–environment
interactions. For example, failure to account for important
environmental factors in the design or the analysis of the study
can lead to an underestimate of the influence of a genetic factor.
The genetic characteristic of interest may contribute to risk of
disease only in the presence of some environmental trigger. Accordingly,
failure to account for a critical environmental trigger may obscure the
role of an underlying genetic susceptibility.
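++
The dilution described above can be illustrated with hypothetical numbers. The sketch assumes that the genetic variant raises risk only in subjects exposed to an environmental trigger, that 20% of subjects are exposed, and that exposure is independent of genotype. None of these figures come from a real study.

```python
# Hypothetical risks of disease by genotype and exposure (assumptions).
risk = {
    ("variant", "exposed"): 0.05,
    ("variant", "unexposed"): 0.01,
    ("no variant", "exposed"): 0.01,
    ("no variant", "unexposed"): 0.01,
}
p_exposed = 0.20  # assumed prevalence of the trigger, same in both genotypes


def pooled_risk(genotype: str) -> float:
    """Risk for a genotype when exposure status is ignored."""
    return (p_exposed * risk[(genotype, "exposed")]
            + (1 - p_exposed) * risk[(genotype, "unexposed")])


rr_exposed = risk[("variant", "exposed")] / risk[("no variant", "exposed")]
rr_unexposed = risk[("variant", "unexposed")] / risk[("no variant", "unexposed")]
rr_pooled = pooled_risk("variant") / pooled_risk("no variant")

print(f"risk ratio among exposed: {rr_exposed:.1f}")
print(f"risk ratio among unexposed: {rr_unexposed:.1f}")
print(f"risk ratio ignoring exposure: {rr_pooled:.1f}")
```

With these values the variant carries a risk ratio of 5 among exposed subjects and 1 among unexposed subjects, but an analysis that ignores exposure yields only 1.8, understating the effect of the variant where the trigger is present.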
+++
The Indirect Approach
++
With the indirect approach, the researcher also looks for evidence
of an association with a genetic factor. The indirect approach differs
from the direct approach in that the genetic marker measured
need not be the causal genetic factor itself.
This approach depends on genetic linkage between
the genetic marker and the disease susceptibility gene. Documentation
of an association between the marker and risk of disease provides
evidence that a gene near the marker allele affects risk of disease.
++

Linkage analysis (1) provides information about the magnitude
of risk associated with a genetic marker, (2) identifies the chromosome
that bears the susceptibility gene locus, and (3) provides information
about the site on the chromosome of a disease susceptibility gene.
++
A straightforward epidemiologic approach to the study of linkage
involves estimating recurrence risks in siblings of index subjects
with a specific disease. For a particular marker locus, a sibling can
share by descent either zero, one, or two marker alleles with the
case. If the marker locus is linked to a disease susceptibility
locus, a sibling who shares two alleles with the index case should have
the highest chance of having the same disease susceptibility allele
as the case—and hence the highest risk of disease. A sibling
who shares one allele with the case should have an intermediate
chance of having the same disease susceptibility allele, and hence an
intermediate risk of disease, and a sibling who
shares no alleles with the case should have the lowest chance of
having the same disease susceptibility allele and hence the lowest
risk of disease. If risk of disease in siblings parallels the number
of alleles shared with the case, the data suggest the presence of
a disease susceptibility gene near the marker gene on the same chromosome.
This type of study, sometimes called sib-pair
analysis, is a simple example of genetic linkage analysis.
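++
The comparison at the heart of this simple sib-pair analysis can be sketched as follows. The counts are hypothetical; the totals of 100, 200, and 100 siblings are an assumption chosen to mirror the 1/4, 1/2, 1/4 sharing expected by chance among siblings. The sketch only tabulates recurrence risk by number of shared alleles and does not perform a statistical test.

```python
# Hypothetical siblings of index cases, grouped by the number of marker
# alleles shared by descent with the case: (affected siblings, all siblings).
siblings = {
    0: (5, 100),
    1: (20, 200),
    2: (20, 100),
}

# Recurrence risk in siblings for each level of allele sharing.
recurrence_risk = {
    shared: affected / total
    for shared, (affected, total) in siblings.items()
}

# Linkage is suggested when risk rises with the number of alleles shared.
risk_parallels_sharing = (
    recurrence_risk[0] < recurrence_risk[1] < recurrence_risk[2]
)

for shared in sorted(recurrence_risk):
    print(f"{shared} alleles shared: recurrence risk = {recurrence_risk[shared]:.2f}")
print(f"risk parallels allele sharing: {risk_parallels_sharing}")
```

In this invented data set the recurrence risk rises from 0.05 to 0.10 to 0.20 as the number of shared alleles increases, the pattern that would suggest a susceptibility gene near the marker. If the marker were unlinked, the three risks would be expected to be similar.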
++
In more complex applications of genetic linkage analyses, the
geneticist measures a genetic marker among a group of family members
that is not restricted to siblings. If subjects who share the genetic
marker, such as a particular allele, also tend to be concordant
for occurrence of disease, the disease susceptibility gene may be
located on the same chromosome near the marker allele. As with sib-pair
analyses, these more complex linkage analyses rest on the rationale
that if the measured genetic marker is located near the disease
susceptibility allele, the marker allele and the susceptibility
allele will tend to be passed together, that is, to cosegregate.
Cosegregation of the marker allele and the disease susceptibility
allele will then create concordance between occurrence of disease
and presence of the genetic marker. Example 4 illustrates a linkage
study.
++
Example 3. To study the role of
apolipoprotein E (Apo E), a specific genetic factor suspected of
playing a role in the causation of Alzheimer’s disease,
Tsai and coworkers (1994) conducted a case–control study.
They obtained blood from 77 cases with Alzheimer’s disease
and 77 matched controls, then determined the presence of three isoforms
of Apo E—denoted e2, e3, and e4. About 35% of
the cases had the e4 allele, compared with 13% of controls.
The results support the role of Apo E as a risk factor for Alzheimer’s
disease. Although the mechanism remains uncertain, results of other studies
suggest that Apo E may bind to β-amyloid and change it to a neurotoxic
form.
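++
As a rough illustration of the size of this association, a crude odds ratio can be computed from the two approximate percentages quoted above. This figure is not reported in the example: it treats the rounded percentages as exact and ignores the matching of cases and controls, which a formal analysis would need to take into account.

```python
def odds(proportion: float) -> float:
    """Convert a proportion with the factor into the odds of having the factor."""
    return proportion / (1 - proportion)


# Approximate proportions with the e4 allele, as quoted in Example 3.
p_cases = 0.35
p_controls = 0.13

# Crude (unmatched) odds ratio; an illustrative assumption, not a study result.
crude_or = odds(p_cases) / odds(p_controls)

print(f"crude odds ratio for the e4 allele: {crude_or:.1f}")
```

The result is a crude odds ratio of roughly 3.6, consistent in direction with the interpretation given in the example.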
++
Example 4. To help in the search
for possible genetic factors associated with Alzheimer’s
disease, Schellenberg and colleagues (1992) conducted linkage analyses
in a series of 14 families, each of which had at least three members
with Alzheimer’s disease in at least two generations. Using
several markers for loci on chromosome 14, the investigators found
that within some families with early-onset Alzheimer’s
disease, members who shared the same markers tended to be concordant
for Alzheimer’s disease. The evidence suggested that a
gene on chromosome 14 codes for susceptibility to early-onset Alzheimer’s
disease. If so, this would be a separate locus from the one coding for
apolipoprotein E, as the Apo E gene is located on chromosome 19.