Three factors help determine whether an observed estimate, such as the mean, is different from a norm: the size of the difference, the degree of variability, and the sample size.
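One common way these three factors combine is the one-sample t statistic: the difference from the norm divided by the standard error, which itself depends on the variability and the sample size. The sketch below is illustrative only; the function name `one_sample_t` and the data values are hypothetical, not taken from the text.

```python
import math
import statistics


def one_sample_t(values, norm):
    """Return the t statistic: (mean - norm) / (sd / sqrt(n))."""
    n = len(values)                       # sample size
    mean = statistics.mean(values)        # observed estimate
    sd = statistics.stdev(values)         # variability (sample standard deviation)
    standard_error = sd / math.sqrt(n)
    return (mean - norm) / standard_error  # size of the difference, scaled


# Hypothetical measurements compared with a norm of 100.
sample = [102, 98, 105, 110, 99, 104, 101, 107]

t_small = one_sample_t(sample, norm=100)      # n = 8
t_large = one_sample_t(sample * 4, norm=100)  # same values repeated, n = 32

print(round(t_small, 2), round(t_large, 2))
```

With the same mean difference and similar variability, the larger sample gives a larger t statistic, which is why sample size matters alongside the other two factors.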
The t distribution is similar to the z distribution, especially as sample sizes exceed 30, and t is generally used in medicine when asking questions about means.
Confidence intervals are common in the literature; they give a range of values that, with a stated degree of confidence, is expected to contain the true value being estimated (such as the mean).
The logic behind statistical hypothesis tests is somewhat backward, generally assuming there is no difference and hoping to show that a difference exists.
Several assumptions are required to use the t distribution for confidence intervals or hypothesis tests.
Tests of hypothesis are another way to approach statistical inference; a somewhat rigid approach with six steps is recommended.
Confidence intervals and statistical tests lead to the same conclusions, but confidence intervals actually provide more information and are being increasingly recommended as the best way to present results.
In hypothesis testing, we err if we conclude there is a difference when none exists (type I, or α, error), as well as when we conclude there is not a difference when one does exist (type II, or β, error).
Power is the complement of a type II, or β, error: it is the probability (1 − β) of concluding there is a difference when one does exist. Power depends on several factors, including the sample size. It is a key concept in statistics because researchers must have a large enough sample to detect a difference if one exists.
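The dependence of power on sample size can be sketched with a normal (z) approximation for a two-sided test of a single mean. This is a simplification that assumes the standard deviation is known; the function name `approx_power` and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(difference, sd, n, alpha=0.05):
    """Approximate power of a two-sided z test for one mean.

    Assumes a known standard deviation `sd`; a t-based calculation
    would give slightly lower power for small samples.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)             # critical value for type I error alpha
    shift = abs(difference) / (sd / math.sqrt(n))  # true difference in standard-error units
    # Probability that the test statistic falls beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Hypothetical scenario: true difference of 5 units, standard deviation of 10.
for n in (8, 16, 32, 64):
    print(n, round(approx_power(5, 10, n), 3))
```

When the true difference is zero, the same formula returns α, which is the type I error rate.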
The p value first assumes that the null hypothesis is true and then indicates the probability of obtaining a result as or more extreme than the one observed. In more straightforward language, the p value is the probability that a result at least this extreme would occur by chance alone if there were truly no difference; it is not the probability that the null hypothesis is true.
The z distribution is used to form confidence intervals and test hypotheses about a proportion; this use is sometimes called the z approximation to the binomial.
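A minimal sketch of both uses for a single proportion follows, along with the two-sided p value for the test. The function names and the counts (60 successes in 100 observations, null proportion 0.5) are hypothetical, and the approximation assumes the sample is large enough for the normal curve to be reasonable.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """z-based confidence interval for a proportion."""
    p_hat = successes / n
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error from the observed proportion
    return p_hat - z_crit * se, p_hat + z_crit * se


def proportion_z_test(successes, n, p0):
    """z test of H0: proportion = p0, returning (z, two-sided p value)."""
    p_hat = successes / n
    se0 = math.sqrt(p0 * (1 - p0) / n)        # standard error under the null hypothesis
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


low, high = proportion_ci(60, 100)
z, p_value = proportion_z_test(60, 100, p0=0.5)
print(round(low, 3), round(high, 3), round(z, 2), round(p_value, 4))
```

In this hypothetical example the 95% interval excludes 0.5 and the p value is below 0.05, so the interval and the test point to the same conclusion.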
The width of confidence intervals (CI) depends on the confidence value: 99% CI are wider than 95% CI because 99% CI provide greater confidence.
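The effect of the confidence value on width can be seen directly in the critical value that multiplies the standard error. The sketch below uses z critical values for simplicity (t critical values behave the same way); `ci_half_width` and the standard error of 2.0 are hypothetical.

```python
from statistics import NormalDist


def ci_half_width(standard_error, confidence):
    """Half-width of a z-based confidence interval."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_crit * standard_error


w95 = ci_half_width(2.0, 0.95)  # uses a critical value of about 1.96
w99 = ci_half_width(2.0, 0.99)  # uses a critical value of about 2.58
print(round(w95, 2), round(w99, 2))
```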
Paired, or before-and-after, studies are very useful for detecting changes that might otherwise be obscured by variation among subjects, because each subject serves as their own control.
Paired studies are analyzed by evaluating the differences themselves. For numerical variables, the paired t test is appropriate.
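As an illustration of analyzing the differences themselves, the paired t statistic is the mean difference divided by its standard error, with n − 1 degrees of freedom. The function name `paired_t` and the before-and-after values are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired observations."""
    differences = [a - b for b, a in zip(before, after)]  # one difference per subject
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements on five subjects before and after treatment.
before = [140, 152, 138, 160, 147]
after = [132, 145, 135, 150, 141]

t, df = paired_t(before, after)
print(round(t, 2), df)
```

The resulting t would then be compared with the t distribution on `df` degrees of freedom, for example from a table.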
The kappa (κ) statistic is used to measure the agreement between two independent judges or methods when observations are being categorized.
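Kappa compares the observed agreement with the agreement expected by chance from the row and column totals. A minimal sketch, assuming a square table of counts with one judge on the rows and the other on the columns; the function name `cohen_kappa` and the counts are hypothetical.

```python
def cohen_kappa(table):
    """Kappa for a square table of counts (rows: judge A, columns: judge B)."""
    k = len(table)
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / total   # proportion on the diagonal
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance if the two judges were independent.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 100 cases as positive or negative by two judges.
ratings = [[40, 10],
           [5, 45]]
kappa = cohen_kappa(ratings)
print(round(kappa, 2))
```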
The McNemar test is the counterpart to the paired t test when observations are nominal instead of numerical.
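The McNemar test uses only the discordant pairs, that is, the subjects whose classification changed between the two occasions. The sketch below assumes the common chi-square form with a continuity correction and one degree of freedom; the function name `mcnemar` and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def mcnemar(b, c):
    """McNemar chi-square (continuity-corrected) and its p value.

    b and c are the two discordant counts, for example
    positive-then-negative and negative-then-positive.
    """
    chi_square = (abs(b - c) - 1) ** 2 / (b + c)
    # A chi-square with 1 degree of freedom is the square of a standard normal,
    # so the p value can be read from the normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(math.sqrt(chi_square)))
    return chi_square, p_value


chi_square, p_value = mcnemar(b=15, c=5)
print(round(chi_square, 2), round(p_value, 4))
```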
The sign test can be used to test medians (instead of means) if the distribution of observations is skewed.
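The sign test counts how many observations fall above and below the hypothesized median and refers the smaller count to a binomial distribution with probability 0.5. A minimal sketch of the exact two-sided version follows; the function name `sign_test` and the skewed data are hypothetical.

```python
from math import comb


def sign_test(values, hypothesized_median):
    """Exact two-sided sign test p value; values equal to the median are dropped."""
    above = sum(1 for v in values if v > hypothesized_median)
    below = sum(1 for v in values if v < hypothesized_median)
    n = above + below
    k = min(above, below)
    # Binomial probability of k or fewer signs in the smaller group when p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical skewed observations tested against a median of 5.
stays = [3, 6, 7, 8, 9, 10, 12, 15, 21, 40]
p_value = sign_test(stays, hypothesized_median=5)
print(round(p_value, 4))
```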
The Wilcoxon signed rank test is an excellent alternative to the paired t test if the observations are not normally distributed.
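The signed rank test ranks the absolute differences and sums the ranks belonging to positive differences. The sketch below is a simplified illustration: it drops zero differences, gives tied values their average rank, and uses a normal approximation without a tie correction, which is rough for small samples. The function name `wilcoxon_signed_rank` and the data are hypothetical.

```python
import math
from statistics import NormalDist


def wilcoxon_signed_rank(before, after):
    """Return (sum of positive ranks, z, two-sided p value)."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero differences
    n = len(diffs)
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over any run of tied absolute differences.
        while j + 1 < n and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, ordered) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p_value


# Hypothetical paired scores for ten subjects.
before = [10, 12, 9, 14, 11, 13, 15, 8, 12, 10]
after = [12, 15, 8, 18, 16, 19, 22, 16, 21, 20]

w_plus, z, p_value = wilcoxon_signed_rank(before, after)
print(w_plus, round(z, 2), round(p_value, 4))
```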
To estimate ...