The rest of this section of the book focuses on specific approaches to analyzing healthcare data: that is, how to use analytics to see the truth in the world so that you can improve patients’ health and healthcare. All these approaches, though, build on an understanding of several core principles about healthcare data, and those principles are the focus of this chapter. They are important across all analytic techniques, and they represent issues that are easy to miss but can cause you to get the wrong answer if you don’t pay attention to them. Understanding these core concepts will help you select the right analytic method for the problem at hand. This chapter will introduce these core concepts, including data types (e.g., binary versus continuous), why missing data are critical, and the Four Horsemen of Mistaken Conclusions (chance, confounding, bias, and violating the assumptions of your analytic method). It will also provide a brief overview of the analytic methods that will come in the following chapters.


The electronic record created as part of a healthcare encounter is not a patient. Rather, the record is made up of the electronic footprints that the patient leaves behind from her or his interactions with the health system. That sounds obvious, but it is remarkably easy to forget. What’s left in the record is a pale, sparse reflection of the complexity of a human being: a reflection that is transformed, diluted, and adulterated by the environment, by financial considerations, and by clinical interfaces that are sometimes difficult to use.

That doesn’t mean that we can’t learn anything from patient data generated as part of routine care. Exactly the opposite! We often think of these data as being like dinosaur footprints, which have been tremendously important to modern paleontology¹ (Figure 13-1). A dinosaur footprint isn’t a dinosaur; rather, it represents a dinosaur’s interactions with the world around it, altered and transformed over the subsequent millions of years. Most of the information about the dinosaur is lost, but we can still make inferences from what’s left behind. In panel A of Figure 13-1, a single dinosaur footprint has three toes, which is a characteristic of a set of dinosaurs called theropods (a group that includes the largest land-dwelling carnivores ever discovered, like Tyrannosaurus rex). Panel B shows how, for many decades, scientists thought the T. rex’s skeleton looked: standing upright, dragging its tail. Based on many pieces of evidence, though, paleontologists revised how they thought T. rex stood and how its skeleton was put together, with a much more horizontal orientation (panel D). One of the pieces of evidence came from studying trackways: fossil records of multiple footprints left as a dinosaur walked across the land (panel C shows a trackway of the same theropod whose footprint is shown in panel A). It turns out to be remarkably ...

