Roundup: Statistics Training

One of the classes I sometimes teach at Brown is a first-year seminar called The Power of Data. On the first day of class, I start with this fact, from the CDC: In 2017-2018, 42.7% of Americans were obese. I ask students a simple question: How do we know this?

Usually someone will respond with their first instinct: “Well, we know because we weighed them.” And then I ask the obvious follow-up: “Do you remember being weighed?” They do not, of course, and we are off and running. In fact, the figure is based on data from the National Health and Nutrition Examination Survey, which includes about 11,000 people each year. This leads into a discussion about sampling — when is this sample size sufficient, and for what conclusions? 
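One way to make the sample-size question concrete is a quick back-of-the-envelope confidence interval. This is only a sketch: it assumes a simple random sample and the usual normal approximation, whereas NHANES actually uses a complex, weighted survey design, so the real uncertainty calculation is more involved. Still, it shows why roughly 11,000 people can pin down a national percentage fairly tightly:

```python
import math

# Illustrative numbers only: the observed proportion and an approximate
# annual sample size, treated here as if from a simple random sample.
p_hat = 0.427   # proportion of the sample classified as obese
n = 11_000      # approximate number of participants

# Standard error of a sample proportion under simple random sampling
se = math.sqrt(p_hat * (1 - p_hat) / n)

# 95% confidence interval using the z = 1.96 normal critical value
margin = 1.96 * se
low, high = p_hat - margin, p_hat + margin
print(f"42.7% plus or minus {margin:.1%}, i.e. ({low:.1%}, {high:.1%})")
```

The margin of error comes out to under one percentage point, which is part of why a sample of this size supports a headline national figure, at least for the simple "what fraction?" question.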

It also leads us to a conversation about how we communicate about data. Why do we say 42.7% of Americans are obese, when we really mean 42.7% of this particular sample? When do we need to give more details?

The rest of the course is much like this, trying to help my students see where data comes from, what we can do with it, and what its limitations are.

The point of this story is that I like data. A lot. It’s what gets me up in the morning. I like collecting data, but even more than that, I like thinking about how to learn from it. How can we use what we observe to tease out relationships that might otherwise stay hidden? Which statistical methods are useful, and which are less so? How can we make them better, or at least evaluate their quality?