I’m queueing up a few new COVID posts: antigen testing, the new UK variant, the question of whether you can see people freely once vaccinated, etc. And of course I’ll keep you updated as new information comes up on kids, vaccines, pregnancy and so on. But: in honor of my first real post of the New Year, I’m flashing to what could have been, in the absence of COVID-19. That is, a newsletter about new studies, what they tell us (or don’t), served with a side of econometrics.
Today: Fitness trackers.
New Year’s often comes with resolutions and, for many, these center on fitness: diet, exercise, some commitment to working off the excesses of December. In a non-pandemic year, gyms run full steam ahead in January, with the surge tailing off mid-February (post-Valentine’s slump). As part of the fitness push, especially in the absence of gyms, I suspect many people will turn to some type of activity tracker. Activity trackers tell you, broadly, how much you are moving in a given day. They’re often set up to count steps. And many people hope such a device can help them hit that magical (but totally made up) 10,000-steps-a-day target.
Question: is that true? Will a device really help?
Recent research may suggest yes. In early December, the British Journal of Sports Medicine published a meta-analysis that combined evidence from 28 randomized trials including a total of about 7,500 people. The trials evaluated the impacts of activity trackers (plus, in many cases, other reminders or supports) on number of steps taken. On average, studies found a moderate increase in steps: treatment group members took about 1850 more steps per day than the control group. The effects were larger when combined with text messages or other reminders.
Broadly, this is good news for trackers. Of course, there are quibbles about these results. Most of the study participants (75%) were men, for example; and most of the studies followed them for only a few weeks or months. There’s also some lack of consistency across interventions. BUT: what makes this evidence compelling is that the studies are randomized.
I complain a lot about non-randomized data. A study of fitness trackers with non-randomized data would be worthless. Of course people who own fitness trackers walk more, but this basic comparison wouldn’t be helpful given the other differences across groups. Randomized data is much better; by taking a bunch of people without trackers and randomly picking half of them to get the trackers, you can be (more) confident that any differences across groups are due to the tracker.
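For those who like to see this in action, here is a small simulation of the point above. It is a sketch, not anything taken from the study: the "motivation" score, the baseline step counts, and the true effect of 1000 steps are all numbers I made up for illustration. Motivated people both walk more and are more likely to own a tracker, so the naive owner vs. non-owner comparison mixes the two; a coin-flip assignment does not.

```python
import random


def simulate(n=20000, true_effect=1000, seed=0):
    """Compare a naive observational gap with a randomized-trial gap.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    owners, non_owners, treated, control = [], [], [], []
    for _ in range(n):
        # Made-up motivation score in [0, 1]; it raises baseline steps.
        motivation = rng.random()
        baseline = 4000 + 4000 * motivation + rng.gauss(0, 500)

        # Observational world: motivated people are likelier to own a tracker.
        owns = rng.random() < motivation
        (owners if owns else non_owners).append(
            baseline + (true_effect if owns else 0)
        )

        # Randomized world: a coin flip decides who gets the tracker.
        assigned = rng.random() < 0.5
        (treated if assigned else control).append(
            baseline + (true_effect if assigned else 0)
        )

    def mean(xs):
        return sum(xs) / len(xs)

    naive_gap = mean(owners) - mean(non_owners)
    randomized_gap = mean(treated) - mean(control)
    return naive_gap, randomized_gap


naive_gap, randomized_gap = simulate()
print(f"Naive owner vs. non-owner gap: {naive_gap:.0f} steps")
print(f"Randomized treatment vs. control gap: {randomized_gap:.0f} steps")
```

In this toy setup the naive gap comes out well above the true effect, because it also picks up the motivation difference between owners and non-owners, while the randomized gap lands close to the 1000 steps I built in.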
Is that the whole story then? Should you definitely get a tracker so you can get that average of 1850 extra steps each day? Or: if you’re a doctor and thinking about how to motivate your sedentary patients, should you prescribe them an activity tracker?
Despite this evidence, I think it’s not so clear. Here’s where the slightly subtle econometrics comes in. When you run a randomized trial, you are able to estimate a causal effect for the people in the trial. If those people are, themselves, a random sample of the overall population then your causal effect in your trial is the correct causal effect for the overall population.
But: what if the people in your trial are not a random sample? Let’s imagine, for example, that the way you recruit people for your fitness tracker trial is by putting up signs in the subway inviting people to enroll. Who is going to want to be in your trial? One (likely) possibility is that you’ll get a disproportionate number of people who are interested in trying to walk more. This desire to walk more is what attracted them to your subway sign.
So, now, you’ll take this non-random sample of the overall population and you’ll run your randomized trial on them. You’ll get a causal effect for the people in the trial. But is it the same effect you’d expect in the overall population? The answer is: it depends.
In some cases, we expect a treatment to have the same effect on everyone. Let’s say instead of evaluating the effect of activity trackers on count of steps, you were evaluating whether hitting people in the shin with a stick really hard gives them a bruise. That’s the treatment: hitting people in the shin. The impact of this treatment is likely to be basically the same for everyone. With a few exceptions, nearly everyone will bruise if they are hit hard in the shin with a stick. So, you’d get the overall right answer to your question even if the people who agreed to be in your trial were kind of unusual (which they probably would be!).
In econometric terms, we’d call this a homogeneous treatment effect. The treatment effect is the same for everyone. Therefore, even if your trial sample is a non-random sample of the population, you’re fine using it to learn about the whole population.
However, in many cases we think there are heterogeneous treatment effects, meaning that the effect is not the same for everyone. And then we can be in trouble. In the particular case of activity trackers, it’s easy to imagine that the people who sign up to be in these studies are motivated to use the tracker. They may have a larger treatment effect than random people from the population. The effects in these randomized trials would, then, overstate the overall population effects. (They could also understate them, for example if the control group without the trackers were motivated in other ways. If you want to really nerd out on this, I have a paper on how one might figure out if the effects are larger or smaller in the trial population than in the overall population.)
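Here is a toy version of the subway-sign trial, for anyone who wants to see the overstatement happen. Everything in it is an assumption of mine for illustration: a made-up motivation score, a tracker effect that grows with motivation (from 0 to 2000 steps), and a rule that more motivated people are more likely to volunteer. The trial itself is properly randomized; the only problem is who is in it.

```python
import random


def simulate_trial(n=50000, seed=1):
    """Run a randomized trial on self-selected volunteers.

    Returns (trial_estimate, population_average_effect).
    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    population_effects = []
    treated, control = [], []
    for _ in range(n):
        motivation = rng.random()
        # Assumption: heterogeneous effect, larger for motivated people.
        effect = 2000 * motivation
        population_effects.append(effect)

        # Assumption: motivated people are likelier to answer the sign.
        volunteers = rng.random() < motivation
        if not volunteers:
            continue

        # Within the trial, a coin flip assigns the tracker.
        baseline = 5000 + rng.gauss(0, 500)
        if rng.random() < 0.5:
            treated.append(baseline + effect)
        else:
            control.append(baseline)

    def mean(xs):
        return sum(xs) / len(xs)

    trial_estimate = mean(treated) - mean(control)
    return trial_estimate, mean(population_effects)


trial_estimate, population_effect = simulate_trial()
print(f"Effect estimated in the trial: {trial_estimate:.0f} steps")
print(f"Average effect in the population: {population_effect:.0f} steps")
```

The trial gives the right answer for the volunteers, but because the volunteers skew motivated, that answer sits above the population average. If you set the effect to a constant instead of `2000 * motivation`, the two numbers line up, which is the homogeneous case.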
In the particular setting of this meta-analysis, my guess is that the treatment effects in the studies are larger than what you’d get in the overall population. That is to say: if doctors randomly handed out fitness trackers to people, I would not expect an increase of 1850 steps on average. A lot of people would leave the trackers in their nightstands and never put them on.
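To put rough numbers on the nightstand problem, here is a back-of-the-envelope sketch. The wearing shares below are hypothetical, not from the meta-analysis, and I am assuming (generously) that people who do wear the tracker get the full 1850 steps while people who leave it in the drawer get nothing.

```python
def average_effect(effect_if_worn, share_wearing):
    """Average per-person effect when only some recipients wear the tracker.

    Assumes non-wearers get zero effect; share_wearing is a made-up input.
    """
    return effect_if_worn * share_wearing


# Hypothetical shares of recipients who actually wear the device.
for share in (1.0, 0.5, 0.25):
    print(f"{share:.0%} wearing: {average_effect(1850, share):.1f} extra steps on average")
```

Under these assumptions the average effect shrinks in direct proportion to the share of people who actually put the thing on.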
Having said that: if you yourself are motivated to get a tracker, then the effects estimated in the study may be more relevant to you. It really depends on the question we are asking or, rather, who we are asking it about.
This point is broader than this example. It comes up in thinking about the general conclusions of virtually any randomized trial. There is no question that we learn more from randomized trials than from observational data, but they aren’t a panacea.
There is, of course, always a COVID connection. If you look at much of the discussion around the vaccine trials, there was a lot of emphasis on enrolling a diverse group in the trials and, in particular, making sure that there was representation from communities of color who have been harder hit by the virus. One reason for this was the concern that the treatment effect of the vaccine (its efficacy) would be different for different groups. If you tested the vaccine only in white men aged 35 to 45, and it turned out that the effect was very different in Black men, or in women, or in older people, your results wouldn’t be very helpful in predicting the overall impacts of the vaccine in the population.
Fortunately, the vaccines we have do seem to be similarly effective across a wide range of groups. Now all we have to do is actually vaccinate people.
Yes, I have a tracker watch. It’s a Garmin Forerunner 45. I like it! I mostly use it for running but it does apparently also track steps. Current daily count: 17,369.
Keep the thoughts coming. I don’t always write back, but I read everything.