As many of you know, last year I worked with a great team to run a (close-to-real-time) dashboard on COVID and schools.
I’m excited today to announce the launch of the next phase of that project: The COVID-19 School Data Hub. The Data Hub (which you can explore here) aims to be a comprehensive look-back at schooling mode (e.g., in-person, hybrid, or virtual learning) across the US over the 2020-2021 school year. The map on the homepage shows how learning models looked by district between August 2020 and May 2021, and you can download individual data files by state. Some of these are at the district level, and others are available down to the school level.
For the moment, these data are a look back and not a look forward. But this process has reinforced the need for better real-time data on what is going on in schools, especially as education leaders weigh reopening decisions in light of the Delta variant. When combined with other data, these resources will help us answer a lot of questions about the impacts of closures over the last school year and, hopefully, help us explore how we can limit the impacts of closures over this next one.
I’m going to say a bit below about what is in the data, but my biggest pitch right now is: go explore for yourself. The team has spent months cleaning this dataset, and we are hoping it will help answer the questions we already have, and ones we haven’t yet thought of.
Let’s start with the basics. Here we go with two screenshots: learning models in November of 2020, and in April of 2021.
You can see that a lot more districts opened for in-person learning as the year went on. You can also see the geographic differences, especially in the fall, with some areas of the country having much more consistent in-person learning than others.
In the course of this project, we reached out to state education agencies (i.e., state Departments of Education) in an effort to get data from them. Currently, the Data Hub includes data from 30 states (including DC), with more states to be integrated in the coming weeks and months. However, data availability varies considerably across states. You can see the breakdown in the table below.
Some states have comprehensive data down to the individual week at the school level. Others do not have any data available on schooling mode, even at the district level. Still others may have collected this information but have chosen not to make it available yet. Also, because of the great variability in how states collected this information, we have done our best to standardize the data so that it can all be looked at together.
We haven’t given up on the states with incomplete or missing data! We’ll continue to build this out as we get more in.
For states where learning model data are available on the Data Hub, individual state pages show what data we have and allow you to download it. And if you want to work with all the data together, the For Researchers page has all the data zipped into a single folder, code which will compile it, and some auxiliary data sources which can be merged in.
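If you are curious what "compiling" the state files might look like, here is a minimal sketch in Python. To be clear, this is not the code from the For Researchers page: the state codes, column names, and label mapping below are all made up for illustration, and the real files will have their own layouts. The idea is simply to stack each state's rows into one table while mapping each state's own wording onto a common set of learning-model labels.

```python
import csv
import io

# Hypothetical stand-ins for per-state files; the real downloads will differ.
STATE_FILES = {
    "AA": "district,week,mode\nAlpha,2020-11-02,In Person\nBeta,2020-11-02,Remote\n",
    "BB": "district,week,mode\nGamma,2020-11-02,Hybrid\nKappa,2020-11-02,Virtual\n",
}

# Assumed mapping from each state's wording to a common label.
LABELS = {
    "in person": "in-person",
    "remote": "virtual",
    "virtual": "virtual",
    "hybrid": "hybrid",
}


def compile_states(files):
    """Stack every state's rows into one list with standardized labels."""
    rows = []
    for state, text in files.items():
        for record in csv.DictReader(io.StringIO(text)):
            raw = record["mode"].strip().lower()
            rows.append(
                {
                    "state": state,
                    "district": record["district"],
                    "week": record["week"],
                    # Anything we do not recognize is flagged, not guessed.
                    "mode": LABELS.get(raw, "unknown"),
                }
            )
    return rows


combined = compile_states(STATE_FILES)
for row in combined:
    print(row)
```

Flagging unrecognized labels as "unknown" rather than guessing is a deliberate choice in this sketch: it keeps any gaps in the standardization visible.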
What’s This Good For?
We’re hoping this will be helpful in many ways. One is to clarify the basic question of just how much in-person schooling there was last year. But, perhaps more importantly, this dataset will provide a starting point for researchers to explore the extent to which outcomes -- for kids, for adults, for COVID, for society -- are driven by school and district learning models.
To give one example, several weeks ago Virginia released test score data for the last school year, at the district and school level. Comparing student outcomes to those from the 2018-19 school year (data from 2019-20 were not available due to the pandemic), we might have expected to see similar levels of student proficiency. Instead, the data show lower levels of student proficiency in 2020-21 across subject areas. By merging the district-level student outcome information with our data on learning models, we were also able to explore how learning loss differed by schooling mode. The preliminary results -- shown in the figure below -- suggest that these learning losses were largest in areas that were predominantly virtual (notably, these locations also started at a lower proficiency level). For more details, you can take a look at our preliminary white paper here (see State Snapshot of Test Scores and Pandemic Learning Models: Virginia).
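For anyone who wants to try this kind of comparison themselves, the merge described above can be sketched roughly as follows. This is an illustration only, not the analysis from the white paper: the district IDs, field names, and pass rates are invented placeholders (they are not Virginia's numbers), and a real analysis would need to handle weighting, subject areas, and baseline differences.

```python
from collections import defaultdict

# Made-up learning-model records, one per district (hypothetical field names).
learning_models = [
    {"district_id": "D1", "mode": "in-person"},
    {"district_id": "D2", "mode": "virtual"},
    {"district_id": "D3", "mode": "hybrid"},
    {"district_id": "D4", "mode": "virtual"},
]

# Made-up proficiency rates (percent) for the two school years being compared.
test_scores = [
    {"district_id": "D1", "pass_rate_2019": 80.0, "pass_rate_2021": 74.0},
    {"district_id": "D2", "pass_rate_2019": 70.0, "pass_rate_2021": 55.0},
    {"district_id": "D3", "pass_rate_2019": 75.0, "pass_rate_2021": 66.0},
    {"district_id": "D4", "pass_rate_2019": 60.0, "pass_rate_2021": 49.0},
    {"district_id": "D5", "pass_rate_2019": 90.0, "pass_rate_2021": 85.0},
]


def mean_change_by_mode(models, scores):
    """Join scores to learning models by district, then average the change."""
    mode_by_district = {row["district_id"]: row["mode"] for row in models}
    changes = defaultdict(list)
    for row in scores:
        mode = mode_by_district.get(row["district_id"])
        if mode is None:
            # No learning-model record for this district, so skip it.
            continue
        changes[mode].append(row["pass_rate_2021"] - row["pass_rate_2019"])
    return {mode: sum(vals) / len(vals) for mode, vals in changes.items()}


result = mean_change_by_mode(learning_models, test_scores)
for mode, change in sorted(result.items()):
    print(f"{mode}: {change:+.1f} percentage points")
```

Note that districts without a matching learning-model record are dropped here; in practice you would want to count how many are lost that way before trusting the averages.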
There are other data, too! We compiled data from states on COVID-19 cases in students and staff over the year, and worked to learn as much as we could about masking policies. These data are all publicly downloadable.
To see a bit more about our data collection and some summary information, please refer to our white paper, COVID-19 School Data Hub: Introduction and Data Availability Overview, available here.
This is an enormous project. You can see the whole Team here. Clare Halloran is the Project Manager and overall boss, and Rebecca Jack is the Research Director. Matt and Sam from Township Agency have done an unbelievable job on the website. I am just the mouthpiece: they, along with everyone else, do the work.
The initial data collection here was largely the work of individual states, and we are so thankful for their work and partnership. We also want to thank Melissa McGrath at CCSSO for her tireless work, coordination, and connections, and Nat Malkus at AEI for the same.
We are grateful for funding from Emergent Ventures at the Mercatus Center, the Chan Zuckerberg Initiative, Arnold Ventures and Brown University.
Please send any thoughts or correspondence to email@example.com.