Ingesting, checking, and combining the COVIDcast data
Ingest the .csv files, check the assumptions, and combine the interesting values into a single table
Here are the steps:
- Copy the data from each .csv file "as is" into a dedicated staging table, with effective primary key (state, survey_date). (The qualifier "effective" recognizes the fact that, as yet, these columns will have different names that reflect how they're named in the .csv files.)
- Check that the values from the .csv files do indeed conform to the stated rules.
- Project the columns of interest from the staging tables and join these into a single table, with primary key (state, survey_date), for analysis.
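
The three steps above can be sketched in SQL as follows. This is only an illustration of the pattern, not the content of the tutorial's own scripts: the table names (staging_a, staging_b, combined), the staging column names (geo_value, time_value, value), the sample rows, and the two checks are all hypothetical stand-ins, and it assumes a scratch database where these tables do not yet exist.

```sql
-- Step 1: one staging table per .csv file. The column names are assumed
-- to mirror the file's header (hypothetical names), so the effective
-- primary key is (geo_value, time_value) rather than (state, survey_date).
create table staging_a(
  geo_value  text    not null,
  time_value date    not null,
  value      numeric not null,
  constraint staging_a_pk primary key(geo_value, time_value));

create table staging_b(
  geo_value  text    not null,
  time_value date    not null,
  value      numeric not null,
  constraint staging_b_pk primary key(geo_value, time_value));

-- In practice the rows would be loaded from the file, for example with
-- psql's \copy meta-command. Made-up rows stand in for the file here.
insert into staging_a(geo_value, time_value, value) values
  ('ca', '2020-10-01', 90.5),
  ('ny', '2020-10-01', 92.0);

insert into staging_b(geo_value, time_value, value) values
  ('ca', '2020-10-01', 1.2),
  ('ny', '2020-10-01', 0.9);

-- Step 2: check assumed rules. Each assert raises an error if it fails.
do $body$
begin
  -- Assumed rule: every value is a percentage.
  assert not exists (
    select 1 from staging_a where value not between 0 and 100),
    'staging_a: value out of range';

  -- Assumed rule: both files cover exactly the same (state, date) pairs.
  assert not exists (
    select 1
    from staging_a a full outer join staging_b b
      using (geo_value, time_value)
    where a.value is null or b.value is null),
    'staging tables do not have the same key values';
end;
$body$;

-- Step 3: project the columns of interest, rename the key columns,
-- and join into a single table for analysis.
create table combined(
  state       text    not null,
  survey_date date    not null,
  a_pct       numeric not null,
  b_pct       numeric not null,
  constraint combined_pk primary key(state, survey_date));

insert into combined(state, survey_date, a_pct, b_pct)
select geo_value, time_value, a.value, b.value
from staging_a a inner join staging_b b
  using (geo_value, time_value);
```

Checking that the key sets match before joining means that the inner join in the last step cannot silently drop rows.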
All of these steps are implemented by the ingest-the-data.sql script. It's designed so that you can run it, and re-run it, time and again. It will always finish silently each time you run it, provided that you first set client_min_messages to warning (set client_min_messages = warning;). It calls various other scripts. You will download these, along with ingest-the-data.sql, as you step through the sections in the order that the left-hand navigation menu presents.
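
A re-runnable script that finishes silently typically follows a pattern like the one sketched here. This is an assumption about the general technique, not an excerpt from ingest-the-data.sql, and the table name staging_example is hypothetical.

```sql
-- Hide NOTICE-level messages for the rest of the session.
set client_min_messages = warning;

-- On a first run the table does not exist. "if exists" turns what would
-- be an error into a NOTICE, which the setting above suppresses. On later
-- runs the table is dropped and re-created, so the outcome is the same.
drop table if exists staging_example cascade;

create table staging_example(
  geo_value  text    not null,
  time_value date    not null,
  value      numeric not null,
  constraint staging_example_pk primary key(geo_value, time_value));
```

Without the set statement, the first run would print a notice that the table does not exist, so the script would not be silent.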