The Snowflake Data Marketplace is quickly becoming one of my favourite things about working at Snowflake. With the marketplace I have live, instant access to some of the worlds best datasets to give my data deeper context and meaning. I had the pleasure of recently presenting a virtual hands-on lab (recording) with Snowflake using Starschema’s COVID-19 dataset (link) and Wunderman Thompson’s COVID-19 Risk, Readiness and Recovery dataset (link) built on the backbone of their fabulous Amerilink Insights dataset. I highly recommend both datasets to drive a deeper understanding of our collective pandemic context. The best part is they are both free to play with, free! I love me some free data…let’s geek out!

Getting and querying data from the Snowflake Data Marketplace is dead simple and built on a very powerful premise. After 3 clicks of the mouse, I have data that is ready for analysis and always in sync with Starschema’s or Wundermand’s latest updates. This means I have instant access to a single-source of truth on COVID-19 cases as reported by public health authorities around the world along with key economic drivers to enhance my data.
Getting Started
First, you will need a Snowflake account if you want to run these queries. Snowflake offers free trials (link) to get started, even with these powerful datasets. If you have an existing account you must be the ACCOUNTADMIN to follow along. Grab a full copy of the SQL file (here).
Login to your Snowflake account and switch your context to ACCOUNTADMIN.

Browse to the Data Marketplace.

Get the Starschema COVID-19 set by browsing to the Health category or searching.

Get the data and assign to roles that need access, keep the name as “STARSCHEMA_COVID19” or the later queries will not work.

Starschema COVID-19 Data Set Queries
Now we are ready to start querying the data, please go ahead and create a new worksheet and paste this code for your own use. Be sure to set the session variables to a meaningful context or these queries will not run. Grab a full copy of the SQL file (here).
--+-----------------------------------------------------------------------------------------------+--
--+-----------------------------------------------------------------------------------------------+--
-- <Webinar SQL>
-- <Getting Started with the Snowflake Data Marketplace>
-- <September 9, 2020 | 10:00am PST>
-- <SQL File | David A Spezia>
-- <Sales Engineer | Snowflake>
--+-----------------------------------------------------------------------------------------------+--
--+-----------------------------------------------------------------------------------------------+--
--Session Variables
--Date to Use
SELECT DISTINCT DATE FROM PUBLIC.JHU_COVID_19 ORDER BY 1 DESC;
--Set Manually or From the MAx Value in the Data
SET DATE_VARIALBE = '2020-08-31';
SET DATE_VARIABLE = (SELECT TOP 1 DATE FROM PUBLIC.JHU_COVID_19 ORDER BY 1 DESC);
SELECT $DATE_VARIABLE;
--Date Floor for some Charts
SET DATE_FLOOR_VARIABLE = '2020-06-01';
SELECT $DATE_FLOOR_VARIABLE;
--Difference ><20 Makes Sense For My Charts
SELECT DATEDIFF('day',TO_DATE($DATE_FLOOR_VARIABLE),TO_DATE($DATE_VARIABLE));
--Country to Use, instead of Hard Coding USA
SELECT DISTINCT COUNTRY_REGION FROM PUBLIC.JHU_COVID_19 ORDER BY 1 DESC;
SET COUNTRY_VARIABLE = 'United States';
SELECT $COUNTRY_VARIABLE;
--+-----------------------------------------------------------------------------------------------+--
-- Star Schema JHU Data
--+-----------------------------------------------------------------------------------------------+--
--GET() Star Schema COVID-19 as "STARSCHEMA_COVID19" AS ACCOUNTADMIN IN Discover UI
USE ROLE ACCOUNTADMIN;
USE DATABASE STARSCHEMA_COVID19;
USE SCHEMA PUBLIC;
--Describe Table
DESC TABLE JHU_COVID_19;
--JHU_COVID_19 Columns
SELECT COLUMN_NAME, DATA_TYPE, COMMENT
FROM STARSCHEMA_COVID19.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'JHU_COVID_19';
--Limit 10 from JHU_COVID_19
SELECT * FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19 LIMIT 10;
--Cases vs Difference
SELECT SUM(CASES), CASE_TYPE, COUNTRY_REGION
FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19
WHERE DATE<=TO_DATE($DATE_VARIABLE) AND COUNTRY_REGION=$COUNTRY_VARIABLE
GROUP BY COUNTRY_REGION, CASE_TYPE;
--Cases vs Difference 2.0
SELECT SUM(CASES), SUM(DIFFERENCE), CASE_TYPE, COUNTRY_REGION
FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19
WHERE DATE<=TO_DATE($DATE_VARIABLE) AND COUNTRY_REGION=$COUNTRY_VARIABLE
GROUP BY COUNTRY_REGION, CASE_TYPE;
--Bar Chart: SUM([DIFFERNCE]) by [CASE TYPE] on X-Axis
SELECT SUM(DIFFERENCE), CASE_TYPE, COUNTRY_REGION
FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19
WHERE DATE<=TO_DATE($DATE_VARIABLE) AND COUNTRY_REGION=$COUNTRY_VARIABLE
GROUP BY COUNTRY_REGION, CASE_TYPE;
--Line Chart: Add [DATE] Column to Visual ON X-Axis
--Add [CASE_TYPE] to Series
SELECT COUNTRY_REGION, CASE_TYPE, DATE, RUNNING_SUM_COUNTRY FROM (
SELECT COUNTRY_REGION, CASE_TYPE, DATE, SUM(DIFFERENCE) OVER(PARTITION BY COUNTRY_REGION, CASE_TYPE ORDER BY DATE) AS RUNNING_SUM_COUNTRY
FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19
WHERE COUNTRY_REGION=$COUNTRY_VARIABLE)
WHERE DATE>=TO_DATE($DATE_FLOOR_VARIABLE)
GROUP BY COUNTRY_REGION, CASE_TYPE, DATE, RUNNING_SUM_COUNTRY
ORDER BY COUNTRY_REGION, CASE_TYPE, DATE;
--Line Chart: by Country > 500000
--Add [COUNTRY_REGION] to Series
SELECT COUNTRY_REGION, CASE_TYPE, DATE, RUNNING_SUM_COUNTRY FROM (
SELECT COUNTRY_REGION, CASE_TYPE, DATE, SUM(DIFFERENCE) OVER(PARTITION BY COUNTRY_REGION, CASE_TYPE ORDER BY DATE) AS RUNNING_SUM_COUNTRY
FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19 JHU
WHERE (SELECT SUM(JHU_COVID_19.DIFFERENCE)
FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19 JHU_COVID_19
WHERE JHU_COVID_19.COUNTRY_REGION = JHU.COUNTRY_REGION
AND JHU_COVID_19.CASE_TYPE = 'Confirmed') > 500000
AND CASE_TYPE = 'Confirmed')
WHERE DATE>=TO_DATE($DATE_FLOOR_VARIABLE)
GROUP BY COUNTRY_REGION, CASE_TYPE, DATE, RUNNING_SUM_COUNTRY
ORDER BY COUNTRY_REGION, CASE_TYPE, DATE;

Safegraph US Census Data & Neighborhood Insights Data Set
Now you are ready to augment the JHU data with populations. Thankfully, Safegraph has a free version of their US Census Data & Neighborhood Insights dataset. Get the Safegraph data from the Demographics category, get the database and assign to roles that need access, change the name to “SAFEGRAPH_SHARE” or the later queries will not work.

Safegraph Queries
--+-----------------------------------------------------------------------------------------------+--
-- Safe Graph Data
--+-----------------------------------------------------------------------------------------------+--
--GET() Safe Graph Data As "SAFEGRAPH_SHARE"
SELECT FIPS, PROVINCE_STATE, COUNTY FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19
WHERE STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19.COUNTRY_REGION = $COUNTRY_VARIABLE
AND STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19.DATE = TO_DATE($DATE_VARIABLE);
--Result Set: 100k Population SQL Result Set
SELECT PROVINCE_STATE, SUM(CONFIRMED_CASES)/SUM(COUNTY_POPULATION)*100000 AS CASES_PER_100K,
SUM(COUNTY_POPULATION) AS STATE_POPULATION , SUM(CONFIRMED_CASES) AS CONFIRMED_CASES
FROM (SELECT JHU.FIPS, SUM(JHU.CASES) AS CONFIRMED_CASES, JHU.PROVINCE_STATE, JHU.COUNTY, JHU.DATE, GEOSQL.COUNTY_POPULATION,
IFF(SUM(JHU.DIFFERENCE)>0,GEOSQL.COUNTY_POPULATION,0) AS AT_RISK
FROM STARSCHEMA_COVID19.PUBLIC.JHU_COVID_19 JHU
LEFT JOIN (SELECT LEFT(CBG,5) AS FIPS, SUM(GEO.TOTAL_POPULATION) AS COUNTY_POPULATION
FROM SAFEGRAPH_SHARE.PUBLIC.US_POPULATION_BY_SEX_GEO GEO GROUP BY FIPS) GEOSQL
ON GEOSQL.FIPS = JHU.FIPS
WHERE JHU.COUNTRY_REGION = $COUNTRY_VARIABLE AND JHU.DATE = TO_DATE($DATE_VARIABLE) AND JHU.CASE_TYPE='Confirmed'
GROUP BY JHU.PROVINCE_STATE, JHU.COUNTY, JHU.FIPS, GEOSQL.COUNTY_POPULATION, JHU.DATE
ORDER BY JHU.FIPS)
GROUP BY PROVINCE_STATE
HAVING CASES_PER_100K IS NOT NULL
ORDER BY CASES_PER_100K DESC;
--HAVING PROVINCE_STATE = 'California';

Weathersource Weather & Climate Data for BI Data Set
Now you are ready to further augment the JHU data with weather. Weathersource has a free version of their US Weather & Climate for BI dataset. Be sure to get that specific dataset. Get the Weathersource data from the Weather category, get the database and assign to roles that need access, change the name to “WEATHERSOURCE_PARTNER_WS_ONPOINT_WEATHER_DATA_SHARE” or the later queries will not work.

Weathsource Queries
--+-----------------------------------------------------------------------------------------------+--
-- Weather Source Data
--+-----------------------------------------------------------------------------------------------+--
--GET() Weather Source Data As "WEATHERSOURCE_PARTNER_WS_ONPOINT_WEATHER_DATA_SHARE"
SELECT COUNT(*)
FROM WEATHERSOURCE_PARTNER_WS_ONPOINT_WEATHER_DATA_SHARE.PUBLIC.HISTORY_DAY_COVID_19;
--Simple Test Query
SELECT DATE_VALID_STD, AVG(AVG_TEMPERATURE_AIR_2M_F)
FROM WEATHERSOURCE_PARTNER_WS_ONPOINT_WEATHER_DATA_SHARE.PUBLIC.HISTORY_DAY_COVID_19
WHERE COUNTRY = 'US' AND POSTAL_CODE LIKE '100%'
GROUP BY DATE_VALID_STD
ORDER BY DATE_VALID_STD DESC;
--Joining COVID Data with Weather Data
SELECT W.DATE,W.AVG_TEMP, C.COUNTY, C.CASES, C.CASES_SINCE_PREV_DAY
FROM (
SELECT DATE_VALID_STD DATE, AVG(AVG_TEMPERATURE_AIR_2M_F) AVG_TEMP
FROM WEATHERSOURCE_PARTNER_WS_ONPOINT_WEATHER_DATA_SHARE.PUBLIC.HISTORY_DAY_COVID_19
WHERE COUNTRY = 'US' AND POSTAL_CODE LIKE '100%'
GROUP BY DATE_VALID_STD
ORDER BY DATE_VALID_STD
) W INNER JOIN (
SELECT DATE, COUNTY, CASES, CASES_SINCE_PREV_DAY
FROM STARSCHEMA_COVID19.PUBLIC.NYT_US_COVID19
WHERE COUNTY = 'New York City'
ORDER BY DATE
) C ON W.DATE=C.DATE;

Wunderman COVID-19 Risk, Readiness & Recovery
Now you are ready to fully augment the JHU data with the amazing Wunderman data. Get the Wunderman Amerilink Insights data from the Marketing section, get the database and assign to roles that need access, change the name to “WUNDERMAN” or the later queries will not work.
The free, ready to query data set on the marketplace is just a random 1% sample of the overall data from the Wunderman data set. Request the “Personalized” version of the data set from the Marketplace to enter into a conversation with Wunderman Thompson about acquiring the unmasked, full data set for your needs.

Wunderman Queries
--+-----------------------------------------------------------------------------------------------+--
-- Wunderman Risk & Recovery Data
--+-----------------------------------------------------------------------------------------------+--
/* --GET() Wunderman Thompson AmeriLINK Insights Data Set as "WUNDERMAN"
---> Queries we will write <--
+ Age by State
+ 100k Infections by Avg(Age) by State
+ Respitory Issues per 100k by State
+ 100k Infections by Respitory Issues per 100k by State
+ Income by State
+ 100k Infections by Avg(Income) by State
*/
--Add in State Populations, Use Demo Database
CREATE DATABASE IF NOT EXISTS DEMO;
--Create Table for State Populations
CREATE TABLE IF NOT EXISTS "DEMO"."PUBLIC"."SATE_POPULTATIONS" (STATE VARCHAR, POPULATION NUMBER);
--2019 US Census Population Estimate by State
--Source: https://www2.census.gov/programs-surveys/popest/tables/2010-2019/state/totals/nst-est2019-01.xlsx
INSERT INTO "DEMO"."PUBLIC"."SATE_POPULTATIONS" VALUES
('Alabama',4903185),
('Alaska',731545),
('Arizona',7278717),
('Arkansas',3017804),
('California',39512223),
('Colorado',5758736),
('Connecticut',3565287),
('Delaware',973764),
('District of Columbia',705749),
('Florida',21477737),
('Georgia',10617423),
('Hawaii',1415872),
('Idaho',1787065),
('Illinois',12671821),
('Indiana',6732219),
('Iowa',3155070),
('Kansas',2913314),
('Kentucky',4467673),
('Louisiana',4648794),
('Maine',1344212),
('Maryland',6045680),
('Massachusetts',6892503),
('Michigan',9986857),
('Minnesota',5639632),
('Mississippi',2976149),
('Missouri',6137428),
('Montana',1068778),
('Nebraska',1934408),
('Nevada',3080156),
('New Hampshire',1359711),
('New Jersey',8882190),
('New Mexico',2096829),
('New York',19453561),
('North Carolina',10488084),
('North Dakota',762062),
('Ohio',11689100),
('Oklahoma',3956971),
('Oregon',4217737),
('Pennsylvania',12801989),
('Rhode Island',1059361),
('South Carolina',5148714),
('South Dakota',884659),
('Tennessee',6829174),
('Texas',28995881),
('Utah',3205958),
('Vermont',623989),
('Virginia',8535519),
('Washington',7614893),
('West Virginia',1792147),
('Wisconsin',5822434),
('Wyoming',578759);
--Check the Data Insert
SELECT * FROM "DEMO"."PUBLIC"."SATE_POPULTATIONS";
--Quality Controls People
SELECT T1.STATE AS WUNDERMAN_STATE , T2.STATE AS POPULATIONS_STATE, T3.PROVINCE_STATE AS JHU_STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT" T1
JOIN "DEMO"."PUBLIC"."SATE_POPULTATIONS" T2 ON T1.STATE = T2.STATE
JOIN "STARSCHEMA_COVID19"."PUBLIC"."JHU_COVID_19" T3 ON T1.STATE = T3.PROVINCE_STATE
GROUP BY 1,2,3
ORDER BY 1;
--Average Age and COVID
--+-----------------------------------------------------------------------------------------------+--
--Age by State
SELECT AVG(AGE::DECIMAL(5,2)) AS APPROXIMATE_STATE_AVG_AGE, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
WHERE TRY_TO_NUMBER(AGE) > 0
GROUP BY STATE
ORDER BY APPROXIMATE_STATE_AVG_AGE DESC;
--100k Infection Rate by Avg(Age) by State
WITH casesByState AS (
SELECT SUM(JHU.CASES) AS CONFIRMED_CASES, JHU.PROVINCE_STATE AS STATE
FROM "STARSCHEMA_COVID19"."PUBLIC"."JHU_COVID_19" JHU
WHERE JHU.COUNTRY_REGION = 'United States' AND JHU.DATE = TO_DATE($DATE_VARIABLE) AND JHU.CASE_TYPE='Confirmed'
GROUP BY STATE),
statePopulations AS (
SELECT STATE, POPULATION
FROM "DEMO"."PUBLIC"."SATE_POPULTATIONS") ,
ageBySate AS (
SELECT AVG(AGE::DECIMAL(5,2)) AS APPROXIMATE_STATE_AVG_AGE, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
WHERE TRY_TO_NUMBER(AGE) > 0
GROUP BY STATE)
SELECT statePopulations.STATE, SUM(CONFIRMED_CASES)/SUM(POPULATION)*100000 AS CASES_PER_100K,
AVG(ageBySate.APPROXIMATE_STATE_AVG_AGE) AS APPROXIMATE_STATE_AVG_AGE
FROM casesByState
JOIN statePopulations ON casesByState.STATE = statePopulations.STATE
JOIN ageBySate ON casesByState.STATE = ageBySate.STATE
GROUP BY statePopulations.STATE
ORDER BY CASES_PER_100K DESC;
--Respiratory Ailments and COVID
--+-----------------------------------------------------------------------------------------------+--
-- Find State, Age-Range with highest numbers of individuals with any Respiratory Ailment.
WITH respiratoryAilments AS (
SELECT KEY_ID, AGE, AGE_RANGE, AILMENT_ALLERGY_SINUS_NASAL, AILMENT_ASTHMA,
AILMENT_BRONCHITIS_COPD_EMPHYSEMA, AILMENT_COUGH_AND_COLD, AILMENT_NASAL_CONGESTION,
AILMENT2_BRONCHITIS_CHRONIC_BRONCHITIS, AILMENT2_EMPHYSEMA, AILMENT2_NASAL_ALLERGIES,
STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_ALIMENTS"
WHERE COALESCE(
AILMENT2_ASTHMA,AILMENT2_BRONCHITIS_CHRONIC_BRONCHITIS,AILMENT2_COPD,
AILMENT2_EMPHYSEMA,AILMENT2_SINUSITIS,AILMENT2_SNORING,
AILMENT_ALLERGY_SINUS_NASAL,AILMENT_ASTHMA,AILMENT_BRONCHITIS_COPD_EMPHYSEMA,
AILMENT_CHRONIC_BRONCHITIS,AILMENT_COUGH_AND_COLD,AILMENT_GROUP_RESPIRATORY,
AILMENT_LUNG_DISEASE,AILMENT_RESPIRATORY_AILMENTS,AILMENT_SNORING) = 'Yes'
)
SELECT STATE, AGE_RANGE, COUNT(*) * 100 AS APPROXIMATE_COUNT
FROM respiratoryAilments
WHERE AGE_RANGE IS NOT NULL
GROUP BY STATE, AGE_RANGE
HAVING APPROXIMATE_COUNT >= 1000
ORDER BY APPROXIMATE_COUNT DESC;
--Respiratory Issues per 100k by State
WITH respiratoryAilments AS (
SELECT COUNT(*) * 100 AS APPROXIMATE_COUNT_RESPIRATORY_AILMENTS, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
WHERE COALESCE(
AILMENT2_ASTHMA,AILMENT2_BRONCHITIS_CHRONIC_BRONCHITIS,AILMENT2_COPD,
AILMENT2_EMPHYSEMA,AILMENT2_SINUSITIS,AILMENT2_SNORING,
AILMENT_ALLERGY_SINUS_NASAL,AILMENT_ASTHMA,AILMENT_BRONCHITIS_COPD_EMPHYSEMA,
AILMENT_CHRONIC_BRONCHITIS,AILMENT_COUGH_AND_COLD,AILMENT_GROUP_RESPIRATORY,
AILMENT_LUNG_DISEASE,AILMENT_RESPIRATORY_AILMENTS,AILMENT_SNORING) = 'Yes'
GROUP BY STATE),
statePopulations AS (
SELECT STATE, POPULATION
FROM "DEMO"."PUBLIC"."SATE_POPULTATIONS")
SELECT statePopulations.STATE, SUM(APPROXIMATE_COUNT_RESPIRATORY_AILMENTS)/SUM(POPULATION)*100000 AS AILMENTS_PER_100K
FROM respiratoryAilments
JOIN statePopulations ON respiratoryAilments.STATE = statePopulations.STATE
GROUP BY statePopulations.STATE
ORDER BY AILMENTS_PER_100K DESC;
--100k Infection Rate by 100k Respitory Issues Rate by State
WITH respiratoryAilments AS (
SELECT COUNT(*) * 100 AS APPROXIMATE_COUNT_RESPIRATORY_AILMENTS, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
WHERE COALESCE(
AILMENT2_ASTHMA,AILMENT2_BRONCHITIS_CHRONIC_BRONCHITIS,AILMENT2_COPD,
AILMENT2_EMPHYSEMA,AILMENT2_SINUSITIS,AILMENT2_SNORING,
AILMENT_ALLERGY_SINUS_NASAL,AILMENT_ASTHMA,AILMENT_BRONCHITIS_COPD_EMPHYSEMA,
AILMENT_CHRONIC_BRONCHITIS,AILMENT_COUGH_AND_COLD,AILMENT_GROUP_RESPIRATORY,
AILMENT_LUNG_DISEASE,AILMENT_RESPIRATORY_AILMENTS,AILMENT_SNORING) = 'Yes'
GROUP BY STATE),
statePopulations AS (
SELECT STATE, POPULATION
FROM "DEMO"."PUBLIC"."SATE_POPULTATIONS"),
casesByState AS (
SELECT SUM(JHU.CASES) AS CONFIRMED_CASES, JHU.PROVINCE_STATE AS STATE
FROM "STARSCHEMA_COVID19"."PUBLIC"."JHU_COVID_19" JHU
WHERE JHU.COUNTRY_REGION = 'United States' AND JHU.DATE = TO_DATE($DATE_VARIABLE) AND JHU.CASE_TYPE='Confirmed'
GROUP BY STATE)
SELECT statePopulations.STATE, SUM(APPROXIMATE_COUNT_RESPIRATORY_AILMENTS)/SUM(POPULATION)*100000 AS AILMENTS_PER_100K,
SUM(CONFIRMED_CASES)/SUM(POPULATION)*100000 AS CASES_PER_100K
FROM respiratoryAilments
JOIN statePopulations ON respiratoryAilments.STATE = statePopulations.STATE
JOIN casesByState ON respiratoryAilments.STATE = casesByState.STATE
GROUP BY statePopulations.STATE
ORDER BY AILMENTS_PER_100K DESC;
--100k Death Rate by 100k Respitory Issues Rate by State
WITH respiratoryAilments AS (
SELECT COUNT(*) * 100 AS APPROXIMATE_COUNT_RESPIRATORY_AILMENTS, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
WHERE COALESCE(
AILMENT2_ASTHMA,AILMENT2_BRONCHITIS_CHRONIC_BRONCHITIS,AILMENT2_COPD,
AILMENT2_EMPHYSEMA,AILMENT2_SINUSITIS,AILMENT2_SNORING,
AILMENT_ALLERGY_SINUS_NASAL,AILMENT_ASTHMA,AILMENT_BRONCHITIS_COPD_EMPHYSEMA,
AILMENT_CHRONIC_BRONCHITIS,AILMENT_COUGH_AND_COLD,AILMENT_GROUP_RESPIRATORY,
AILMENT_LUNG_DISEASE,AILMENT_RESPIRATORY_AILMENTS,AILMENT_SNORING) = 'Yes'
GROUP BY STATE),
statePopulations AS (
SELECT STATE, POPULATION
FROM "DEMO"."PUBLIC"."SATE_POPULTATIONS"),
deathsByState AS (
SELECT SUM(JHU.CASES) AS DEATHS, JHU.PROVINCE_STATE AS STATE
FROM "STARSCHEMA_COVID19"."PUBLIC"."JHU_COVID_19" JHU
WHERE JHU.COUNTRY_REGION = 'United States' AND JHU.DATE = TO_DATE($DATE_VARIABLE) AND JHU.CASE_TYPE='Deaths'
GROUP BY STATE)
SELECT statePopulations.STATE, SUM(APPROXIMATE_COUNT_RESPIRATORY_AILMENTS)/SUM(POPULATION)*100000 AS AILMENTS_PER_100K,
SUM(DEATHS)/SUM(POPULATION)*100000 AS DEATHS_PER_100K
FROM respiratoryAilments
JOIN statePopulations ON respiratoryAilments.STATE = statePopulations.STATE
JOIN deathsByState ON respiratoryAilments.STATE = deathsByState.STATE
GROUP BY statePopulations.STATE
ORDER BY AILMENTS_PER_100K DESC;
--Income and COVID
--+-----------------------------------------------------------------------------------------------+--
--Income by State
SELECT AVG(CASE ESTIMATED_HOUSEHOLD_INCOME_3_DOT_0
WHEN 'A - Less than $15,000' THEN 5000
WHEN 'B - $15,000-$19,999' THEN 15000
WHEN 'C - $20,000-$29,999' THEN 20000
WHEN 'D - $30,000-$39,999' THEN 30000
WHEN 'E - $40,000-$49,999' THEN 40000
WHEN 'F - $50,000-$59,999' THEN 50000
WHEN 'G - $60,000-$74,999' THEN 60000
WHEN 'H - $75,000-$99,999' THEN 75000
WHEN 'I - $100,000-$124,999' THEN 100000
WHEN 'J - $125,000-$149,999' THEN 125000
WHEN 'K - $150,000-$199,999' THEN 150000
WHEN 'L - $200,000-$249,999' THEN 200000
WHEN 'M - $250,000-$399,999' THEN 250000
WHEN 'N - $400,000-$499,999' THEN 400000
WHEN 'O - $500,000+' THEN 500000
ELSE 0
END) AS APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
GROUP BY STATE
ORDER BY APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME DESC;
--100k Infections by Avg(Income) by State
--Graph Scatter with Infection Rate on Y-Axis and Avg(Income) on X-Axis
WITH incomeByState AS (
SELECT AVG(CASE ESTIMATED_HOUSEHOLD_INCOME_3_DOT_0
WHEN 'A - Less than $15,000' THEN 5000
WHEN 'B - $15,000-$19,999' THEN 15000
WHEN 'C - $20,000-$29,999' THEN 20000
WHEN 'D - $30,000-$39,999' THEN 30000
WHEN 'E - $40,000-$49,999' THEN 40000
WHEN 'F - $50,000-$59,999' THEN 50000
WHEN 'G - $60,000-$74,999' THEN 60000
WHEN 'H - $75,000-$99,999' THEN 75000
WHEN 'I - $100,000-$124,999' THEN 100000
WHEN 'J - $125,000-$149,999' THEN 125000
WHEN 'K - $150,000-$199,999' THEN 150000
WHEN 'L - $200,000-$249,999' THEN 200000
WHEN 'M - $250,000-$399,999' THEN 250000
WHEN 'N - $400,000-$499,999' THEN 400000
WHEN 'O - $500,000+' THEN 500000
ELSE 0
END) AS APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
GROUP BY STATE),
statePopulations AS (
SELECT STATE, POPULATION
FROM "DEMO"."PUBLIC"."SATE_POPULTATIONS"),
casesByState AS (
SELECT SUM(JHU.CASES) AS CONFIRMED_CASES, JHU.PROVINCE_STATE AS STATE
FROM "STARSCHEMA_COVID19"."PUBLIC"."JHU_COVID_19" JHU
WHERE JHU.COUNTRY_REGION = 'United States' AND JHU.DATE = TO_DATE($DATE_VARIABLE) AND JHU.CASE_TYPE='Confirmed'
GROUP BY STATE)
SELECT statePopulations.STATE, SUM(CONFIRMED_CASES)/SUM(POPULATION)*100000 AS CASES_PER_100K,
AVG(APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME) AS APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME
FROM incomeByState
JOIN statePopulations ON incomeByState.STATE = statePopulations.STATE
JOIN casesByState ON incomeByState.STATE = casesByState.STATE
GROUP BY statePopulations.STATE
ORDER BY APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME DESC;
--100k Deaths by Avg(Income) by State
--Graph Scatter with Death Rate on Y-Axis and Avg(Income) on X-Axis
WITH incomeByState AS (
SELECT AVG(CASE ESTIMATED_HOUSEHOLD_INCOME_3_DOT_0
WHEN 'A - Less than $15,000' THEN 5000
WHEN 'B - $15,000-$19,999' THEN 15000
WHEN 'C - $20,000-$29,999' THEN 20000
WHEN 'D - $30,000-$39,999' THEN 30000
WHEN 'E - $40,000-$49,999' THEN 40000
WHEN 'F - $50,000-$59,999' THEN 50000
WHEN 'G - $60,000-$74,999' THEN 60000
WHEN 'H - $75,000-$99,999' THEN 75000
WHEN 'I - $100,000-$124,999' THEN 100000
WHEN 'J - $125,000-$149,999' THEN 125000
WHEN 'K - $150,000-$199,999' THEN 150000
WHEN 'L - $200,000-$249,999' THEN 200000
WHEN 'M - $250,000-$399,999' THEN 250000
WHEN 'N - $400,000-$499,999' THEN 400000
WHEN 'O - $500,000+' THEN 500000
ELSE 0
END) AS APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME, STATE
FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT"
GROUP BY STATE),
statePopulations AS (
SELECT STATE, POPULATION
FROM "DEMO"."PUBLIC"."SATE_POPULTATIONS"),
casesByState AS (
SELECT SUM(JHU.CASES) AS DEATHS, JHU.PROVINCE_STATE AS STATE
FROM "STARSCHEMA_COVID19"."PUBLIC"."JHU_COVID_19" JHU
WHERE JHU.COUNTRY_REGION = 'United States' AND JHU.DATE = TO_DATE($DATE_VARIABLE) AND JHU.CASE_TYPE='Deaths'
GROUP BY STATE)
SELECT statePopulations.STATE, SUM(DEATHS)/SUM(POPULATION)*100000 AS DEATHS_PER_100K,
AVG(APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME) AS APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME
FROM incomeByState
JOIN statePopulations ON incomeByState.STATE = statePopulations.STATE
JOIN casesByState ON incomeByState.STATE = casesByState.STATE
GROUP BY statePopulations.STATE
ORDER BY APPROXIMATE_STATE_AVG_ESTIMATED_HOUSEHOLD_INCOME DESC;
--Desired Future SQL
--States Ready to Breakout based on Recovery Index(Pent up Financial Demand)
--What's in the Data?
DESC TABLE "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT";
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID())) WHERE "name" LIKE '%INCOME%';
SELECT * FROM "WUNDERMAN"."DOD_DECODE_SAMPLE_CURRENT"."DECODE_STD_ROYALTY_PREMIUM_PLUS_ALIMENT" LIMIT 10;

Conclusion
As you can see for yourself the Snowflake Data Marketplace is an amazing place where data enthusiasts are coming together to further enhance data as actionable and meaningful to your narrative. With just clicks of a mouse, we were able to query 4 different data sources with COVID-19 cases, weather, population and demographic insights across the globe. I highly encourage playing around in the marketplace. Happy (Data Marketplace) Querying!