Tableau & Snowflake: SQL UDFs and Workdays Example

Posted by

Doing functions like looping and dynamic sql functions in Tableau is hard or impossible.  Things like counting a number of workdays between 2 dates with a Tableau Function is incredibly difficult, but in SQL it is very simple.  You would just say count(holidays) between date start and date end. Fortunately Tableau allows passthrough SQL to do this exact type of operation.  This is something I first explored in 2012 for a Tableau Customer Conference 2012 session in San Diego. The session was called “Join Us at the Custom SQL Table” 

Raw SQL Pass Through
Tableau has a special class of functions called pass through.  I would recommend a browse through the documentation, but this is my general interpretation.

RAWSQLAGG_RETURNTYPE(“dbo.function(%1,%N)”,[arg1], [argN])

  • AGG will Aggregate leaving it out will execute at the row level
  • RETURNTYPE must be specified and match the data type of the SVF
  • %1, %N are aliases that will be replaced by the contents of the arguments
  • [arg1], [argN] are fields in Tableau, Constants or Parameters

Creating Dates and Holidays Tables in Snowflake
We will create a tbl_dates with month start and end for 2019 and 2020.  We will create a table of all 16 Federal holidays in the US for 2019 and 2020.

--Create Table of Dates
CREATE OR REPLACE TABLE TBL_DATES (
    ID Integer,
    DATE_START Date,
    DATE_END Date,
    INSERT_DTS Timestamp_ltz(6)
);

--Insert Dates 1/1/2019 to 12/31/2020
INSERT INTO TBL_DATES (ID, DATE_START, DATE_END, INSERT_DTS)
VALUES
  (1, '1/1/2019', '1/31/2019', current_timestamp::timestamp_ltz(6)),
  (2, '2/1/2019', '2/28/2019', current_timestamp::timestamp_ltz(6)),
  (3, '3/1/2019', '3/31/2019', current_timestamp::timestamp_ltz(6)),
  (4, '4/1/2019', '4/30/2019', current_timestamp::timestamp_ltz(6)),
  (5, '5/1/2019', '5/31/2019', current_timestamp::timestamp_ltz(6)),
  (6, '6/1/2019', '6/30/2019', current_timestamp::timestamp_ltz(6)),
  (7, '7/1/2019', '7/31/2019', current_timestamp::timestamp_ltz(6)),
  (8, '8/1/2019', '8/31/2019', current_timestamp::timestamp_ltz(6)),
  (9, '9/1/2019', '9/30/2019', current_timestamp::timestamp_ltz(6)),
  (10, '10/1/2019', '10/31/2019', current_timestamp::timestamp_ltz(6)),
  (11, '11/1/2019', '11/30/2019', current_timestamp::timestamp_ltz(6)),
  (12, '12/1/2019', '12/31/2019', current_timestamp::timestamp_ltz(6)),
  (13, '1/1/2020', '1/31/2020', current_timestamp::timestamp_ltz(6)),
  (14, '2/1/2020', '2/29/2020', current_timestamp::timestamp_ltz(6)),
  (15, '3/1/2020', '3/31/2020', current_timestamp::timestamp_ltz(6)),
  (16, '4/1/2020', '4/30/2020', current_timestamp::timestamp_ltz(6)),
  (17, '5/1/2020', '5/31/2020', current_timestamp::timestamp_ltz(6)),
  (18, '6/1/2020', '6/30/2020', current_timestamp::timestamp_ltz(6)),
  (19, '7/1/2020', '7/31/2020', current_timestamp::timestamp_ltz(6)),
  (20, '8/1/2020', '8/31/2020', current_timestamp::timestamp_ltz(6)),
  (21, '9/1/2020', '9/30/2020', current_timestamp::timestamp_ltz(6)),
  (22, '10/1/2020', '10/31/2020', current_timestamp::timestamp_ltz(6)),
  (23, '11/1/2020', '11/30/2020', current_timestamp::timestamp_ltz(6)),
  (24, '12/1/2020', '12/31/2020', current_timestamp::timestamp_ltz(6))
;

--Test the Dates Table
SELECT * FROM TBL_DATES;


--Holidays for 2019 and 2020
CREATE OR REPLACE TABLE TBL_HOLIDAYS (
    ID Integer,
    NAME_HOLIDAY String,
    DATE_HOLIDAY Date,
    INSERT_DTS Timestamp_ltz(6)
);

--Insert Holiday Dates 1/1/2019 to 12/31/2020
INSERT INTO TBL_HOLIDAYS (ID, NAME_HOLIDAY, DATE_HOLIDAY, INSERT_DTS)
VALUES
  (1, 'New Years Day', '1/1/2019', current_timestamp::timestamp_ltz(6)),
  (2, 'Martin Luther King Jr. Day', '1/21/2019', current_timestamp::timestamp_ltz(6)),
  (3, 'Memorial Day', '5/27/2019', current_timestamp::timestamp_ltz(6)),
  (4, 'Independence Day', '7/4/2019', current_timestamp::timestamp_ltz(6)),
  (5, 'Labor Day', '9/2/2019', current_timestamp::timestamp_ltz(6)),
  (6, 'Veterans Day', '11/11/2019', current_timestamp::timestamp_ltz(6)),
  (7, 'Thanksgiving', '11/28/2019', current_timestamp::timestamp_ltz(6)),
  (8, 'Christmas Day', '12/25/2019', current_timestamp::timestamp_ltz(6)),
  (9, 'New Years Day', '1/1/2020', current_timestamp::timestamp_ltz(6)),
  (10, 'Martin Luther King Jr. Day', '1/20/2020', current_timestamp::timestamp_ltz(6)),
  (11, 'Memorial Day', '5/25/2020', current_timestamp::timestamp_ltz(6)),
  (12, 'Independence Day', '7/3/2020', current_timestamp::timestamp_ltz(6)),
  (13, 'Labor Day', '9/7/2020', current_timestamp::timestamp_ltz(6)),
  (14, 'Veterans Day', '11/11/2020', current_timestamp::timestamp_ltz(6)),
  (15, 'Thanksgiving', '11/26/2020', current_timestamp::timestamp_ltz(6)),
  (16, 'Christmas Day', '12/25/2020', current_timestamp::timestamp_ltz(6))
;

--Test Holidays
SELECT * FROM TBL_HOLIDAYS;

Creating Holidays Count Functions
Now we can test counting holidays with a simple between statement and a SQL UDF.

--Test Select Between
SET DATE_START = TO_DATE('1/1/2019');
SET DATE_END = TO_DATE('3/7/2020');
SELECT COUNT(*) FROM TBL_HOLIDAYS WHERE DATE_HOLIDAY BETWEEN $DATE_START AND $DATE_END;

Creating a SQL UDF to return the count from a callable function and testing it.

--Create a UDF to Find Holiday Counts
CREATE OR REPLACE FUNCTION UDF_HOLIDAYS(DATESTART Date, DATEEND Date)
    RETURNS INTEGER
    AS
    $$
        SELECT COUNT(*) FROM TBL_HOLIDAYS WHERE DATE_HOLIDAY BETWEEN DATESTART AND DATEEND
    $$;

--Test UDF
SET DATE_START = TO_DATE('1/1/2019');
SET DATE_END = TO_DATE('3/7/2020');
SELECT UDF_HOLIDAYS($DATE_START, $DATE_END) AS COUNT_OF_HOLIDAYS;

--Test UDF with Select
SELECT *, UDF_HOLIDAYS(DATE_START, DATE_END) AS COUNT_OF_HOLIDAYS FROM TBL_DATES ORDER BY ID;

Creating Raw Snow SQL in Tableau
Let’s start putting it all together.  Open a new Tableau Workbook and connect to your Snowflake Account.  If you do not have a Snowflake Account you can sign up for a free trial here: https://trial.snowflake.com/.  Connect to the tbl_dates you create earlier and create a RAWSQLAGG_INT function to run the Between SQL.

RAWSQLAGG_INT("
    SELECT COUNT(*) 
    FROM TBL_HOLIDAYS 
    WHERE DATE_HOLIDAY BETWEEN %1 AND %2", 
    [Date Start], [Date End])

We can see the resulting nested SQL in the SELECT clause that Tableau has written from our RAW SQL function by looking at the Query History in Snowflake: https://docs.snowflake.net/manuals/user-guide/ui-history.html , or by looking at a Tableau Performance Recording: https://help.tableau.com/current/pro/desktop/en-us/perf_record_create_desktop.htm .

SELECT "TBL_DATES"."DATE_END" AS "DATE_END",
  "TBL_DATES"."DATE_START" AS "DATE_START",
  "TBL_DATES"."ID" AS "ID",
  SUM(1) AS "sum:Number of Records:ok",
  (SELECT COUNT(*) FROM TBL_HOLIDAYS WHERE DATE_HOLIDAY BETWEEN "TBL_DATES"."DATE_START" AND "TBL_DATES"."DATE_END") AS "usr:Calculation_6013712904378720256:ok"
FROM "PUBLIC"."TBL_DATES" "TBL_DATES"
GROUP BY 1,
  2,
  3

Connect Raw SQL in Tableau to UDFs in Snowflake
Create another RAWSQLAGG_INT function to run the SQL UDF_HOLIDAYS.

RAWSQLAGG_INT(
    "SELECT UDF_HOLIDAYS(%1, %2)",
    [Date Start], [Date End])

We can see the resulting nested SQL in the SELECT clause that Tableau has written from our RAW SQL UDF function by looking at the Query History in Snowflake: https://docs.snowflake.net/manuals/user-guide/ui-history.html , or by looking at a Tableau Performance Recording: https://help.tableau.com/current/pro/desktop/en-us/perf_record_create_desktop.htm .

SELECT "TBL_DATES"."DATE_END" AS "DATE_END",
  "TBL_DATES"."DATE_START" AS "DATE_START",
  "TBL_DATES"."ID" AS "ID",
  SUM(1) AS "sum:Number of Records:ok",
  (SELECT COUNT(*) FROM TBL_HOLIDAYS WHERE DATE_HOLIDAY BETWEEN "TBL_DATES"."DATE_START" AND "TBL_DATES"."DATE_END") AS "usr:Calculation_6013712904378720256:ok",
  (SELECT UDF_HOLIDAYS("TBL_DATES"."DATE_START", "TBL_DATES"."DATE_END")) AS "usr:Calculation_6013712904379215873:ok"
FROM "PUBLIC"."TBL_DATES" "TBL_DATES"
GROUP BY 1,
  2,
  3

Conclusion
Now you can leverage this custom function framework to do just about anything in SQL.  Pull in real time stock quotes, prices, inventory or any other dynamic sql function for random numbers to HLL to hitting restful web service and other functions that do not exist in Tableau.  Happy Querying!

6 comments

  1. I love the RAWSQL functions in Tableau, but unfortunately they don’t work when I publish the data source on Server. Do you know of any way to get it to work in that case?

    Like

  2. Nice explained UDF with example.

    Can you please explain with similar for UDTF?

    In my end, When i have one entry at masters table UDTF works well as expected. When Masters has more than one row it fails with the error –

    create or replace table Masters as
    select 1 Offer, 20200921 Week
    UNION ALL
    select 2 Marketing, 20200523 Week;

    CREATE OR REPLACE function GET_STOREVISITS(FISCAL_TIME_KEY string)
    returns table(STORE string, FISCAL_KEY string, STOREVISITS INT) as
    $$
    Select
    V.LOC_ID as STORE,
    FISCAL_TIME_KEY as FISCAL_KEY,
    sum(V.VISITS_CNT) as STOREVISITS
    from DAILY_VISITS_BY_STORE V
    inner join LOCATION L on L.LOC_ID = V.LOC_ID
    inner join FISCAL_CALENDAR T on T.FISCAL_DAY_KY = V.BSN_DT_KY
    Where
    V.LOC_ID ‘BUSINESS_UNIT_51_LOCN’
    AND T.FISCAL_DAY_KY = FISCAL_TIME_KEY
    Group BY STORE,FISCAL_KEY
    $$;

    select c.* from Masters y join table(GET_STOREVISITS(WEEK)) c
    where c.FISCAL_KEY = y.WEEK

    Error : SQL execution internal error: Processing aborted due to error 300010:391167117;

    Like

    1. Try explicitly casting v.loc_id as string, fiscal_time_key as string, and storevists as int.

      That error means there is a type mismatch from what you’ve defined in “returns table(…)” versus what is actually getting returned.

      Like

Leave a comment