DataBlend leverages Presto SQL (also known as Trino) and the query builder is capable of using most Presto functions. The only functions that will not be supported are those that are not support by Athena which is the AWS engine that DataBlend leverages. This page is intended to provide visibility into some popular functions and notes on how they work.
regexp_like
Description: Evaluates the regular expression pattern and determines if it is contained within string. Can be used synonymously with CONTAINS. Will return TRUE if found, else FALSE
Example: Match entries that start with 4/5/6/7/8/9. This is often used when you want to isolate GL accounts that start with 5 or higher so that you can apply specific logic to Expense Accounts.
Use ^[456789]{1}.* for regex expression.
Example: Match entries that start with words Richmond, Dogwood, Springfield. This can be used when you want to filter out levels with the names that start with listed words.
Use ^(Word1|Word2|Word3).* for regex expression.
Additional resources can be found: regex101: build, test, and debug regex
regexp_replace first occurrence up until and including “ - “:
To be pasted as second argument: ^(.*?) -
Replace non-latin characters
To be pasted as second argument: [^\a-zA-Z]. For numbers and whitespaces use this [^a-zA-Z0-9\s]
Replace white characters (CR LF → space)
Firstly you need to find ASCII decimal value for the characters to be replaced. Refer to the table below (Dec)
Call chr function on integer to replace the characters.
Use date parameters as filter in a Query
Parameters in MM/DD/YYYY format (string), date in source in M/D/YYYY format. Add in Where tab:
Parse date parameters:
-
String - %m/%d/%Y
-
RelativeDate - %m/%d/%Y %T
Important: When comparing date values of different granularity keep in mind that:
MM/YYYY will be parsed to timestamp MM/01/YYYY00:00:00, so comparison should be done withing the range MM/YYYY <= Date >= MM/last_day_of_month/YYYY.
Last Day of Month:
"DATE_ADD"('day', -1, "DATE_ADD"('month', 1, "date_trunc"('month', "date"("date_format"("date_parse"(CAST("Date" AS varchar), '%m/%d/%Y'), '%Y-%m-%d')))))
DataBlend Platform implementation example:
%m_%Y -> %m/Last_Day/%Y
Last Day of Prior Month:
"DATE_ADD"('day', -1, "date_trunc"('month', "date"("date_format"("date_parse"(CAST("Date" AS varchar), '%m/%d/%Y'), '%Y-%m-%d'))))
JSON string from multiple records
JSON string is required for importing line items as part of journal entries in NetSuite. Use array_agg function.
Parsing RAW JSON
Simple JSON:
We can handle JSON with the use of the "json_extract_scalar" function. JSON that is NOT nested can simply be handled by using json_extract_scalar. It accepts 2 arguments which is the JSON data itself, and then the path of the JSON. The default output from the function converts the JSON to type VARCHAR.
Note: The path number can be changed to alter the level of JSON you wish to select from.
Example → $[0].Local OR $[1].Local OR $[2].Local
RESULTS:
Nested JSON:
To handle nested JSON we will use CROSS JOIN and the UNNEST function so that each “local” value is assigned its own row.
FROM Clause -
Step 1 – In the From Clause utilize the Blue Cog to insert a Cross Join and Join above your current data set that contains the nested JSON.
Step 2 - Select Aliased Relation (Give it an alias, “Unnest” for example) and Relation to Unnest.
Step 3 - Add Expression -> Select Cast
Step 4 - Add 2nd Expression -> Select Function Call and use “json_parse” (Do not include the quotations)
Step 5 - Using Argument 1 add Dereference Expression -> Identifier -> Assign the JSON data column
Step 6 – In the Type Field enter “ARRAY(map(varchar,json))” (Do not include the quotations)
Step 7 - In Column Names use the same name as your assigned JSON data.
SELECT Clause -
Step 1 – Select Type Single Column, and assign any name.
Step 2 - Add a Function Call Expression and use “json_extract_scalar” (Do not include the quotations)
Step 3 – In Argument 1 select Subscript Expression -> Set Base to Dereference Expression -> Parent as Identifier and for Value use the Aliased name of the CROSS JOIN (“Unnest” was used in this demo) -> Set Field to the Column Name you assigned (“CurrencyCode” was used in Demo) -> Set Index to String Literal and Value of “local” (Because this is the KEY field within the JSON we wish to target).
Step 4 – Add Argument 2 as String Literal -> Set Value to “$” (Do not include the quotations)
RESULTS: