Query Functions

DataBlend leverages Presto SQL (also known as Trino) and the query builder is capable of using most Presto functions. The only functions that will not be supported are those that are not support by Athena which is the AWS engine that DataBlend leverages. This page is intended to provide visibility into some popular functions and notes on how they work.

regexp_like

Description: Evaluates the regular expression pattern and determines if it is contained within string. Can be used synonymously with CONTAINS. Will return TRUE if found, else FALSE

Example: Match entries that start with 4/5/6/7/8/9. This is often used when you want to isolate GL accounts that start with 5 or higher so that you can apply specific logic to Expense Accounts.
Use ^[456789]{1}.* for regex expression.

image-20220627-161647

Example: Match entries that start with words Richmond, Dogwood, Springfield. This can be used when you want to filter out levels with the names that start with listed words.
Use ^(Word1|Word2|Word3).* for regex expression.

image-20230324-142558
 

Additional resources can be found: regex101: build, test, and debug regex

regexp_replace first occurrence up until and including “ - “:

To be pasted as second argument: ^(.*?) -

image-20220727-142835

Replace non-latin characters

To be pasted as second argument: [^\a-zA-Z]. For numbers and whitespaces use this [^a-zA-Z0-9\s]

image-20221221-202815 (1)

Replace white characters (CR LF → space)

Firstly you need to find ASCII decimal value for the characters to be replaced. Refer to the table below (Dec)

asciifull (1)

Call chr function on integer to replace the characters.

1 (2)
2 (3)
3 (1)

Use date parameters as filter in a Query

Parameters in MM/DD/YYYY format (string), date in source in M/D/YYYY format. Add in Where tab:

image-20221201-192640

Parse date parameters:

  • String - %m/%d/%Y

  • RelativeDate - %m/%d/%Y %T

Important: When comparing date values of different granularity keep in mind that:
MM/YYYY will be parsed to timestamp MM/01/YYYY00:00:00, so comparison should be done withing the range MM/YYYY <= Date >= MM/last_day_of_month/YYYY.

 

Last Day of Month:

"DATE_ADD"('day', -1, "DATE_ADD"('month', 1, "date_trunc"('month', "date"("date_format"("date_parse"(CAST("Date" AS varchar), '%m/%d/%Y'), '%Y-%m-%d')))))

DataBlend Platform implementation example:
%m_%Y -> %m/Last_Day/%Y

 
Picture1

Picture2

Picture3

Last Day of Prior Month:

"DATE_ADD"('day', -1, "date_trunc"('month', "date"("date_format"("date_parse"(CAST("Date" AS varchar), '%m/%d/%Y'), '%Y-%m-%d'))))

JSON string from multiple records

JSON string is required for importing line items as part of journal entries in NetSuite. Use array_agg function.

image-20230320-132450

Parsing RAW JSON

Simple JSON:

We can handle JSON with the use of the  "json_extract_scalar" function. JSON that is NOT nested can simply be handled by using json_extract_scalar. It accepts 2 arguments which is the JSON data itself, and then the path of the JSON. The default output from the function converts the JSON to type VARCHAR.

image-20231018-134429

Note: The path number can be changed to alter the level of JSON you wish to select from.

Example → $[0].Local OR $[1].Local OR $[2].Local

RESULTS:

image-20231018-135134

Nested JSON:

To handle nested JSON we will use CROSS JOIN and the UNNEST function so that each “local” value is assigned its own row.

FROM Clause -

Step 1 – In the From Clause utilize the Blue Cog to insert a Cross Join and Join above your current data set that contains the nested JSON.

image-20231018-135329

Step 2 -   Select Aliased Relation (Give it an alias, “Unnest” for example) and Relation to Unnest.

Step 3 -   Add Expression -> Select Cast

Step 4 -   Add 2nd Expression -> Select Function Call and use “json_parse” (Do not include the quotations)

Step 5 -  Using Argument 1 add Dereference Expression -> Identifier -> Assign the JSON data column

Step 6 – In the Type Field enter “ARRAY(map(varchar,json))” (Do not include the quotations)

Step 7 -  In Column Names use the same name as your assigned JSON data.

image-20231018-135356
image-20231018-135417

SELECT Clause -

Step 1 – Select Type Single Column, and assign any name.

Step 2  - Add a Function Call Expression and use “json_extract_scalar” (Do not include the quotations)

Step 3 – In Argument 1 select Subscript Expression -> Set Base to Dereference Expression -> Parent as Identifier and for Value use the Aliased name of the CROSS JOIN (“Unnest” was used in this demo) -> Set Field to the Column Name you assigned (“CurrencyCode” was used in Demo) -> Set Index to String Literal and Value of “local” (Because this is the KEY field within the JSON we wish to target).

Step 4 – Add Argument 2 as String Literal -> Set Value to “$” (Do not include the quotations)

 
image-20231018-140510

image-20231018-140536

RESULTS:

 

image-20231018-140822