Skip to content

Commit

Permalink
Added new macros array_append and array_construct and integration tes…
Browse files Browse the repository at this point in the history
…ts (#595)

* Added new macros array_append and create_array and integration tests

* updated casting to use type_string

* joined list togeter

* updated functionality to only allow for single data type in array to be compatible with postgres

* updated test_array_append

* replaced {} with [] for matching in postgres

* fixed redshift

* Added cast_array_to_string macro

* updated postgres cast_array_to_string

* change name of create_array macro to array_construct

* add array_concat macro and integration tests

* updated readme and added empty array_contruct to intergation tests

* added default array data_type for postgres func

* added data_type arg

* postgres fix

* fix variable name

* changed default array data type to integer

* updated default data_type to use api.Column.translate_type
  • Loading branch information
graciegoheen authored Jun 14, 2022
1 parent b4931fe commit ffddc49
Show file tree
Hide file tree
Showing 12 changed files with 249 additions and 3 deletions.
58 changes: 55 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,10 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this
- [last_day](#last_day-source)
- [width_bucket](#width_bucket-source)
- [listagg](#listagg-source)
- [array_construct](#array_construct-source)
- [array_append](#array_append-source)
- [array_concat](#array_concat-source)
- [cast_array_to_string](#cast_array_to_string-source)

- [Jinja Helpers](#jinja-helpers)
- [pretty_time](#pretty_time-source)
Expand Down Expand Up @@ -992,7 +996,7 @@ Boolean values are replaced with the strings 'true'|'false'
| 2017-03-01 | processing | size | S |
| 2017-03-01 | processing | color | red |

**Args**:
**Args:**
- `relation`: The [Relation](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation) to unpivot.
- `cast_to`: The data type to cast the unpivoted values to, default is varchar
- `exclude`: A list of columns to exclude from the unpivot operation but keep in the resulting table.
Expand Down Expand Up @@ -1056,7 +1060,7 @@ This macro calculates the difference between two dates.
#### split_part ([source](macros/cross_db_utils/split_part.sql))
This macro splits a string of text using the supplied delimiter and returns the supplied part number (1-indexed).

**Args**:
**Args:**
- `string_text` (required): Text to be split into parts.
- `delimiter_text` (required): Text representing the delimiter to split by.
- `part_number` (required): Requested part of the split (1-based). If the value is negative, the parts are counted backward from the end of the string.
Expand Down Expand Up @@ -1115,7 +1119,7 @@ When an expression falls outside the range, the function returns:
#### listagg ([source](macros/cross_db_utils/listagg.sql))
This macro returns the concatenated input values from a group of rows separated by a specified deliminator.

**Args**:
**Args:**
- `measure` (required): The expression (typically a column name) that determines the values to be concatenated. To only include distinct values add keyword DISTINCT to beginning of expression (example: 'DISTINCT column_to_agg').
- `delimiter_text` (required): Text representing the delimiter to separate concatenated values by.
- `order_by_clause` (optional): An expression (typically a column name) that determines the order of the concatenated values.
Expand All @@ -1128,6 +1132,54 @@ Note: If there are instances of `delimiter_text` within your `measure`, you cann
{{ dbt_utils.listagg(measure='column_to_agg', delimiter_text="','", order_by_clause="order by order_by_column", limit_num=10) }}
```

#### array_construct ([source](macros/cross_db_utils/array_construct.sql))
This macro returns an array constructed from a set of inputs.

**Args:**
- `inputs` (optional): The list of array contents. If not provided, this macro will create an empty array. All inputs must be the *same data type* in order to match Postgres functionality and *not null* to match Bigquery functionality.
- `data_type` (optional): Specifies the data type of the constructed array. This is only relevant when creating an empty array (will otherwise use the data type of the inputs). If `inputs` are `data_type` are both not provided, this macro will create an empty array of type integer.

**Usage:**
```
{{ dbt_utils.array_construct(['column_1', 'column_2', 'column_3']) }}
{{ dbt_utils.array_construct([],'integer') }}
```

#### array_append ([source](macros/cross_db_utils/array_append.sql))
This macro appends an element to the end of an array and returns the appended array.

**Args:**
- `array` (required): The array to append to.
- `new_element` (required): The element to be appended. This element must *match the data type of the existing elements* in the array in order to match Postgres functionality and *not null* to match Bigquery functionality.

**Usage:**
```
{{ dbt_utils.array_append('array_column', 'element_column') }}
```

#### array_concat ([source](macros/cross_db_utils/array_concat.sql))
This macro returns the concatenation of two arrays.

**Args:**
- `array_1` (required): The array to append to.
- `array_2` (required): The array to be appended to `array_1`. This array must match the data type of `array_1` in order to match Postgres functionality.

**Usage:**
```
{{ dbt_utils.array_concat('array_column_1', 'array_column_2') }}
```

#### cast_array_to_string ([source](macros/cross_db_utils/cast_array_to_string.sql))
This macro converts an array to a single string value and returns the resulting string.

**Args:**
- `array` (required): The array to convert to a string.

**Usage:**
```
{{ dbt_utils.cast_array_to_string('array_column') }}
```

---
### Jinja Helpers
#### pretty_time ([source](macros/jinja_helpers/pretty_time.sql))
Expand Down
2 changes: 2 additions & 0 deletions integration_tests/data/cross_db/data_array_append.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
array_as_string,element,result_as_string
"[1,2,3]",4,"[1,2,3,4]"
2 changes: 2 additions & 0 deletions integration_tests/data/cross_db/data_array_concat.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
array_1_as_string,array_2_as_string,result_as_string
"[1,2,3]","[4,5,6]","[1,2,3,4,5,6]"
3 changes: 3 additions & 0 deletions integration_tests/data/cross_db/data_array_construct.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
num_input_1,num_input_2,num_input_3,result_as_string
1,2,3,"[1,2,3]"
4,5,6,"[4,5,6]"
18 changes: 18 additions & 0 deletions integration_tests/models/cross_db_utils/schema.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,24 @@ models:
- dbt_utils.equality:
compare_model: ref('data_any_value_expected')

- name: test_array_append
tests:
- assert_equal:
actual: actual
expected: expected

- name: test_array_concat
tests:
- assert_equal:
actual: actual
expected: expected

- name: test_array_construct
tests:
- assert_equal:
actual: actual
expected: expected

- name: test_bool_or
tests:
- dbt_utils.equality:
Expand Down
29 changes: 29 additions & 0 deletions integration_tests/models/cross_db_utils/test_array_append.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
with data as (

select
data_array_append.element,
data_array_append.result_as_string,
data_array_construct.num_input_1,
data_array_construct.num_input_2,
data_array_construct.num_input_3
from {{ ref('data_array_append') }} as data_array_append
left join {{ ref('data_array_construct') }} as data_array_construct
on data_array_append.array_as_string = data_array_construct.result_as_string

),

appended_array as (

select
{{ dbt_utils.array_append(dbt_utils.array_construct(['num_input_1', 'num_input_2', 'num_input_3']), 'element') }} as array_actual,
result_as_string as expected
from data

)

-- we need to cast the arrays to strings in order to compare them to the output in our seed file
select
array_actual,
{{ dbt_utils.cast_array_to_string('array_actual') }} as actual,
expected
from appended_array
36 changes: 36 additions & 0 deletions integration_tests/models/cross_db_utils/test_array_concat.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
with data as (

select
data_array_concat.result_as_string,
array_1.num_input_1 as array_1_num_input_1,
array_1.num_input_2 as array_1_num_input_2,
array_1.num_input_3 as array_1_num_input_3,
array_2.num_input_1 as array_2_num_input_1,
array_2.num_input_2 as array_2_num_input_2,
array_2.num_input_3 as array_2_num_input_3
from {{ ref('data_array_concat') }} as data_array_concat
left join {{ ref('data_array_construct') }} as array_1
on data_array_concat.array_1_as_string = array_1.result_as_string
left join {{ ref('data_array_construct') }} as array_2
on data_array_concat.array_2_as_string = array_2.result_as_string

),

concat_array as (

select
{{ dbt_utils.array_concat(
dbt_utils.array_construct(['array_1_num_input_1', 'array_1_num_input_2', 'array_1_num_input_3']),
dbt_utils.array_construct(['array_2_num_input_1', 'array_2_num_input_2', 'array_2_num_input_3'])
) }} as array_actual,
result_as_string as expected
from data

)

-- we need to cast the arrays to strings in order to compare them to the output in our seed file
select
array_actual,
{{ dbt_utils.cast_array_to_string('array_actual') }} as actual,
expected
from concat_array
27 changes: 27 additions & 0 deletions integration_tests/models/cross_db_utils/test_array_construct.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
with data as (

select * from {{ ref('data_array_construct') }}

),

array_construct as (
select
{{ dbt_utils.array_construct(['num_input_1', 'num_input_2', 'num_input_3']) }} as array_actual,
result_as_string as expected

from data

union all

select
{{ dbt_utils.array_construct() }} as array_actual,
'[]' as expected

)

-- we need to cast the arrays to strings in order to compare them to the output in our seed file
select
array_actual,
{{ dbt_utils.cast_array_to_string('array_actual') }} as actual,
expected
from array_construct
16 changes: 16 additions & 0 deletions macros/cross_db_utils/array_append.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{% macro array_append(array, new_element) -%}
{{ return(adapter.dispatch('array_append', 'dbt_utils')(array, new_element)) }}
{%- endmacro %}

{# new_element must be the same data type as elements in array to match postgres functionality #}
{% macro default__array_append(array, new_element) -%}
array_append({{ array }}, {{ new_element }})
{%- endmacro %}

{% macro bigquery__array_append(array, new_element) -%}
{{ dbt_utils.array_concat(array, dbt_utils.array_construct([new_element])) }}
{%- endmacro %}

{% macro redshift__array_append(array, new_element) -%}
{{ dbt_utils.array_concat(array, dbt_utils.array_construct([new_element])) }}
{%- endmacro %}
15 changes: 15 additions & 0 deletions macros/cross_db_utils/array_concat.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
{% macro array_concat(array_1, array_2) -%}
{{ return(adapter.dispatch('array_concat', 'dbt_utils')(array_1, array_2)) }}
{%- endmacro %}

{% macro default__array_concat(array_1, array_2) -%}
array_cat({{ array_1 }}, {{ array_2 }})
{%- endmacro %}

{% macro bigquery__array_concat(array_1, array_2) -%}
array_concat({{ array_1 }}, {{ array_2 }})
{%- endmacro %}

{% macro redshift__array_concat(array_1, array_2) -%}
array_concat({{ array_1 }}, {{ array_2 }})
{%- endmacro %}
24 changes: 24 additions & 0 deletions macros/cross_db_utils/array_construct.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
{% macro array_construct(inputs = [], data_type = api.Column.translate_type('integer')) -%}
{{ return(adapter.dispatch('array_construct', 'dbt_utils')(inputs, data_type)) }}
{%- endmacro %}

{# all inputs must be the same data type to match postgres functionality #}
{% macro default__array_construct(inputs, data_type) -%}
{% if inputs|length > 0 %}
array[ {{ inputs|join(' , ') }} ]
{% else %}
array[]::{{data_type}}[]
{% endif %}
{%- endmacro %}

{% macro snowflake__array_construct(inputs, data_type) -%}
array_construct( {{ inputs|join(' , ') }} )
{%- endmacro %}

{% macro redshift__array_construct(inputs, data_type) -%}
array( {{ inputs|join(' , ') }} )
{%- endmacro %}

{% macro bigquery__array_construct(inputs, data_type) -%}
[ {{ inputs|join(' , ') }} ]
{%- endmacro %}
22 changes: 22 additions & 0 deletions macros/cross_db_utils/cast_array_to_string.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
{% macro cast_array_to_string(array) %}
{{ adapter.dispatch('cast_array_to_string', 'dbt_utils') (array) }}
{% endmacro %}

{% macro default__cast_array_to_string(array) %}
cast({{ array }} as {{ dbt_utils.type_string() }})
{% endmacro %}

{# when casting as array to string, postgres uses {} (ex: {1,2,3}) while other dbs use [] (ex: [1,2,3]) #}
{% macro postgres__cast_array_to_string(array) %}
{%- set array_as_string -%}cast({{ array }} as {{ dbt_utils.type_string() }}){%- endset -%}
{{ dbt_utils.replace(dbt_utils.replace(array_as_string,"'}'","']'"),"'{'","'['") }}
{% endmacro %}

{# redshift should use default instead of postgres #}
{% macro redshift__cast_array_to_string(array) %}
cast({{ array }} as {{ dbt_utils.type_string() }})
{% endmacro %}

{% macro bigquery__cast_array_to_string(array) %}
'['||(select string_agg(cast(element as string), ',') from unnest({{ array }}) element)||']'
{% endmacro %}

0 comments on commit ffddc49

Please sign in to comment.