Unit testing feature branch pull request (dbt-labs#8411)
* Initial implementation of unit testing (from pr dbt-labs#2911)

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>

* 8295 unit testing artifacts (dbt-labs#8477)

* unit test config: tags & meta (dbt-labs#8565)

* Add additional functional test for unit testing selection, artifacts, etc (dbt-labs#8639)

* Enable inline csv format in unit testing (dbt-labs#8743)

* Support unit testing incremental models (dbt-labs#8891)

* update unit test key: unit -> unit-tests (dbt-labs#8988)


* convert to use unit test name at top level key (dbt-labs#8966)

* csv file fixtures (dbt-labs#9044)

* Unit test support for `state:modified` and `--defer` (dbt-labs#9032)

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>

* Allow use of sources as unit testing inputs (dbt-labs#9059)

* Use daff for diff formatting in unit testing (dbt-labs#8984)

* Fix dbt-labs#8652: Use seed file from disk for unit testing if rows not specified in YAML config (dbt-labs#9064)

Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Fix dbt-labs#8652: Use seed value if rows not specified

* Move unit testing to test and build commands (dbt-labs#9108)

* Enable unit testing in non-root packages (dbt-labs#9184)

* convert test to data_test (dbt-labs#9201)

* Make fixtures files full-fledged members of manifest and enable partial parsing (dbt-labs#9225)

* In build command run unit tests before models (dbt-labs#9273)

---------

Co-authored-by: Michelle Ark <michelle.ark@dbtlabs.com>
Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
Co-authored-by: Emily Rockman <emily.rockman@dbtlabs.com>
Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
6 people authored Jan 16, 2024
1 parent 15704ab commit b5a0c4c
Showing 167 changed files with 12,171 additions and 4,083 deletions.
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20230802-145011.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Initial implementation of unit testing
+time: 2023-08-02T14:50:11.391992-04:00
+custom:
+  Author: gshank
+  Issue: "8287"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20230828-101825.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Unit test manifest artifacts and selection
+time: 2023-08-28T10:18:25.958929-04:00
+custom:
+  Author: gshank
+  Issue: "8295"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20230906-234741.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Support config with tags & meta for unit tests
+time: 2023-09-06T23:47:41.059915-04:00
+custom:
+  Author: michelleark
+  Issue: "8294"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20230928-163205.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Enable inline csv fixtures in unit tests
+time: 2023-09-28T16:32:05.573776-04:00
+custom:
+  Author: gshank
+  Issue: "8626"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231101-101845.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Support unit testing incremental models
+time: 2023-11-01T10:18:45.341781-04:00
+custom:
+  Author: michelleark
+  Issue: "8422"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231106-194752.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Add support of csv file fixtures to unit testing
+time: 2023-11-06T19:47:52.501495-06:00
+custom:
+  Author: emmyoop
+  Issue: "8290"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231107-231006.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Unit tests support --defer and state:modified
+time: 2023-11-07T23:10:06.376588-05:00
+custom:
+  Author: jtcohen6
+  Issue: "8517"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231111-191150.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Support source inputs in unit tests
+time: 2023-11-11T19:11:50.870494-05:00
+custom:
+  Author: gshank
+  Issue: "8507"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231114-101555.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Use daff to render diff displayed in stdout when unit test fails
+time: 2023-11-14T10:15:55.689307-05:00
+custom:
+  Author: michelleark
+  Issue: "8558"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231116-144006.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Move unit testing to test command
+time: 2023-11-16T14:40:06.121336-05:00
+custom:
+  Author: gshank
+  Issue: "8979"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231130-130948.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Support unit tests in non-root packages
+time: 2023-11-30T13:09:48.206007-05:00
+custom:
+  Author: gshank
+  Issue: "8285"
7 changes: 7 additions & 0 deletions .changes/unreleased/Features-20231205-131717.yaml
@@ -0,0 +1,7 @@
+kind: Features
+body: Convert the `tests` config to `data_tests` in both dbt_project.yml and schema files.
+  in schema files.
+time: 2023-12-05T13:17:17.647765-06:00
+custom:
+  Author: emmyoop
+  Issue: "8699"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231205-200447.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: Make fixture files full-fledged parts of the manifest and enable partial parsing
+time: 2023-12-05T20:04:47.117029-05:00
+custom:
+  Author: gshank
+  Issue: "9067"
6 changes: 6 additions & 0 deletions .changes/unreleased/Features-20231212-150556.yaml
@@ -0,0 +1,6 @@
+kind: Features
+body: In build command run unit tests before models
+time: 2023-12-12T15:05:56.778829-05:00
+custom:
+  Author: gshank
+  Issue: "9128"
6 changes: 6 additions & 0 deletions .changes/unreleased/Fixes-20231113-154535.yaml
@@ -0,0 +1,6 @@
+kind: Fixes
+body: Use seed file from disk for unit testing if rows not specified in YAML config
+time: 2023-11-13T15:45:35.008565Z
+custom:
+  Author: aranke
+  Issue: "8652"
6 changes: 6 additions & 0 deletions .changes/unreleased/Under the Hood-20230912-190506.yaml
@@ -0,0 +1,6 @@
+kind: Under the Hood
+body: Add unit testing functional tests
+time: 2023-09-12T19:05:06.023126-04:00
+custom:
+  Author: gshank
+  Issue: "8512"
2 changes: 1 addition & 1 deletion core/dbt/adapters/base/relation.py
@@ -214,7 +214,7 @@ def add_ephemeral_prefix(name: str):
     def create_ephemeral_from(
         cls: Type[Self],
         relation_config: RelationConfig,
-        limit: Optional[int],
+        limit: Optional[int] = None,
     ) -> Self:
         # Note that ephemeral models are based on the name.
         identifier = cls.add_ephemeral_prefix(relation_config.name)
374 changes: 185 additions & 189 deletions core/dbt/adapters/events/adapter_types_pb2.py

Large diffs are not rendered by default.

@@ -12,3 +12,31 @@
       {{ "limit " ~ limit if limit != none }}
     ) dbt_internal_test
 {%- endmacro %}
+
+
+{% macro get_unit_test_sql(main_sql, expected_fixture_sql, expected_column_names) -%}
+  {{ adapter.dispatch('get_unit_test_sql', 'dbt')(main_sql, expected_fixture_sql, expected_column_names) }}
+{%- endmacro %}
+
+{% macro default__get_unit_test_sql(main_sql, expected_fixture_sql, expected_column_names) -%}
+-- Build actual result given inputs
+with dbt_internal_unit_test_actual AS (
+  select
+    {% for expected_column_name in expected_column_names %}{{expected_column_name}}{% if not loop.last -%},{% endif %}{%- endfor -%}, {{ dbt.string_literal("actual") }} as actual_or_expected
+  from (
+    {{ main_sql }}
+  ) _dbt_internal_unit_test_actual
+),
+-- Build expected result
+dbt_internal_unit_test_expected AS (
+  select
+    {% for expected_column_name in expected_column_names %}{{expected_column_name}}{% if not loop.last -%}, {% endif %}{%- endfor -%}, {{ dbt.string_literal("expected") }} as actual_or_expected
+  from (
+    {{ expected_fixture_sql }}
+  ) _dbt_internal_unit_test_expected
+)
+-- Union actual and expected results
+select * from dbt_internal_unit_test_actual
+union all
+select * from dbt_internal_unit_test_expected
+{%- endmacro %}
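Reviewer note: `default__get_unit_test_sql` projects the expected columns from both the model query and the fixture, tags every row as `actual` or `expected`, and unions the two sets. A minimal Python sketch of that shape follows. `tag_and_union` and `rows_match` are hypothetical names, and the multiset comparison is an assumption of this sketch: the excerpt does not show how the unioned rows are consumed afterwards.

```python
from collections import Counter
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]
TAG = "actual_or_expected"  # same tag column name the macro emits


def tag_and_union(actual: List[Row], expected: List[Row], columns: Sequence[str]) -> List[Row]:
    """Keep only the expected columns, tag each row, and concatenate (union all)."""

    def project(rows: List[Row], label: str) -> List[Row]:
        return [{**{c: row[c] for c in columns}, TAG: label} for row in rows]

    return project(actual, "actual") + project(expected, "expected")


def rows_match(unioned: List[Row]) -> bool:
    """Hypothetical consumer: split on the tag and compare both sides as multisets."""

    def key(row: Row) -> Tuple[Tuple[str, Any], ...]:
        return tuple((k, v) for k, v in sorted(row.items()) if k != TAG)

    counts = {"actual": Counter(), "expected": Counter()}
    for row in unioned:
        counts[row[TAG]][key(row)] += 1
    return counts["actual"] == counts["expected"]
```

Columns produced by the model but absent from `columns` are dropped before comparison, which mirrors the macro selecting only `expected_column_names`.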
@@ -0,0 +1,29 @@
+{%- materialization unit, default -%}
+
+  {% set relations = [] %}
+
+  {% set expected_rows = config.get('expected_rows') %}
+  {% set tested_expected_column_names = expected_rows[0].keys() if (expected_rows | length ) > 0 else get_columns_in_query(sql) %}
+
+  {%- set target_relation = this.incorporate(type='table') -%}
+  {%- set temp_relation = make_temp_relation(target_relation)-%}
+  {% do run_query(get_create_table_as_sql(True, temp_relation, get_empty_subquery_sql(sql))) %}
+  {%- set columns_in_relation = adapter.get_columns_in_relation(temp_relation) -%}
+  {%- set column_name_to_data_types = {} -%}
+  {%- for column in columns_in_relation -%}
+    {%- do column_name_to_data_types.update({column.name: column.dtype}) -%}
+  {%- endfor -%}
+
+  {% set unit_test_sql = get_unit_test_sql(sql, get_expected_sql(expected_rows, column_name_to_data_types), tested_expected_column_names) %}
+
+  {% call statement('main', fetch_result=True) -%}
+
+    {{ unit_test_sql }}
+
+  {%- endcall %}
+
+  {% do adapter.drop_relation(temp_relation) %}
+
+  {{ return({'relations': relations}) }}
+
+{%- endmaterialization -%}
@@ -0,0 +1,76 @@
+{% macro get_fixture_sql(rows, column_name_to_data_types) %}
+-- Fixture for {{ model.name }}
+{% set default_row = {} %}
+
+{%- if not column_name_to_data_types -%}
+  {%- set columns_in_relation = adapter.get_columns_in_relation(this) -%}
+  {%- set column_name_to_data_types = {} -%}
+  {%- for column in columns_in_relation -%}
+    {%- do column_name_to_data_types.update({column.name: column.dtype}) -%}
+  {%- endfor -%}
+{%- endif -%}
+
+{%- if not column_name_to_data_types -%}
+  {{ exceptions.raise_compiler_error("Not able to get columns for unit test '" ~ model.name ~ "' from relation " ~ this) }}
+{%- endif -%}
+
+{%- for column_name, column_type in column_name_to_data_types.items() -%}
+  {%- do default_row.update({column_name: (safe_cast("null", column_type) | trim )}) -%}
+{%- endfor -%}
+
+{%- for row in rows -%}
+  {%- do format_row(row, column_name_to_data_types) -%}
+  {%- set default_row_copy = default_row.copy() -%}
+  {%- do default_row_copy.update(row) -%}
+select
+  {%- for column_name, column_value in default_row_copy.items() %} {{ column_value }} AS {{ column_name }}{% if not loop.last -%}, {%- endif %}
+  {%- endfor %}
+  {%- if not loop.last %}
+union all
+  {% endif %}
+{%- endfor -%}
+
+{%- if (rows | length) == 0 -%}
+select
+  {%- for column_name, column_value in default_row.items() %} {{ column_value }} AS {{ column_name }}{% if not loop.last -%},{%- endif %}
+  {%- endfor %}
+limit 0
+{%- endif -%}
+{% endmacro %}
+
+
+{% macro get_expected_sql(rows, column_name_to_data_types) %}
+
+{%- if (rows | length) == 0 -%}
+select * FROM dbt_internal_unit_test_actual
+limit 0
+{%- else -%}
+  {%- for row in rows -%}
+    {%- do format_row(row, column_name_to_data_types) -%}
+select
+    {%- for column_name, column_value in row.items() %} {{ column_value }} AS {{ column_name }}{% if not loop.last -%}, {%- endif %}
+    {%- endfor %}
+    {%- if not loop.last %}
+union all
+    {% endif %}
+  {%- endfor -%}
+{%- endif -%}
+
+{% endmacro %}
+
+{%- macro format_row(row, column_name_to_data_types) -%}
+
+{#-- wrap yaml strings in quotes, apply cast --#}
+{%- for column_name, column_value in row.items() -%}
+  {% set row_update = {column_name: column_value} %}
+  {%- if column_value is string -%}
+    {%- set row_update = {column_name: safe_cast(dbt.string_literal(column_value), column_name_to_data_types[column_name]) } -%}
+  {%- elif column_value is none -%}
+    {%- set row_update = {column_name: safe_cast('null', column_name_to_data_types[column_name]) } -%}
+  {%- else -%}
+    {%- set row_update = {column_name: safe_cast(column_value, column_name_to_data_types[column_name]) } -%}
+  {%- endif -%}
+  {%- do row.update(row_update) -%}
+{%- endfor -%}
+
+{%- endmacro -%}
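Reviewer note: `get_fixture_sql` builds a default row of typed nulls, overlays each fixture row on a copy of it, and joins the resulting `select` statements with `union all`; with no rows it emits a single typed-null `select` with `limit 0`. The sketch below reproduces that logic in plain Python as an illustration only. `cast_sql` is a plain `CAST` standing in for `safe_cast`, the quote doubling in `format_value` is an assumption of this sketch rather than something the macros show, and all function names are hypothetical.

```python
from typing import Any, Dict, List


def cast_sql(value_sql: str, data_type: str) -> str:
    # Stand-in for safe_cast; a plain CAST is an assumption of this sketch.
    return f"cast({value_sql} as {data_type})"


def format_value(value: Any, data_type: str) -> str:
    """Mirror format_row: null stays null, strings are quoted, everything is cast."""
    if value is None:
        return cast_sql("null", data_type)
    if isinstance(value, str):
        escaped = value.replace("'", "''")  # assumed escaping, not shown in the macros
        return cast_sql(f"'{escaped}'", data_type)
    return cast_sql(str(value), data_type)


def fixture_sql(rows: List[Dict[str, Any]], column_types: Dict[str, str]) -> str:
    """Render fixture rows as SELECTs, padding missing columns with typed nulls."""
    default_row = {name: cast_sql("null", dtype) for name, dtype in column_types.items()}
    if not rows:
        cols = ", ".join(f"{value} as {name}" for name, value in default_row.items())
        return f"select {cols} limit 0"
    selects = []
    for row in rows:
        merged = dict(default_row)  # copy so each row starts from the typed nulls
        merged.update({k: format_value(v, column_types[k]) for k, v in row.items()})
        cols = ", ".join(f"{value} as {name}" for name, value in merged.items())
        selects.append(f"select {cols}")
    return "\nunion all\n".join(selects)
```

Because every row is overlaid on the same default row, a fixture only needs to name the columns a test cares about; the rest come through as nulls of the right type.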
2 changes: 1 addition & 1 deletion core/dbt/adapters/relation_configs/config_change.py
@@ -12,7 +12,7 @@ class RelationConfigChangeAction(StrEnum):
     drop = "drop"


-@dataclass(frozen=True, eq=True, unsafe_hash=True)
+@dataclass(frozen=True, eq=True, unsafe_hash=True)  # type: ignore
 class RelationConfigChange(RelationConfigBase, ABC):
     action: RelationConfigChangeAction
     context: Hashable  # this is usually a RelationConfig, e.g. IndexConfig, but shouldn't be limited
20 changes: 20 additions & 0 deletions core/dbt/clients/jinja.py
@@ -84,6 +84,26 @@ def __call__(self, *args, **kwargs):
         return self.call_macro(*args, **kwargs)


+class UnitTestMacroGenerator(MacroGenerator):
+    # this makes UnitTestMacroGenerator objects callable like functions
+    def __init__(
+        self,
+        macro_generator: MacroGenerator,
+        call_return_value: Any,
+    ) -> None:
+        super().__init__(
+            macro_generator.macro,
+            macro_generator.context,
+            macro_generator.node,
+            macro_generator.stack,
+        )
+        self.call_return_value = call_return_value
+
+    def __call__(self, *args, **kwargs):
+        with self.track_call():
+            return self.call_return_value
+
+
 # performance note: Local benchmarking (so take it with a big grain of salt!)
 # on this indicates that it is on average slightly slower than
 # checking two separate patterns, but the standard deviation is smaller with
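Reviewer note: `UnitTestMacroGenerator` wraps an existing macro generator, keeps its call tracking, and returns a fixed value instead of rendering the macro. The pattern can be sketched with self-contained stand-ins. `MacroLike` and `FixedReturnMacro` are hypothetical classes written for this illustration and are not dbt's `MacroGenerator` API.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator, List


class MacroLike:
    """Hypothetical stand-in for a macro generator: callable, with call-stack tracking."""

    def __init__(self, name: str, func: Callable[..., Any], stack: List[str]) -> None:
        self.name = name
        self.func = func
        self.stack = stack

    @contextmanager
    def track_call(self) -> Iterator[None]:
        # Push on entry and always pop on exit, even if the call raises.
        self.stack.append(self.name)
        try:
            yield
        finally:
            self.stack.pop()

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        with self.track_call():
            return self.func(*args, **kwargs)


class FixedReturnMacro(MacroLike):
    """Same identity and tracking as the wrapped macro, but a canned return value."""

    def __init__(self, wrapped: MacroLike, call_return_value: Any) -> None:
        super().__init__(wrapped.name, wrapped.func, wrapped.stack)
        self.call_return_value = call_return_value

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        with self.track_call():
            return self.call_return_value
```

A usage example under the same assumptions: wrapping a macro that normally returns `False` so that a test sees `True`, whatever arguments are passed.

```python
stack: List[str] = []
is_incremental = MacroLike("is_incremental", lambda: False, stack)
forced = FixedReturnMacro(is_incremental, True)
result = forced()  # the wrapped function is never invoked
```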
26 changes: 21 additions & 5 deletions core/dbt/compilation.py
@@ -11,8 +11,11 @@
 from dbt.flags import get_flags
 from dbt.adapters.factory import get_adapter
 from dbt.clients import jinja
+from dbt.context.providers import (
+    generate_runtime_model_context,
+    generate_runtime_unit_test_context,
+)
 from dbt_common.clients.system import make_directory
-from dbt.context.providers import generate_runtime_model_context
 from dbt.contracts.graph.manifest import Manifest, UniqueID
 from dbt.contracts.graph.nodes import (
     ManifestNode,
@@ -21,6 +24,8 @@
     GraphMemberNode,
     InjectedCTE,
     SeedNode,
+    UnitTestNode,
+    UnitTestDefinition,
 )
 from dbt.exceptions import (
     GraphDependencyNotFoundError,
@@ -43,7 +48,8 @@
 def print_compile_stats(stats):
     names = {
         NodeType.Model: "model",
-        NodeType.Test: "test",
+        NodeType.Test: "data test",
+        NodeType.Unit: "unit test",
         NodeType.Snapshot: "snapshot",
         NodeType.Analysis: "analysis",
         NodeType.Macro: "macro",
@@ -91,6 +97,7 @@ def _generate_stats(manifest: Manifest):
     stats[NodeType.Macro] += len(manifest.macros)
     stats[NodeType.Group] += len(manifest.groups)
     stats[NodeType.SemanticModel] += len(manifest.semantic_models)
+    stats[NodeType.Unit] += len(manifest.unit_tests)

     # TODO: should we be counting dimensions + entities?

@@ -128,7 +135,7 @@ class Linker:
     def __init__(self, data=None) -> None:
         if data is None:
             data = {}
-        self.graph = nx.DiGraph(**data)
+        self.graph: nx.DiGraph = nx.DiGraph(**data)

     def edges(self):
         return self.graph.edges()
@@ -191,6 +198,8 @@ def link_graph(self, manifest: Manifest):
             self.link_node(exposure, manifest)
         for metric in manifest.metrics.values():
             self.link_node(metric, manifest)
+        for unit_test in manifest.unit_tests.values():
+            self.link_node(unit_test, manifest)
         for saved_query in manifest.saved_queries.values():
             self.link_node(saved_query, manifest)

@@ -234,6 +243,7 @@ def add_test_edges(self, manifest: Manifest) -> None:
                 # Get all tests that depend on any upstream nodes.
                 upstream_tests = []
                 for upstream_node in upstream_nodes:
+                    # This gets tests with unique_ids starting with "test."
                     upstream_tests += _get_tests_for_node(manifest, upstream_node)

                 for upstream_test in upstream_tests:
@@ -291,8 +301,10 @@ def _create_node_context(
         manifest: Manifest,
         extra_context: Dict[str, Any],
     ) -> Dict[str, Any]:
-
-        context = generate_runtime_model_context(node, self.config, manifest)
+        if isinstance(node, UnitTestNode):
+            context = generate_runtime_unit_test_context(node, self.config, manifest)
+        else:
+            context = generate_runtime_model_context(node, self.config, manifest)
         context.update(extra_context)

         if isinstance(node, GenericTestNode):
@@ -460,6 +472,7 @@ def compile(self, manifest: Manifest, write=True, add_test_edges=False) -> Graph
         summaries["_invocation_id"] = get_invocation_id()
         summaries["linked"] = linker.get_graph_summary(manifest)

+        # This is only called for the "build" command
         if add_test_edges:
             manifest.build_parent_and_child_maps()
             linker.add_test_edges(manifest)
@@ -526,6 +539,9 @@ def compile_node(
         the node's raw_code into compiled_code, and then calls the
         recursive method to "prepend" the ctes.
         """
+        if isinstance(node, UnitTestDefinition):
+            return node
+
         # Make sure Lexer for sqlparse 0.4.4 is initialized
         from sqlparse.lexer import Lexer  # type: ignore

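Reviewer note: two behaviours in the `compilation.py` hunks are worth seeing side by side. `_create_node_context` now chooses the context builder by node type, and `compile_node` returns a `UnitTestDefinition` untouched instead of compiling it. The sketch below shows that dispatch with stand-in dataclasses. The class hierarchy, the context dictionaries, and the way `compiled_code` is produced are assumptions made for illustration and do not reflect dbt's real node classes.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


@dataclass
class ModelNode:
    name: str
    raw_code: str
    compiled_code: Optional[str] = None


@dataclass
class UnitTestNode(ModelNode):
    """Hypothetical compilable form of a unit test."""


@dataclass
class UnitTestDefinition:
    """Hypothetical parsed definition; it has nothing to compile."""

    name: str


def model_context(node: ModelNode) -> Dict[str, Any]:
    return {"context_kind": "model", "this": node.name}


def unit_test_context(node: UnitTestNode) -> Dict[str, Any]:
    return {"context_kind": "unit_test", "this": node.name}


def create_node_context(node: ModelNode, extra_context: Dict[str, Any]) -> Dict[str, Any]:
    # Check the more specific type first, then fall back to the model context.
    if isinstance(node, UnitTestNode):
        context = unit_test_context(node)
    else:
        context = model_context(node)
    context.update(extra_context)
    return context


def compile_node(node: Union[ModelNode, UnitTestDefinition]) -> Union[ModelNode, UnitTestDefinition]:
    # Definitions pass through unchanged, as in the early return added by this commit.
    if isinstance(node, UnitTestDefinition):
        return node
    context = create_node_context(node, {})
    node.compiled_code = f"/* {context['context_kind']} */ {node.raw_code}"
    return node
```

In this sketch `UnitTestNode` subclasses `ModelNode`, so the `isinstance` check for the unit test type has to come before the fallback branch; reversing the order would silently route unit tests to the model context.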