Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(#4884) Add support for ratio metrics #5027

Merged
merged 32 commits into from
Jul 6, 2022
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
72cad80
wip
drewbanin Mar 16, 2022
cf23d65
Merge branch 'main' of github.com:fishtown-analytics/dbt into feature…
drewbanin Apr 7, 2022
d43e859
Merge branch 'main' into feature/metric-improvements
drewbanin Apr 8, 2022
8dcfeb1
More support for ratio metrics
drewbanin Apr 11, 2022
aae1c81
Formatting and linting
drewbanin Apr 13, 2022
4d4198c
Fix unit tests
drewbanin Apr 14, 2022
a57d3b0
Support disabling metrics
drewbanin Apr 14, 2022
395393e
mypy
drewbanin Apr 14, 2022
55e7ab7
address all TODOs
drewbanin Apr 14, 2022
a9e839e
make pypy happy
drewbanin Apr 14, 2022
3a8385b
Merge branch 'main' into feature/metric-improvements
drewbanin May 31, 2022
db35e88
wip
drewbanin Jun 2, 2022
d6e886c
checkpoint
drewbanin Jun 3, 2022
7af4c51
refactor, remove ratio_terms
drewbanin Jun 3, 2022
32c3c53
flake8 and unit tests
drewbanin Jun 3, 2022
0344a20
remove debugger
drewbanin Jun 8, 2022
bee3eea
quickfix for filters
drewbanin Jun 9, 2022
934605e
Experiment with functional testing for 'expression' metrics
jtcohen6 Jun 16, 2022
06eb926
reformatting slightly
callum-mcdata Jun 28, 2022
4461d7a
make file and mypy fix
emmyoop Jun 29, 2022
31cfa7f
remove config from metrics - wip
emmyoop Jun 29, 2022
60ef314
add metrics back to context
emmyoop Jun 29, 2022
cd957a6
adding test changes
callum-mcdata Jun 29, 2022
ba3a78c
fixing test metrics
callum-mcdata Jun 29, 2022
54c38d0
revert name audit
emmyoop Jun 29, 2022
7b03989
Merge branch 'feature/metric-improvements' of https://github.com/dbt-…
emmyoop Jun 29, 2022
46679e2
pre-commit fixes
emmyoop Jun 29, 2022
b8a7f99
add changelog
emmyoop Jul 5, 2022
e235ab7
Bumping manifest version to v6 (#5430)
leahwicz Jul 5, 2022
c28628f
Merge branch 'main' of https://github.com/dbt-labs/dbt-core into feat…
emmyoop Jul 5, 2022
68f7708
add v5 to backwards compatibility
emmyoop Jul 5, 2022
e808272
Clean up test_previous_version_state, update for v6 (#5440)
jtcohen6 Jul 6, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions core/dbt/compilation.py
Original file line number Diff line number Diff line change
Expand Up @@ -397,6 +397,8 @@ def link_node(self, linker: Linker, node: GraphMemberNode, manifest: Manifest):
linker.dependency(node.unique_id, (manifest.nodes[dependency].unique_id))
elif dependency in manifest.sources:
linker.dependency(node.unique_id, (manifest.sources[dependency].unique_id))
elif dependency in manifest.metrics:
linker.dependency(node.unique_id, (manifest.metrics[dependency].unique_id))
else:
dependency_not_found(node, dependency)

Expand Down
93 changes: 92 additions & 1 deletion core/dbt/context/providers.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,9 @@
ParsedSeedNode,
ParsedSourceDefinition,
)
from dbt.contracts.graph.metrics import (
MetricReference, ResolvedMetricReference
)
from dbt.exceptions import (
CompilationException,
ParsingException,
Expand Down Expand Up @@ -197,7 +200,7 @@ def Relation(self):
return self.db_wrapper.Relation

@abc.abstractmethod
def __call__(self, *args: str) -> Union[str, RelationProxy]:
def __call__(self, *args: str) -> Union[str, RelationProxy, MetricReference]:
pass


Expand Down Expand Up @@ -263,6 +266,42 @@ def __call__(self, *args: str) -> RelationProxy:
return self.resolve(args[0], args[1])


class BaseMetricResolver(BaseResolver):
def resolve(self, name: str, package: Optional[str] = None) -> MetricReference:
...

def _repack_args(self, name: str, package: Optional[str]) -> List[str]:
if package is None:
return [name]
else:
return [package, name]

def validate_args(self, name: str, package: Optional[str]):
if not isinstance(name, str):
raise CompilationException(
f"The name argument to metric() must be a string, got " f"{type(name)}"
emmyoop marked this conversation as resolved.
Show resolved Hide resolved
)

if package is not None and not isinstance(package, str):
raise CompilationException(
f"The package argument to metric() must be a string or None, got " f"{type(package)}"
)

def __call__(self, *args: str) -> MetricReference:
name: str
package: Optional[str] = None

if len(args) == 1:
name = args[0]
elif len(args) == 2:
package, name = args
else:
# TODO: Use a different error!
ref_invalid_args(self.model, args)
self.validate_args(name, package)
return self.resolve(name, package)


class Config(Protocol):
def __init__(self, model, context_config: Optional[ContextConfig]):
...
Expand Down Expand Up @@ -509,6 +548,35 @@ def resolve(self, source_name: str, table_name: str):
return self.Relation.create_from_source(target_source)


# metric` implementations
class ParseMetricResolver(BaseMetricResolver):
def resolve(self, name: str, package: Optional[str] = None) -> MetricReference:
self.model.metrics.append(self._repack_args(name, package))

return MetricReference(name, package)


class RuntimeMetricResolver(BaseMetricResolver):
def resolve(self, target_name: str, target_package: Optional[str] = None) -> MetricReference:
target_metric = self.manifest.resolve_metric(
target_name,
target_package,
self.current_project,
self.model.package_name,
)

if target_metric is None or isinstance(target_metric, Disabled):
# TODO : Use a different exception!!
ref_target_not_found(
self.model,
target_name,
target_package,
disabled=isinstance(target_metric, Disabled),
)

return ResolvedMetricReference(target_metric)


# `var` implementations.
class ModelConfiguredVar(Var):
def __init__(
Expand Down Expand Up @@ -566,6 +634,7 @@ class Provider(Protocol):
Var: Type[ModelConfiguredVar]
ref: Type[BaseRefResolver]
source: Type[BaseSourceResolver]
metric: Type[BaseMetricResolver]


class ParseProvider(Provider):
Expand All @@ -575,6 +644,7 @@ class ParseProvider(Provider):
Var = ParseVar
ref = ParseRefResolver
source = ParseSourceResolver
metric = ParseMetricResolver


class GenerateNameProvider(Provider):
Expand All @@ -584,6 +654,7 @@ class GenerateNameProvider(Provider):
Var = RuntimeVar
ref = ParseRefResolver
source = ParseSourceResolver
metric = ParseMetricResolver


class RuntimeProvider(Provider):
Expand All @@ -593,6 +664,7 @@ class RuntimeProvider(Provider):
Var = RuntimeVar
ref = RuntimeRefResolver
source = RuntimeSourceResolver
metric = RuntimeMetricResolver


class OperationProvider(RuntimeProvider):
Expand Down Expand Up @@ -776,6 +848,10 @@ def ref(self) -> Callable:
def source(self) -> Callable:
return self.provider.source(self.db_wrapper, self.model, self.config, self.manifest)

@contextproperty
def metric(self) -> Callable:
return self.provider.metric(self.db_wrapper, self.model, self.config, self.manifest)

@contextproperty("config")
def ctx_config(self) -> Config:
"""The `config` variable exists to handle end-user configuration for
Expand Down Expand Up @@ -1286,6 +1362,15 @@ def generate_runtime_macro_context(
return ctx.to_dict()


def generate_runtime_metric_context(
metric: ParsedMetric,
config: RuntimeConfig,
manifest: Manifest,
) -> Dict[str, Any]:
ctx = ProviderContext(metric, config, manifest, RuntimeProvider(), None)
return ctx.to_dict()


class ExposureRefResolver(BaseResolver):
def __call__(self, *args) -> str:
if len(args) not in (1, 2):
Expand Down Expand Up @@ -1362,6 +1447,12 @@ def generate_parse_metrics(
project,
manifest,
),
"metric": ParseMetricResolver(
None,
metric,
project,
manifest,
),
}


Expand Down
71 changes: 71 additions & 0 deletions core/dbt/contracts/graph/manifest.py
Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,39 @@ def perform_lookup(self, unique_id: UniqueID, manifest) -> ManifestNode:
return manifest.nodes[unique_id]


class MetricLookup(dbtClassMixin):
def __init__(self, manifest: "Manifest"):
self.storage: Dict[str, Dict[PackageName, UniqueID]] = {}
self.populate(manifest)

def get_unique_id(self, search_name, package: Optional[PackageName]):
return find_unique_id_for_package(self.storage, search_name, package)

def find(self, search_name, package: Optional[PackageName], manifest: "Manifest"):
unique_id = self.get_unique_id(search_name, package)
if unique_id is not None:
return self.perform_lookup(unique_id, manifest)
return None

def add_metric(self, metric: ParsedMetric):
if metric.search_name not in self.storage:
self.storage[metric.search_name] = {}

self.storage[metric.search_name][metric.package_name] = metric.unique_id

def populate(self, manifest):
for metric in manifest.metrics.values():
if hasattr(metric, "name"):
self.add_metric(metric)

def perform_lookup(self, unique_id: UniqueID, manifest: "Manifest") -> ParsedMetric:
if unique_id not in manifest.metrics:
raise dbt.exceptions.InternalException(
f"Metric {unique_id} found in cache but not found in manifest"
)
return manifest.metrics[unique_id]


# This handles both models/seeds/snapshots and sources
class DisabledLookup(dbtClassMixin):
def __init__(self, manifest: "Manifest"):
Expand Down Expand Up @@ -434,6 +467,9 @@ class Disabled(Generic[D]):
target: D


MaybeMetricNode = Optional[ParsedMetric]


MaybeDocumentation = Optional[ParsedDocumentation]


Expand Down Expand Up @@ -595,6 +631,9 @@ class Manifest(MacroMethods, DataClassMessagePackMixin, dbtClassMixin):
_ref_lookup: Optional[RefableLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
_metric_lookup: Optional[MetricLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
_disabled_lookup: Optional[DisabledLookup] = field(
default=None, metadata={"serialize": lambda x: None, "deserialize": lambda x: None}
)
Expand Down Expand Up @@ -833,6 +872,12 @@ def ref_lookup(self) -> RefableLookup:
self._ref_lookup = RefableLookup(self)
return self._ref_lookup

@property
def metric_lookup(self) -> MetricLookup:
if self._metric_lookup is None:
self._metric_lookup = MetricLookup(self)
return self._metric_lookup

def rebuild_ref_lookup(self):
self._ref_lookup = RefableLookup(self)

Expand Down Expand Up @@ -908,6 +953,31 @@ def resolve_source(
return Disabled(disabled[0])
return None

def resolve_metric(
self,
target_metric_name: str,
target_metric_package: Optional[str],
current_project: str,
node_package: str,
) -> MaybeMetricNode:
metric: Optional[ParsedMetric] = None
disabled: Optional[List[ParsedMetric]] = None

candidates = _search_packages(current_project, node_package, target_metric_package)
for pkg in candidates:
metric = self.metric_lookup.find(target_metric_name, pkg, self)
if metric is not None:
# Skip if the metric is disabled!
return metric

# TODO: Figure this part out...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we allow other yaml-only resources (like sources) to be disabled? If so, we should implement this part

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sources can be disabled, exposures cannot (but we should support this). In the past, it's only been possible to disable a source via the sources: block in dbt_project.yml.

In v1.1, we'll be adding the ability to disable a source inline with its definition (#5008):

sources:
  - name: my_source
    config:
      enabled: False

# if disabled is None:
# disabled = self.disabled_lookup.find(search_name, pkg)

if disabled:
return Disabled(disabled[0])
return None

# Called by DocsRuntimeContext.doc
def resolve_doc(
self,
Expand Down Expand Up @@ -1072,6 +1142,7 @@ def __reduce_ex__(self, protocol):
self._doc_lookup,
self._source_lookup,
self._ref_lookup,
self._metric_lookup,
self._disabled_lookup,
self._analysis_lookup,
)
Expand Down
30 changes: 30 additions & 0 deletions core/dbt/contracts/graph/metrics.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@

class MetricReference(object):
def __init__(self, metric_name, package_name=None):
self.metric_name = metric_name
self.package_name = package_name


class ResolvedMetricReference(MetricReference):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was my take on a class analogous to the Relation object. I figured that this kind of object could be returned from the metric() function. It would help us avoid the need to find the metric in the graph manually like we do here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets remove .metric_value based on conversation here.

"""
Simple proxy over a ParsedMetric which delegates property
lookups to the underlying node. Also adds helper functions
for working with metrics (ie. __str__ and templating functions)
"""
def __init__(self, node):
super().__init__(node.name, node.package_name)
self.node = node

def __getattr__(self, key):
return getattr(self.node, key)

@property
def is_derived(self):
return self.node.is_derived

def qualify(self, namespace):
return f'{namespace}.{self.metric_name}'

def __str__(self):
return self.node.name
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this return the metric name? Or the sql expression? Or something else? Not 100% sure yet....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would the ResolvedMetricReference get stringified?

Copy link
Contributor Author

@drewbanin drewbanin Apr 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would happen when you interpolate the results of the metric() function, so:

select
  {{ metric('revenue') }}
from {{ metric('revenue').model }}

would evaluate to

select
  revenue
from analytics.fct_revenue

So - returning the name of the metric is kind of useless, but I am unsure if we want to return anything more sophisticated than that? I guess we could return self.node.sql instead (which feels reasonable), but this doesn't extend well to the ratio metric type which does not have a .sql value

So, uh, short answer: you probably wouldn't stringify it in practice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we shouldn't even have a str method then? With the idea that it's better to fail noisily than fail silently.


Loading