Improved features and interoperability for SQLMetrics #679

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
Use cases:
1. Connecting DataFusion metrics to existing state-of-the-art metrics collection systems (Prometheus, InfluxDB, OpenTelemetry)
2. Gaining access to real-time values of metrics (not just snapshots), for visualization as well as real-time plan adjustments

When running plans with multiple operators of the same type (e.g. multiple `HashJoinExec`), there is currently no way to programmatically access the statistics of each individual hash join. For example, in https://github.com/apache/arrow-datafusion/pull/662 the interface returns metrics keyed by a single string such as `inputRows`. That is fine for printing metrics per operator, but it does not work well for use cases that span the whole plan, because metrics from different operators cannot be told apart.

It is also currently very awkward to create metrics with finer granularity (such as per partition of the hash join, or Parquet metrics per file). The current `string = metric` interface means you have to build a compound string key (such as "metrics for file foo") and then parse that key again later (as I did in https://github.com/apache/arrow-datafusion/pull/657).

Other metric systems, such as Prometheus, OpenTelemetry, and InfluxDB, address these problems by allowing name=value pairs on each metric. Without an equivalent in DataFusion, exporting its metrics to these systems will be a challenge.

The other systems allow you to get access to things like:

```
operator=ParquetExec,filename="my_filename",partition_number=0 rows_scanned=100
operator=ParquetExec,filename="my_other_filename",partition_number=0 rows_scanned=200
```

or for hash join
```
operator=HashJoin,partition_number=0 rows_scanned=100
operator=HashJoin,partition_number=1 rows_scanned=200
```
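As a rough illustration of the name=value model, here is a minimal sketch of a metric sample that carries its labels as structured data and renders them in the line format shown above. The `Sample` and `Label` types and their methods are hypothetical names invented for this sketch; they are not part of DataFusion. Unlike the `filename` examples above, this sketch does not quote label values.

```rust
use std::fmt;

/// Hypothetical name=value pair attached to a metric (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
pub struct Label {
    pub name: String,
    pub value: String,
}

/// Hypothetical metric sample: a list of labels plus one named measurement.
#[derive(Debug, Clone)]
pub struct Sample {
    pub labels: Vec<Label>,
    pub field: String,
    pub value: u64,
}

impl Sample {
    pub fn new(field: &str, value: u64) -> Self {
        Self { labels: Vec::new(), field: field.to_string(), value }
    }

    /// Attach one label; labels keep their insertion order.
    pub fn label(mut self, name: &str, value: impl ToString) -> Self {
        self.labels.push(Label { name: name.to_string(), value: value.to_string() });
        self
    }

    /// Look up a label by name without parsing any compound string key.
    pub fn get(&self, name: &str) -> Option<&str> {
        self.labels.iter().find(|l| l.name == name).map(|l| l.value.as_str())
    }
}

impl fmt::Display for Sample {
    /// Renders `label=value,label=value field=value`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let labels: Vec<String> = self
            .labels
            .iter()
            .map(|l| format!("{}={}", l.name, l.value))
            .collect();
        write!(f, "{} {}={}", labels.join(","), self.field, self.value)
    }
}
```

Because the labels stay structured, a consumer can filter on `operator` or `partition_number` directly, and an exporter can map them onto whatever label or tag concept the target system uses.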

Another challenge with the current metrics interface is that despite using `Arc` and atomic counters internally, the only external interface is to get a snapshot of the metrics. If it were to return the `Arc`s themselves, we could implement interactive visualizations showing how the metrics evolved over time.
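To make the difference between a snapshot and a shared handle concrete, here is a minimal sketch using only the standard library. `LiveCounter` and `run_simulated_operator` are hypothetical names for this illustration, and the "operator" is simulated with plain threads; this is not how DataFusion executes plans.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical counter handle. Cloning it shares the same underlying value,
/// so an observer sees updates made through any other clone.
#[derive(Debug, Clone, Default)]
pub struct LiveCounter {
    value: Arc<AtomicU64>,
}

impl LiveCounter {
    /// Add `n` to the shared counter.
    pub fn add(&self, n: u64) {
        self.value.fetch_add(n, Ordering::Relaxed);
    }

    /// Read the current value; may be called while writers are still running.
    pub fn value(&self) -> u64 {
        self.value.load(Ordering::Relaxed)
    }
}

/// Simulates an operator that counts rows on one thread per partition,
/// then waits for all partitions to finish.
pub fn run_simulated_operator(counter: &LiveCounter, partitions: u64, rows_per_partition: u64) {
    let handles: Vec<_> = (0..partitions)
        .map(|_| {
            let c = counter.clone();
            thread::spawn(move || {
                for _ in 0..rows_per_partition {
                    c.add(1);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
}
```

A visualization would hold its own clone of the counter and poll `value()` periodically, whereas a snapshot-only interface hands back a copy that never changes after it is taken.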

**Describe the solution you'd like**

Thus, I propose changing how metrics are created so that they carry name/value pairs:

```rust
let metrics = SQLMetric::counter("numRows")
  .with("partition", 1)
  .with("filename", "my_file");

let sub_metric = SQLMetric::counter("otherMetric")
  // inherit all name/value pairs on `metrics`
  .with_family(metrics)
  .with("new_detail", "awesome");
```
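One way such a builder could behave is sketched below. `MetricSketch` is a hypothetical stand-in, not the real `SQLMetric`, and two details are assumptions of this sketch rather than part of the proposal: `with_family` borrows the parent instead of consuming it, and inherited pairs are placed before the metric's own pairs.

```rust
/// Hypothetical stand-in for the proposed builder; not the real `SQLMetric`.
#[derive(Debug, Clone)]
pub struct MetricSketch {
    name: String,
    labels: Vec<(String, String)>,
}

impl MetricSketch {
    /// Start a counter metric with no name/value pairs.
    pub fn counter(name: &str) -> Self {
        Self { name: name.to_string(), labels: Vec::new() }
    }

    /// Attach one name/value pair; any `ToString` value is accepted.
    pub fn with(mut self, key: &str, value: impl ToString) -> Self {
        self.labels.push((key.to_string(), value.to_string()));
        self
    }

    /// Inherit all name/value pairs of `parent`.
    /// Assumption: inherited pairs come first, followed by this metric's own.
    pub fn with_family(mut self, parent: &MetricSketch) -> Self {
        let mut labels = parent.labels.clone();
        labels.append(&mut self.labels);
        self.labels = labels;
        self
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn labels(&self) -> &[(String, String)] {
        &self.labels
    }
}
```

With this sketch, `MetricSketch::counter("otherMetric").with_family(&metrics).with("new_detail", "awesome")` ends up with the `partition` and `filename` pairs from `metrics` plus its own `new_detail` pair, and `metrics` itself is left unchanged.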

The collection interface should then return a list of these metrics rather than a `HashMap` keyed by the metric name. For example, rather than


```rust
pub trait ExecutionPlan {
...

    /// Return a snapshot of the metrics collected during execution
    fn metrics(&self) -> HashMap<String, SQLMetric> {
        HashMap::new()
    }
```

Something like

```rust
pub trait ExecutionPlan {
...

    /// Return the metrics for this execution
    fn metrics(&self) -> Vec<Arc<SQLMetric>> {
        Vec::new()
    }
```
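To show what a list-of-`Arc` interface would enable across a whole plan, here is a minimal, self-contained sketch. Everything in it is hypothetical: `Metric`, `PlanNode`, `Node`, `collect_metrics`, and `sum_where` are illustrative names, and `PlanNode` is a toy trait standing in for `ExecutionPlan`, not the real one.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical labeled metric; not the actual `SQLMetric` type.
#[derive(Debug)]
pub struct Metric {
    pub name: String,
    pub labels: Vec<(String, String)>,
    pub value: AtomicU64,
}

impl Metric {
    pub fn new(name: &str, labels: &[(&str, &str)], value: u64) -> Arc<Self> {
        Arc::new(Self {
            name: name.to_string(),
            labels: labels.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
            value: AtomicU64::new(value),
        })
    }

    fn has_label(&self, key: &str, value: &str) -> bool {
        self.labels.iter().any(|(k, v)| k == key && v == value)
    }
}

/// Toy stand-in for `ExecutionPlan`, reduced to what this sketch needs.
pub trait PlanNode {
    /// Shared handles to this node's metrics (not a snapshot).
    fn metrics(&self) -> Vec<Arc<Metric>> {
        Vec::new()
    }
    fn children(&self) -> Vec<&dyn PlanNode> {
        Vec::new()
    }
}

/// Simple concrete node used to build a plan tree for the example.
pub struct Node {
    pub metrics: Vec<Arc<Metric>>,
    pub children: Vec<Node>,
}

impl PlanNode for Node {
    fn metrics(&self) -> Vec<Arc<Metric>> {
        self.metrics.clone()
    }
    fn children(&self) -> Vec<&dyn PlanNode> {
        self.children.iter().map(|c| c as &dyn PlanNode).collect()
    }
}

/// Walk the plan depth-first and gather every node's metric handles.
pub fn collect_metrics(node: &dyn PlanNode) -> Vec<Arc<Metric>> {
    let mut out = node.metrics();
    for child in node.children() {
        out.extend(collect_metrics(child));
    }
    out
}

/// Sum the current values of all metrics called `name` that carry `key=value`.
pub fn sum_where(metrics: &[Arc<Metric>], name: &str, key: &str, value: &str) -> u64 {
    metrics
        .iter()
        .filter(|m| m.name == name && m.has_label(key, value))
        .map(|m| m.value.load(Ordering::Relaxed))
        .sum()
}
```

Because the collected handles are shared, calling `sum_where` again later reflects any updates the operators have made in the meantime, and two `HashJoin` nodes in the same plan stay distinguishable through their labels.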

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.

