ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

# Rationale:
I have been tracking down potential issues in DataFusion for my work project, and I have found myself wanting to print out the state of the logical_plan several times. The existing debug formatting is OK, but it is missing a few key items:

1. Schema information (as in when did columns appear / disappear in the plan)
2. A visual representation (graphviz)

# Open questions:
1. Would it be better to split the visitor into `visitor.rs` and display code into `display.rs`? I am torn -- this is all logically part of logical_plan, but the module is getting kind of big.

# Changes:

This PR adds several formatting options for logical plans in addition to the existing indented format. Examples are included below.

To do so it also provides a generalized "Visitor" pattern for walking logical plan nodes, as well as a general pattern to display logical plan nodes with multiple potential formats.
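
As a rough illustration of the visitor idea, here is a minimal sketch on a toy plan type. The names `Plan`, `Visitor`, `IndentVisitor`, `render_indent`, and `sample_plan` are assumptions made up for this example and are not the types or signatures added by this PR; the sketch only shows how a pre/post visit pair can drive indented output.

```rust
// Toy stand-in for a logical plan (hypothetical, not DataFusion's `LogicalPlan`).
enum Plan {
    TableScan { source: String },
    Filter { predicate: String, input: Box<Plan> },
    Projection { columns: Vec<String>, input: Box<Plan> },
}

// Hypothetical visitor trait: `pre_visit` runs before a node's input is
// walked, `post_visit` after. Returning `false` stops the walk early.
trait Visitor {
    fn pre_visit(&mut self, plan: &Plan) -> bool;
    fn post_visit(&mut self, _plan: &Plan) -> bool {
        true
    }
}

impl Plan {
    // Depth-first walk; returns `false` if the visitor asked to stop.
    fn accept<V: Visitor>(&self, visitor: &mut V) -> bool {
        if !visitor.pre_visit(self) {
            return false;
        }
        let input_ok = match self {
            Plan::TableScan { .. } => true,
            Plan::Filter { input, .. } | Plan::Projection { input, .. } => {
                input.accept(visitor)
            }
        };
        input_ok && visitor.post_visit(self)
    }

    // One-line description of this node only (inputs are not included).
    fn label(&self) -> String {
        match self {
            Plan::TableScan { source } => format!("TableScan: {}", source),
            Plan::Filter { predicate, .. } => format!("Filter: {}", predicate),
            Plan::Projection { columns, .. } => {
                format!("Projection: {}", columns.join(", "))
            }
        }
    }
}

// Visitor that renders one node per line, indented by depth.
struct IndentVisitor {
    depth: usize,
    out: String,
}

impl Visitor for IndentVisitor {
    fn pre_visit(&mut self, plan: &Plan) -> bool {
        self.out.push_str(&"  ".repeat(self.depth));
        self.out.push_str(&plan.label());
        self.out.push('\n');
        self.depth += 1;
        true
    }

    fn post_visit(&mut self, _plan: &Plan) -> bool {
        self.depth -= 1;
        true
    }
}

fn render_indent(plan: &Plan) -> String {
    let mut visitor = IndentVisitor { depth: 0, out: String::new() };
    plan.accept(&mut visitor);
    visitor.out
}

// Sample plan mirroring the example used in this description.
fn sample_plan() -> Plan {
    Plan::Projection {
        columns: vec!["#id".to_string()],
        input: Box::new(Plan::Filter {
            predicate: "#state Eq Utf8(\"CO\")".to_string(),
            input: Box::new(Plan::TableScan {
                source: "employee.csv".to_string(),
            }),
        }),
    }
}
```

Calling `render_indent(&sample_plan())` in this sketch produces one line per node, each indented two spaces deeper than its parent. Keeping the traversal in `accept` and the formatting in the visitor means a new output format only needs a new visitor.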

Note that it should be straightforward to get this wired up into EXPLAIN as well: https://issues.apache.org/jira/browse/ARROW-9746

## Existing Formatting
Here is what master currently allows:

```
Projection: #id
  Filter: #state Eq Utf8("CO")
    CsvScan: employee.csv projection=Some([0, 3])
```

## With Schema Information
This PR adds a dump with schema information:

```
Projection: #id [id:Int32]
  Filter: #state Eq Utf8("CO") [id:Int32, state:Utf8]
    TableScan: employee.csv projection=Some([0, 3]) [id:Int32, state:Utf8]
```
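
To make the "display with multiple formats" idea concrete, here is a small sketch of a wrapper type that implements `std::fmt::Display` and appends each node's output schema. Everything in it (`Field`, `Plan`, `IndentSchema`, `display_indent_schema`, `sample_plan`) is a hypothetical toy written for this description, not the code in this PR, and the schema rule for `Projection` (keep only the named columns) is an assumption of the sketch.

```rust
use std::fmt;

// Toy column description (hypothetical).
struct Field {
    name: &'static str,
    data_type: &'static str,
}

// Toy single-input plan (hypothetical, not DataFusion's `LogicalPlan`).
enum Plan {
    TableScan { source: &'static str, schema: Vec<Field> },
    Filter { predicate: &'static str, input: Box<Plan> },
    Projection { columns: Vec<&'static str>, input: Box<Plan> },
}

impl Plan {
    // Output schema of this node: scans define it, filters pass it
    // through, projections keep only the named columns.
    fn schema(&self) -> Vec<&Field> {
        match self {
            Plan::TableScan { schema, .. } => schema.iter().collect(),
            Plan::Filter { input, .. } => input.schema(),
            Plan::Projection { columns, input } => input
                .schema()
                .into_iter()
                .filter(|f| columns.contains(&f.name))
                .collect(),
        }
    }

    // Returns a value that formats the plan lazily when displayed.
    fn display_indent_schema(&self) -> IndentSchema<'_> {
        IndentSchema(self)
    }
}

// Wrapper selecting the "indent plus schema" format.
struct IndentSchema<'a>(&'a Plan);

impl fmt::Display for IndentSchema<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut node = Some(self.0);
        let mut depth = 0;
        while let Some(plan) = node {
            let schema: Vec<String> = plan
                .schema()
                .iter()
                .map(|field| format!("{}:{}", field.name, field.data_type))
                .collect();
            let (label, input) = match plan {
                Plan::TableScan { source, .. } => {
                    (format!("TableScan: {}", source), None)
                }
                Plan::Filter { predicate, input } => {
                    (format!("Filter: {}", predicate), Some(&**input))
                }
                Plan::Projection { columns, input } => {
                    let cols: Vec<String> =
                        columns.iter().map(|c| format!("#{}", c)).collect();
                    (format!("Projection: {}", cols.join(", ")), Some(&**input))
                }
            };
            writeln!(f, "{}{} [{}]", "  ".repeat(depth), label, schema.join(", "))?;
            node = input;
            depth += 1;
        }
        Ok(())
    }
}

// Sample plan mirroring the example used in this description.
fn sample_plan() -> Plan {
    Plan::Projection {
        columns: vec!["id"],
        input: Box::new(Plan::Filter {
            predicate: "#state Eq Utf8(\"CO\")",
            input: Box::new(Plan::TableScan {
                source: "employee.csv",
                schema: vec![
                    Field { name: "id", data_type: "Int32" },
                    Field { name: "state", data_type: "Utf8" },
                ],
            }),
        }),
    }
}
```

With this shape, `println!("{}", sample_plan().display_indent_schema())` prints the plan without building an intermediate string, and other formats can be added as further wrapper types.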

## As Graphviz

This PR adds the ability to display plans using [Graphviz](http://www.graphviz.org).

Here is an example GraphViz plan that comes out:
```
// Begin DataFusion GraphViz Plan (see https://graphviz.org)
digraph {
  subgraph cluster_1
  {
    graph[label="LogicalPlan"]
    2[label="Projection: #id"]
    3[label="Filter: #state Eq Utf8(_CO_)"]
    2 -> 3 [arrowhead=none, arrowtail=normal, dir=back]
    4[label="TableScan: employee.csv projection=Some([0, 3])"]
    3 -> 4 [arrowhead=none, arrowtail=normal, dir=back]
  }
  subgraph cluster_5
  {
    graph[label="Detailed LogicalPlan"]
    6[label="Projection: #id\nSchema: [id:Int32]"]
    7[label="Filter: #state Eq Utf8(_CO_)\nSchema: [id:Int32, state:Utf8]"]
    6 -> 7 [arrowhead=none, arrowtail=normal, dir=back]
    8[label="TableScan: employee.csv projection=Some([0, 3])\nSchema: [id:Int32, state:Utf8]"]
    7 -> 8 [arrowhead=none, arrowtail=normal, dir=back]
  }
}
// End DataFusion GraphViz Plan
```

Here is what that looks like rendered:
<img width="1679" alt="Screen Shot 2020-11-09 at 2 30 07 PM" src="https://user-images.githubusercontent.com/490673/98606322-0f891880-22b5-11eb-8e1c-669ce85f0f52.png">
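
For readers curious how such DOT output can be produced, here is a minimal sketch that numbers the nodes of a toy tree and emits one edge per parent/child pair, using the same edge attributes as the sample above. The `Node` type and the functions `write_node`, `to_graphviz`, and `sample_plan` are hypothetical names for this example, not the implementation in this PR. Replacing `"` with `_` in labels is an assumption inferred from the `Utf8(_CO_)` labels in the sample.

```rust
// Toy tree node (hypothetical): a label plus any number of inputs.
struct Node {
    label: String,
    children: Vec<Node>,
}

// Emits this node, the edge from its parent (if any), then its children.
// `next_id` hands out a fresh numeric id for every node visited.
fn write_node(node: &Node, parent: Option<usize>, next_id: &mut usize, out: &mut String) {
    let id = *next_id;
    *next_id += 1;
    // Assumption: quotes are replaced so they cannot end the DOT label early.
    let label = node.label.replace('"', "_");
    out.push_str(&format!("  {}[label=\"{}\"]\n", id, label));
    if let Some(parent_id) = parent {
        out.push_str(&format!(
            "  {} -> {} [arrowhead=none, arrowtail=normal, dir=back]\n",
            parent_id, id
        ));
    }
    for child in &node.children {
        write_node(child, Some(id), next_id, out);
    }
}

// Wraps the node and edge statements in a single `digraph` block.
fn to_graphviz(root: &Node) -> String {
    let mut out = String::from("digraph {\n");
    let mut next_id = 1;
    write_node(root, None, &mut next_id, &mut out);
    out.push_str("}\n");
    out
}

// Sample plan mirroring the example used in this description.
fn sample_plan() -> Node {
    Node {
        label: "Projection: #id".to_string(),
        children: vec![Node {
            label: "Filter: #state Eq Utf8(\"CO\")".to_string(),
            children: vec![Node {
                label: "TableScan: employee.csv projection=Some([0, 3])".to_string(),
                children: vec![],
            }],
        }],
    }
}
```

The string returned by `to_graphviz(&sample_plan())` can be written to a `.dot` file and rendered with the `dot` tool. Unlike the output of this PR, the sketch emits a single graph without the `subgraph cluster_*` sections.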

Closes apache#8619 from alamb/alamb/improved-display

Authored-by: alamb <andrew@nerdnetworks.org>
Signed-off-by: alamb <andrew@nerdnetworks.org>
alamb authored and yordan-pavlov committed Nov 14, 2020
1 parent 1605cfb commit 8277416
Showing 2 changed files with 903 additions and 94 deletions.
21 changes: 21 additions & 0 deletions rust/datafusion/README.md
@@ -171,3 +171,24 @@ Below is a checklist of what you need to do to add a new aggregate function to D
* a new line in `create_aggregate_expr` mapping the built-in to the implementation
* tests to the function.
* In [tests/sql.rs](tests/sql.rs), add a new test where the function is called through SQL against well known data and returns the expected result.

## How to display plans graphically

The query plans represented by `LogicalPlan` nodes can be graphically
rendered using [Graphviz](http://www.graphviz.org/).

To do so, save the output of the `display_graphviz` function to a file:

```rust
use std::fs::File;
use std::io::Write; // brings `write!` support for `File` into scope

// Create plan somehow...
let mut output = File::create("/tmp/plan.dot")?;
write!(output, "{}", plan.display_graphviz())?;
```

Then, use the `dot` command line tool to render it into a file that
can be displayed. For example, the following command creates a
`/tmp/plan.pdf` file:

```bash
dot -Tpdf < /tmp/plan.dot > /tmp/plan.pdf
```