Commit
ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

# Rationale:
I have been tracking down potential issues in DataFusion for my work project, and I have found myself wanting to print out the state of the logical_plan several times. The existing debug formatting is OK, but it was missing a few key items:

1. Schema information (as in, when did columns appear / disappear in the plan)
2. A visual representation (graphviz)

# Open questions:
1. Would it be better to split the visitor into `visitor.rs` and the display code into `display.rs`? I am torn -- this is all logically part of logical_plan, but the module is getting kind of big.

# Changes:
This PR adds several formatting options for logical plans in addition to the existing indent format. Examples are included below.

To do so, it also provides a generalized "Visitor" pattern for walking logical plan nodes, as well as a general pattern for displaying logical plan nodes in multiple potential formats.

Note that it should be straightforward to get this wired up into EXPLAIN as well: https://issues.apache.org/jira/browse/ARROW-9746

## Existing Formatting
Here is what master currently allows:

```
Projection: #id
  Filter: #state Eq Utf8("CO")
    CsvScan: employee.csv projection=Some([0, 3])
```

## With Schema Information
This PR adds a dump with schema information:

```
Projection: #id [id:Int32]
  Filter: #state Eq Utf8("CO") [id:Int32, state:Utf8]
    TableScan: employee.csv projection=Some([0, 3]) [id:Int32, state:Utf8]
```

## As Graphviz
This PR adds the ability to display plans using [Graphviz](http://www.graphviz.org).

Here is an example GraphViz plan that comes out:

```
// Begin DataFusion GraphViz Plan (see https://graphviz.org)
digraph {
  subgraph cluster_1
  {
    graph[label="LogicalPlan"]
    2[label="Projection: #id"]
    3[label="Filter: #state Eq Utf8(_CO_)"]
    2 -> 3 [arrowhead=none, arrowtail=normal, dir=back]
    4[label="TableScan: employee.csv projection=Some([0, 3])"]
    3 -> 4 [arrowhead=none, arrowtail=normal, dir=back]
  }
  subgraph cluster_5
  {
    graph[label="Detailed LogicalPlan"]
    6[label="Projection: #id\nSchema: [id:Int32]"]
    7[label="Filter: #state Eq Utf8(_CO_)\nSchema: [id:Int32, state:Utf8]"]
    6 -> 7 [arrowhead=none, arrowtail=normal, dir=back]
    8[label="TableScan: employee.csv projection=Some([0, 3])\nSchema: [id:Int32, state:Utf8]"]
    7 -> 8 [arrowhead=none, arrowtail=normal, dir=back]
  }
}
// End DataFusion GraphViz Plan
```

Here is what that looks like rendered:

<img width="1679" alt="Screen Shot 2020-11-09 at 2 30 07 PM" src="https://user-images.githubusercontent.com/490673/98606322-0f891880-22b5-11eb-8e1c-669ce85f0f52.png">

Closes apache#8619 from alamb/alamb/improved-display

Authored-by: alamb <andrew@nerdnetworks.org>
Signed-off-by: alamb <andrew@nerdnetworks.org>
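
For readers unfamiliar with the approach, the "Visitor" idea described under Changes can be sketched roughly as below. This is a simplified illustration only, not the code in this commit: `Node`, `Visitor`, `IndentVisitor`, `GraphvizVisitor`, `display_indent`, `display_graphviz`, and `example_plan` are all hypothetical names invented for the sketch, and the plan node is reduced to a label, a schema string, and a list of inputs. It shows how one tree walk can drive several output formats.

```rust
/// Hypothetical, simplified stand-in for a logical plan node
/// (an assumption for illustration, not DataFusion's `LogicalPlan`).
struct Node {
    label: String,
    schema: String,
    inputs: Vec<Node>,
}

/// Hooks called before and after a node's inputs are walked.
trait Visitor {
    fn pre_visit(&mut self, node: &Node);
    fn post_visit(&mut self, node: &Node);
}

impl Node {
    fn new(label: &str, schema: &str, inputs: Vec<Node>) -> Self {
        Node { label: label.to_string(), schema: schema.to_string(), inputs }
    }

    /// Depth-first walk: `pre_visit`, then every input, then `post_visit`.
    fn accept<V: Visitor>(&self, visitor: &mut V) {
        visitor.pre_visit(self);
        for input in &self.inputs {
            input.accept(visitor);
        }
        visitor.post_visit(self);
    }
}

/// Writes one line per node, indented two spaces per level.
struct IndentVisitor {
    with_schema: bool,
    depth: usize,
    out: String,
}

impl Visitor for IndentVisitor {
    fn pre_visit(&mut self, node: &Node) {
        if !self.out.is_empty() {
            self.out.push('\n');
        }
        self.out.push_str(&"  ".repeat(self.depth));
        self.out.push_str(&node.label);
        if self.with_schema {
            self.out.push_str(&format!(" [{}]", node.schema));
        }
        self.depth += 1;
    }

    fn post_visit(&mut self, _node: &Node) {
        self.depth -= 1;
    }
}

/// Writes one Graphviz node per plan node and one edge per parent/child pair.
struct GraphvizVisitor {
    next_id: usize,
    parents: Vec<usize>, // ids of the nodes currently being walked
    out: String,
}

impl Visitor for GraphvizVisitor {
    fn pre_visit(&mut self, node: &Node) {
        let id = self.next_id;
        self.next_id += 1;
        // Double quotes would terminate the label, so swap them for underscores.
        let label = node.label.replace('"', "_");
        self.out.push_str(&format!("  {}[label=\"{}\"]\n", id, label));
        if let Some(parent) = self.parents.last() {
            self.out.push_str(&format!(
                "  {} -> {} [arrowhead=none, arrowtail=normal, dir=back]\n",
                parent, id
            ));
        }
        self.parents.push(id);
    }

    fn post_visit(&mut self, _node: &Node) {
        self.parents.pop();
    }
}

fn display_indent(plan: &Node, with_schema: bool) -> String {
    let mut visitor = IndentVisitor { with_schema, depth: 0, out: String::new() };
    plan.accept(&mut visitor);
    visitor.out
}

fn display_graphviz(plan: &Node) -> String {
    let mut visitor = GraphvizVisitor { next_id: 1, parents: Vec::new(), out: String::new() };
    plan.accept(&mut visitor);
    format!("digraph {{\n{}}}\n", visitor.out)
}

/// Builds the three-node example plan used in this message.
fn example_plan() -> Node {
    let scan = Node::new(
        "TableScan: employee.csv projection=Some([0, 3])",
        "id:Int32, state:Utf8",
        vec![],
    );
    let filter = Node::new("Filter: #state Eq Utf8(\"CO\")", "id:Int32, state:Utf8", vec![scan]);
    Node::new("Projection: #id", "id:Int32", vec![filter])
}
```

In this sketch, `display_indent(&example_plan(), true)` produces an indented dump of the same shape as the schema example, and `display_graphviz(&example_plan())` produces a single `digraph` with one edge per parent/child pair. Keeping the walk in `accept` and the formatting in the visitors is what lets a new format be added without touching the traversal.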