Skip to content

Quickstart Guide

devkotasabin edited this page May 28, 2021 · 9 revisions

Welcome to the quick start guide for CFGConf to get you up and running. CFGConf is a JSON-based language for describing graph drawing requirements for Control Flow Graphs (CFGs). Based on the specified JSON, CFGConf produces graph visualization that matches your mental model and suited to the task at hand.

This guide introduces common ways CFGConf can be used. To learn how to run these files, refer to the getting started guide. For more details on the JSON language, refer to the reference guide.

Table of Contents

In this quickstart tutorial, we start with a minimal example and progressively showcase more features of CFGConf.

1. Minimal example containing only graph data

This is an example of a CFGConf JSON file that contains only the data related to the Control Flow Graph (CFG) without any rendering options. The data includes the nodes, edges, and loops for a graph that contains nested loops. The rendering options use default values which we describe below (see reference guide for detailed rendering options).

{
	"nodes": [
		{"id": "A"},
		{"id": "B"},
		{"id": "C"},
		{"id": "D"},
		{"id": "E"},
		{"id": "F"},
		{"id": "G"}
	],

	"edges": [
		{
			"source": "A",
			"target": "B"
		},
		{
			"source": "B",
			"target": "C"
		},
		{
			"source": "B",
			"target": "G"
		},
		{
			"source": "C",
			"target": "D"
		},
		{
			"source": "D",
			"target": "E"
		},
		{
			"source": "D",
			"target": "F"
		},
		{
			"source": "E",
			"target": "D"
		},
		{
			"source": "F",
			"target": "B"
		}
	],

	"loops": [
		{
			"backedges": [["F", "B"]], 
			"id": "loop_1",
                        "nodes": ["F", "D", "C", "E", "B"],
			"loops": [
				{
					"backedges": [["E", "D"]],
					"id": "loop_2",
		 			"nodes": ["E", "D"]
				} 
			]
		}
	]
}

This example contains only the required attributes inside the nodes, edges, and loops keyword to make it a valid JSON file in the CFGConf schema. Based on the default rendering options, the loops are drawn using the domain-specific algorithm from CFGExplorer. The loop background is drawn in orange color. The inner loop has a darker orange background. The back edges are larger in width and drawn in magenta color.

Here is the resulting visualization.

result

The nodes key contains an array of nodes. Each node is described as an object and has the property id. The graph in the above example has seven nodes "A", "B", "C", "D", "E", "F", "G".

The edges key contains an array of edges. Each edge is described as an object which has two properties source and target. Together, these two properties describe a directed edge that originates from the source node and ends at the target node. The keys contain the ids of the respective nodes.

The loops key contains an array of loops. The inner loops are always fully contained by the outer loop. The loop is described by an object with the properties id, nodes, and backedges respectively.

  • id contains the identifier for the loop. The id should be unique across all nodes, loops, and functions.
  • nodes contains an array of node ids that belong to the loop.
  • backedges contains an array of back edges associated with the nodes inside the loop. Each back edge is specified as a two-element array [source node id, target node id]. Back edges are edges within the loop with a target of the top or head of the loop. See Reference Guide: Back Edges for more information.

Note on Nested Loops: A loop object can also contain the loops property which contains all the loops nested inside the current loop. See the Reference Guide: Loops for more information on specifying nested loops.

2. Adding functions to the graph

This example shows you how to add functions to the graph. The functions key specifies array of functions in the graph. Each function is represented by an object with the properties:

  • id -- an identifier unique among all nodes, functions, and loops. Often this is the function name
  • nodes -- an array of node ids belonging to the function
{
	"nodes": [
		{"id": "A"},
		{"id": "B"},
		{"id": "C"},
		{"id": "D"},
		{"id": "E"},
		{"id": "F"},
		{"id": "G"}
	],

	"edges": [
		{
			"source": "A",
			"target": "B"
		},
		{
			"source": "A",
			"target": "C"
		},
		{
			"source": "B",
			"target": "G"
		},
		{
			"source": "C",
			"target": "D"
		},
		{
			"source": "C",
			"target": "E"
		},
		{
			"source": "D",
			"target": "F"
		},
		{
			"source": "E",
			"target": "F"
		},
		{
			"source": "F",
			"target": "G"
		}
	],

	"functions": [
		{
            "id": "main",
            "nodes": ["A", "B", "C", "D", "E", "F", "G"]
        }
	]
}

As with the previous example, we have not specified any rendering options so the default is used: nodes in a function are grouped together and enclosed in a blue rectangular boundary with the id used as the group label.

Here is the resulting drawing.

result

3. Loading data from dot file and structure file

This example shows how to load graph data from external files, such as a dot file, instead of entering the nodes and edges directly. This example has the same graph as the previous example, but using a dot format file and a separate file for loops and functions.

Three external files are supported in the data object:

  • graphFile - a dot containing nodes and edges. It may also contain rendering options for the graph
  • structureFile - a JSON file containing program structures (i.e., functions, loops) in the same format as above
  • analysisFile - a JSON file containing program structures (i.e., functions, loops) as generated by Dyninst

NOTE: Files listed inside the data section must be in the same directory as the CFGConf JSON file.

{
	"data": {
		"graphFile":"nestedif.dot",
		"structureFile":"nestedif_fns.json"
	}
}

Below is the dot file used for the graph.

digraph {
	A -> B;
	A -> C;
	B -> G;
	C -> D;
	C -> E;
	D -> F;
	E -> F;
	F -> G;
}

Below is the structure file containing the functions in the program.

{
	"functions": [
		{
            "id": "main",
            "nodes": ["A", "B", "C", "D", "E", "F", "G"]
        }
	]
}

This specification and set of files produces the same drawing as the drawing in the previous example.

4. Setting common rendering options

The example shows common rendering options. The rendering keyword inside the CFGConf JSON file includes styling of the nodes and edges, domain-specific options for loops, and visual hints for portions of the graph not drawn. This example focuses on rendering style like colors and shapes.

The data files used in this example are available at singlefile.dot and singlefile.structures.json.

{
	"data": {
		"graphFile":"singlefile.dot",
		"structureFile":"singlefile.structures.json"
	},

	"rendering": {
		"node": {
			"shape": "diamond",
			"style": "filled",
			"fillcolor": "#78d838",
			"fontcolor": "white",
			"label": "id"
		},
		"edge": {
			"style": "solid",
			"color": "black"
		},
		"loop": {
			"background": true
		},
		"function": {
			"boundary": true
		}
	}
}

The node, edge, loop, and function keys in the `rendering object set global style options for their respective structures.

Any style can be overridden for a specific node, edge, etc. in the individual specifications in the data file. After the CFGConf JSON file is processed, the program generates a final graph in dot format. Hence, CFGConf supports all the rendering options supported by dot graphviz. In addition, it supports additional options related to visualizing CFGs.

Common style properties used here include:

  • shape, which sets the node shape
  • style, which includes the fill and line styles
  • fillcolor, which sets the fill color
  • fontcolor, which sets the text color
  • label, which sets whether to use id or full labels from the node definitions

This example also:

  • sets the loop's background property to true, the default, which colors the spanning area behind all node in a loop in orange. Nesting is shown by darker orange colors.
  • sets the function's boundary property to true, which draws an enclosing rectangle around the function and labels it. This will also affect the layout, forcing the function nodes closer together.

Here is the resulting visualization:

Here is the result

5. Filtering large graphs

The example shows the use of filtering options. Filtering is beneficial for focusing on a subgraph of interest, such as a particular loop nest, function, or selected nodes and their neighborhood.

Filtering is also important for large graphs (i.e. graphs with more than a thousand nodes) as the graph layout can take a long time to solve or possibly even fail in these scenarios. Thus, we recommend filtering to a manageable size.

{
	"data": {
		"graphFile":"singlefile.dot",
		"structureFile":"singlefile.structures.json"
	},

	"rendering": {
		"node": {
			"label": "id"
		}
	},

	"filtering": {
		"isHopFilterOn":true, 
		"selectedNodes": ["B44"],
		"maxHops":4, 
		"minNodes":10, 
		"isLoopFilterOn": true
	}
}

The filtering keyword contains all filtering options. Filtering is mainly done by distance, or hops as described below. Thus, the isHopFilterOn key determines whether or not filtering is on.

The main logic of filtering is:

  1. Select nodes of interest/focus with the selectedNodes array. These comprise the initial subgraph.
  2. If isLoopFilterOn is true, the nodes of the loops to which the selected nodes belong are added.
  3. The subgraph is expanded to include any nodes reachable within maxHops of the nodes from the first two steps.

The minNodes and maxNodes keys can further tweak the size of the filtered graph, expanding or limiting the selection respectively when maxHops retrieves to few or too many nodes.s

The resulting visualization for the above is shown below. Note this is the same graph as in the previous rendering options example, but filtered around node "B44", which is thus highlighted with a teal border.

result

Boundary Nodes (ellipses)

We use the term boundary nodes to refer to the nodes which are just outside the filtered graph, e.g., connected to the nodes in the filtered graph but not inside. These nodes are shown as small ellipses for context. Multiple boundary nodes attached to a filtered node are depicted as stacked ellipses.

6. Collapsing functions

This example demonstrates the function collapsing feature. This feature heuristically alters the graph appearance, removing clutter induced by frequently called functions (e.g., utility functions) that may not be of interest to the analysis. They drawing is altered as follows:

  • Groups of nodes belonging to identified functions are collapsed into a single node.
  • Collapsed nodes are duplicated to remove tangles of edges.
  • Collapsed nodes are drawn in a de-emphasized style.

The collapsingRules key turns this feature on and allows adjusting the heuristics for collapsing functions.

{
	"data": {
		"graphFile":"singlefile.dot",
		"structureFile":"singlefile.structures.json"
	},

	"rendering": {
		"node": {
			"label": "id"
		},
		"function": {
			"collapsingRules": {
				"isCollapsingEnabled": true,
				"minIncomingEdges": 4,
				"minOutgoingEdges": 0,
				"alwaysCollapseList": [],
				"neverCollapseList": [],
				"maxCollapseSize": "25p" 
			}
		}
	}
}

Collapsing rules apply to the portion of the graph to be drawn, i.e., the subgraph determined by the filtering rules. Functions in this subgraph containing loops will not be collapsed.

The following options determine which functions to collapse:

  • minIncomingEdges - functions with at least this many incoming (call) edges will be collapsed.
  • minOutgoingEdges - functions with at least this many outgoing (return) edges will be collapsed.
  • maxCollapseSize - functions of this size will not be collapsed. This can be specified as a number of nodes or a percentage of the drawn subgraph by appending a p, e.g. a value of "5p" equals 5 percent of nodes in the visible graph.
  • alwaysCollapseList - functions that should always be collapsed regardless of the other rules
  • neverCollapseList - functions that should never be collapsed regardless of the other rules

Here is the result of the above specification.

Here is the result

To inspect the figure, right click on the image and select Open Image in New Tab. Compared to the figure in the quick start example setting common rendering options, the functions puts and printf are collapsed.

The collapsed nodes are drawn with dotted boundaries. The label for the node is the id of the function that is collapsed. Hovering over the collapsed node in the web app will result in the number of nodes belonging to this function being displayed.

Clone this wiki locally