-
Notifications
You must be signed in to change notification settings - Fork 2
Quickstart Guide
Welcome to the quick start guide for CFGConf to get you up and running. CFGConf is a JSON-based language for describing graph drawing requirements for Control Flow Graphs (CFGs). Based on the specified JSON, CFGConf produces graph visualization that matches your mental model and suited to the task at hand.
This guide introduces common ways CFGConf can be used. To learn how to run these files, refer to the getting started guide. For more details on the JSON language, refer to the reference guide.
Table of Contents
- 1. Minimal example containing only graph data
- 2. Adding functions to the graph
- 3. Loading data from dot file and structure file
- 4. Setting common rendering options
- 5. Filtering large graphs
- 6. Collapsing functions
In this quickstart tutorial, we start with a minimal example and progressively showcase more features of CFGConf.
This is an example of a CFGConf JSON file that contains only the data related to the Control Flow Graph (CFG) without any rendering options. The data includes the nodes
, edges
, and loops
for a graph that contains nested loops. The rendering options use default values which we describe below (see reference guide for detailed rendering options).
{
"nodes": [
{"id": "A"},
{"id": "B"},
{"id": "C"},
{"id": "D"},
{"id": "E"},
{"id": "F"},
{"id": "G"}
],
"edges": [
{
"source": "A",
"target": "B"
},
{
"source": "B",
"target": "C"
},
{
"source": "B",
"target": "G"
},
{
"source": "C",
"target": "D"
},
{
"source": "D",
"target": "E"
},
{
"source": "D",
"target": "F"
},
{
"source": "E",
"target": "D"
},
{
"source": "F",
"target": "B"
}
],
"loops": [
{
"backedges": [["F", "B"]],
"id": "loop_1",
"nodes": ["F", "D", "C", "E", "B"],
"loops": [
{
"backedges": [["E", "D"]],
"id": "loop_2",
"nodes": ["E", "D"]
}
]
}
]
}
This example contains only the required attributes inside the nodes
, edges
, and loops
keyword to make it a valid JSON file in the CFGConf schema. Based on the default rendering options, the loops are drawn using the domain-specific algorithm from CFGExplorer. The loop background is drawn in orange color. The inner loop has a darker orange background. The back edges are larger in width and drawn in magenta color.
Here is the resulting visualization.
The nodes
key contains an array of nodes. Each node is described as an object and has the property id
. The graph in the above example has seven nodes "A", "B", "C", "D", "E", "F", "G".
The edges
key contains an array of edges. Each edge is described as an object which has two properties source
and target
. Together, these two properties describe a directed edge that originates from the source node and ends at the target node. The keys contain the id
s of the respective nodes.
The loops
key contains an array of loops. The inner loops are always fully contained by the outer loop. The loop is described by an object with the properties id
, nodes
, and backedges
respectively.
-
id
contains the identifier for the loop. The id should be unique across all nodes, loops, and functions. -
nodes
contains an array of node ids that belong to the loop. -
backedges
contains an array of back edges associated with the nodes inside the loop. Each back edge is specified as a two-element array[source node id, target node id].
Back edges are edges within the loop with a target of the top or head of the loop. SeeReference Guide: Back Edges
for more information.
Note on Nested Loops: A loop object can also contain the loops
property which contains all the loops nested inside the current loop. See the Reference Guide: Loops
for more information on specifying nested loops.
This example shows you how to add functions to the graph. The functions
key specifies array of functions in the graph. Each function is represented by an object with the properties:
-
id
-- an identifier unique among all nodes, functions, and loops. Often this is the function name -
nodes
-- an array of node ids belonging to the function
{
"nodes": [
{"id": "A"},
{"id": "B"},
{"id": "C"},
{"id": "D"},
{"id": "E"},
{"id": "F"},
{"id": "G"}
],
"edges": [
{
"source": "A",
"target": "B"
},
{
"source": "A",
"target": "C"
},
{
"source": "B",
"target": "G"
},
{
"source": "C",
"target": "D"
},
{
"source": "C",
"target": "E"
},
{
"source": "D",
"target": "F"
},
{
"source": "E",
"target": "F"
},
{
"source": "F",
"target": "G"
}
],
"functions": [
{
"id": "main",
"nodes": ["A", "B", "C", "D", "E", "F", "G"]
}
]
}
As with the previous example, we have not specified any rendering options so the default is used: nodes in a function are grouped together and enclosed in a blue rectangular boundary with the id
used as the group label.
Here is the resulting drawing.
This example shows how to load graph data from external files, such as a dot file, instead of entering the nodes and edges directly. This example has the same graph as the previous example, but using a dot format file and a separate file for loops and functions.
Three external files are supported in the data
object:
-
graphFile
- a dot containing nodes and edges. It may also contain rendering options for the graph -
structureFile
- a JSON file containing program structures (i.e.,functions
,loops
) in the same format as above -
analysisFile
- a JSON file containing program structures (i.e.,functions
,loops
) as generated by Dyninst
NOTE: Files listed inside the data
section must be in the same directory as the CFGConf JSON file.
{
"data": {
"graphFile":"nestedif.dot",
"structureFile":"nestedif_fns.json"
}
}
Below is the dot file used for the graph.
digraph {
A -> B;
A -> C;
B -> G;
C -> D;
C -> E;
D -> F;
E -> F;
F -> G;
}
Below is the structure file containing the functions in the program.
{
"functions": [
{
"id": "main",
"nodes": ["A", "B", "C", "D", "E", "F", "G"]
}
]
}
This specification and set of files produces the same drawing as the drawing in the previous example.
The example shows common rendering options. The rendering
keyword inside the CFGConf JSON file includes styling of the nodes and edges, domain-specific options for loops, and visual hints for portions of the graph not drawn. This example focuses on rendering style like colors and shapes.
The data files used in this example are available at singlefile.dot and singlefile.structures.json.
{
"data": {
"graphFile":"singlefile.dot",
"structureFile":"singlefile.structures.json"
},
"rendering": {
"node": {
"shape": "diamond",
"style": "filled",
"fillcolor": "#78d838",
"fontcolor": "white",
"label": "id"
},
"edge": {
"style": "solid",
"color": "black"
},
"loop": {
"background": true
},
"function": {
"boundary": true
}
}
}
The node
, edge
, loop
, and function
keys in the `rendering object set global style options for their respective structures.
Any style can be overridden for a specific node, edge, etc. in the individual specifications in the data file. After the CFGConf JSON file is processed, the program generates a final graph in dot format. Hence, CFGConf supports all the rendering options supported by dot graphviz. In addition, it supports additional options related to visualizing CFGs.
Common style properties used here include:
-
shape
, which sets the node shape -
style
, which includes the fill and line styles -
fillcolor
, which sets the fill color -
fontcolor
, which sets the text color -
label
, which sets whether to useid
orfull
labels from the node definitions
This example also:
- sets the
loop
'sbackground
property totrue
, the default, which colors the spanning area behind all node in a loop in orange. Nesting is shown by darker orange colors. - sets the
function
'sboundary
property totrue
, which draws an enclosing rectangle around the function and labels it. This will also affect the layout, forcing the function nodes closer together.
Here is the resulting visualization:
The example shows the use of filtering options. Filtering is beneficial for focusing on a subgraph of interest, such as a particular loop nest, function, or selected nodes and their neighborhood.
Filtering is also important for large graphs (i.e. graphs with more than a thousand nodes) as the graph layout can take a long time to solve or possibly even fail in these scenarios. Thus, we recommend filtering to a manageable size.
{
"data": {
"graphFile":"singlefile.dot",
"structureFile":"singlefile.structures.json"
},
"rendering": {
"node": {
"label": "id"
}
},
"filtering": {
"isHopFilterOn":true,
"selectedNodes": ["B44"],
"maxHops":4,
"minNodes":10,
"isLoopFilterOn": true
}
}
The filtering
keyword contains all filtering options. Filtering is mainly done by distance, or hops as described below. Thus, the isHopFilterOn
key determines whether or not filtering is on.
The main logic of filtering is:
- Select nodes of interest/focus with the
selectedNodes
array. These comprise the initial subgraph. - If
isLoopFilterOn
is true, the nodes of the loops to which the selected nodes belong are added. - The subgraph is expanded to include any nodes reachable within
maxHops
of the nodes from the first two steps.
The minNodes
and maxNodes
keys can further tweak the size of the filtered graph, expanding or limiting the selection respectively when maxHops
retrieves to few or too many nodes.s
The resulting visualization for the above is shown below. Note this is the same graph as in the previous rendering options example, but filtered around node "B44", which is thus highlighted with a teal border.
We use the term boundary nodes to refer to the nodes which are just outside the filtered graph, e.g., connected to the nodes in the filtered graph but not inside. These nodes are shown as small ellipses for context. Multiple boundary nodes attached to a filtered node are depicted as stacked ellipses.
This example demonstrates the function collapsing feature. This feature heuristically alters the graph appearance, removing clutter induced by frequently called functions (e.g., utility functions) that may not be of interest to the analysis. They drawing is altered as follows:
- Groups of nodes belonging to identified functions are collapsed into a single node.
- Collapsed nodes are duplicated to remove tangles of edges.
- Collapsed nodes are drawn in a de-emphasized style.
The collapsingRules
key turns this feature on and allows adjusting the heuristics for collapsing functions.
{
"data": {
"graphFile":"singlefile.dot",
"structureFile":"singlefile.structures.json"
},
"rendering": {
"node": {
"label": "id"
},
"function": {
"collapsingRules": {
"isCollapsingEnabled": true,
"minIncomingEdges": 4,
"minOutgoingEdges": 0,
"alwaysCollapseList": [],
"neverCollapseList": [],
"maxCollapseSize": "25p"
}
}
}
}
Collapsing rules apply to the portion of the graph to be drawn, i.e., the subgraph determined by the filtering rules. Functions in this subgraph containing loops will not be collapsed.
The following options determine which functions to collapse:
-
minIncomingEdges
- functions with at least this many incoming (call) edges will be collapsed. -
minOutgoingEdges
- functions with at least this many outgoing (return) edges will be collapsed. -
maxCollapseSize
- functions of this size will not be collapsed. This can be specified as a number of nodes or a percentage of the drawn subgraph by appending ap
, e.g. a value of "5p" equals 5 percent of nodes in the visible graph. -
alwaysCollapseList
- functions that should always be collapsed regardless of the other rules -
neverCollapseList
- functions that should never be collapsed regardless of the other rules
Here is the result of the above specification.
To inspect the figure, right click on the image and select Open Image in New Tab. Compared to the figure in the quick start example setting common rendering options, the functions puts
and printf
are collapsed.
The collapsed nodes are drawn with dotted boundaries. The label for the node is the id of the function that is collapsed. Hovering over the collapsed node in the web app will result in the number of nodes belonging to this function being displayed.