Automated Generation of Consistent, Diverse and Structurally Realistic Graph Models

This page contains supplementary material for the measurements taken in our SoSyM journal paper titled "Automated Generation of Consistent, Diverse and Structurally Realistic Graph Models".

Target domains

Stc: Yakindu Statecharts [88] is an industrial modeling environment. We used the metamodel extract of Figure 1 which captures the state hierarchy and transitions of statecharts (with 12 classes and 6 references). Moreover, we formalized 10 constraints as graph predicates that restrict the transitions based on the built-in validation of Yakindu. For real human models, we collected 304 statechart models with a size ranging from 90 to 110 objects. The models were created by undergraduate students as solutions for similar (but not identical) statechart modeling homework assignments.

Met: Eclipse Modeling Framework [79] is a widely used DSL environment. We used an effective metamodel of EMF (which consisted of 13 classes excluding s and 45 references), and formalized 3 additional constraints. For realistic models, we gathered recently updated EMF models from GitHub by querying code through the GitHub API (as of 31/07/2019). We selected models ranging from 30 to 500 objects in size, and filtered out models that were not manually created (e.g., derived from an XML schema), resulting in 198 human models.

Scm: Finally, we gathered and generated software configuration models of GitHub projects [24] representing the connection between the developers, commits and issues of a project within a time window. The metamodel is derived from the data model of [24] and consists of 6 relevant classes and 10 references. We also formalized 4 constraints regulating the commits. The human data was collected from GHTorrent [24], from repositories created between 1/1/2017 and 1/1/2018. We gathered the commits, issues and active users created within 8 months from the day of repository creation. Finally, we kept 70 models ranging from 30 to 200 objects.

Compared approaches and metrics

We compared different model generation approaches to evaluate how realistic the generated models are.

As a reference, a large set of real models was collected for each domain (denoted as Hum).
We generated consistent models with Alloy Analyzer [32, 80] (All), a well-known SAT solver-based model finder by using known mappings [13, 66]. We used the latest stable version of Alloy (v4.2) with the default background solver configuration [42]. For enabling statistical analysis, we added a random amount of extra true statements (as used in [70]) to prevent the solver from running deterministically.
We implemented a simple black-box search-based model generator, which (1) repeatedly calls a back-end model generator until a time limit (1h) is reached, and (2) continuously maintains a population of the best N=100 models with respect to a target metric of interest (Comb). We instantiated this framework with two existing model generators:
1. We generated random models for each domain using EMF random instantiator [9] (Rand), which does not support WF constraints; thus, the models are not guaranteed to be consistent. To provide a fair comparison with other approaches, we implemented a feature in Rand to specify the root element of the model.
2. We generated consistent models with the VIATRA Solver [65, 67] (GS) using the latest version of the graph solver algorithm. Model generation was restarted from scratch when deriving a new model (i.e., the search space was not preserved).
We evaluated our realistic graph generator (Real) guided with the graph metrics OD, NA, MPC, NTD and VIO (see Table 1) using the appropriate distances (KS, MD, AD). Moreover, we used a combined metric Comb that measures all these metrics simultaneously, with or without the number of violations (Comb+V vs. Comb−V).
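The black-box search loop described above (repeated calls to a back-end generator while keeping the best N models) can be sketched roughly as follows. This is only an illustrative sketch, not the implementation used in the paper: the names `search`, `generate`, `distance` and `max_calls` are hypothetical, and representing a model as a plain list of numbers (e.g. its out-degrees) is an assumption made to keep the example self-contained. The two-sample KS statistic stands in for the target metric.

```python
import heapq
import itertools
import time
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


def search(generate, distance, population_size=100, time_limit_s=3600.0, max_calls=None):
    """Call `generate` until the time limit and keep the models with the smallest distance.

    `generate` and `distance` are hypothetical callbacks standing in for the
    back-end model generator and the target metric; `max_calls` is an extra,
    assumed stop condition that makes short experiments deterministic.
    """
    best = []  # heap of (-distance, tie_breaker, model); the root is the worst kept model
    tie_breaker = itertools.count()
    deadline = time.monotonic() + time_limit_s
    calls = 0
    while time.monotonic() < deadline and (max_calls is None or calls < max_calls):
        model = generate()
        calls += 1
        entry = (-distance(model), next(tie_breaker), model)
        if len(best) < population_size:
            heapq.heappush(best, entry)
        elif entry[0] > best[0][0]:
            # The new model is closer to the reference than the worst kept one.
            heapq.heapreplace(best, entry)
    # Return the population ordered from smallest to largest distance.
    return [model for _, _, model in sorted(best, reverse=True)]
```

With a reference degree sample `human`, a call such as `search(backend, lambda m: ks_distance(m, human))` would use the defaults mentioned in the text (N=100, 1h), where `backend` is whichever generator is plugged in.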

Generated Models

The generated models are available for download from the repository.

Please contact the authors if you would like to try to rerun the generation (e.g. for another domain).
