- PLC register readings
- Modbus message captures
- Data processing
- Interactive graphs and statistical analysis
- Invariant inference
- Business process mining
- Operating system: Unix-like environments, including Linux, Mac OS X, and Windows Subsystem for Linux (WSL)
- Python 3.8 and PIP 3
sudo apt update
sudo apt upgrade
sudo apt install python3.8
sudo apt install python3-pip
- Python 3.8 libraries: pandas, matplotlib, numpy, ray, modbus_tk, scipy (json and glob are part of the standard library)
pip3 install -r requirements.txt
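For reference, a minimal requirements.txt matching the list above would contain the following (json and glob ship with Python and need no entry; version pins are omitted here):

```
pandas
matplotlib
numpy
ray
modbus_tk
scipy
```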
- Java JDK version 8 or higher.
sudo apt-get install openjdk-8-jdk
- Gradle Build Tool: installation
- Perl 5
sudo apt install perl
- TShark (Wireshark 3.4.8)
sudo apt install tshark
To install from source:
wget https://www.wireshark.org/download/src/wireshark-3.4.8.tar.xz -O /tmp/wireshark-3.4.8.tar.xz
tar -xvf /tmp/wireshark-3.4.8.tar.xz -C /tmp
cd /tmp/wireshark-3.4.8
sudo apt update && sudo apt dist-upgrade
sudo apt install cmake libglib2.0-dev libgcrypt20-dev flex bison byacc \
  libpcap-dev qtbase5-dev libssh-dev libsystemd-dev qtmultimedia5-dev \
  libqt5svg5-dev qttools5-dev
cmake .
make
sudo make install
- Daikon 5.8.10: installation
- Fluxicon Disco 3.2.4: installation
Disco is not available for Unix-like operating systems; users can rely on Wine or Darling to install and run this software.
Execute the script main.py to generate the data logs of the PLC registers:
python3 main.py simTime samplingTime
- simTime is the simulation time of the CPS model in seconds.
- samplingTime is the sampling interval in seconds.
The Ray framework is used to read data from the PLCs simultaneously and to scale seamlessly to a distributed attack architecture (e.g., a botnet) if needed. The outputs are JSON files with the following naming convention:
{name_of_the_PLC}-{ip_of_the_PLC}-{port_of_the_PLC}@{timestamp}.json
These files are saved in the historian/ folder inside the main directory.
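For illustration, a stripped-down sketch of this acquisition loop is shown below; the PLC list, the register layout, and the polling logic are simplified assumptions, not the exact contents of main.py:

```python
import json
import os
import time
import ray
import modbus_tk.defines as cst
import modbus_tk.modbus_tcp as modbus_tcp

ray.init()
os.makedirs("historian", exist_ok=True)

@ray.remote
def poll_plc(name, ip, port, sim_time, sampling_time):
    """Poll one PLC's holding registers and dump the log as JSON."""
    master = modbus_tcp.TcpMaster(host=ip, port=port)
    log = []
    start = time.time()
    while time.time() - start < sim_time:
        # Read the first 10 holding registers of slave 1 (assumed layout).
        values = master.execute(1, cst.READ_HOLDING_REGISTERS, 0, 10)
        log.append({"time": time.time(), "registers": list(values)})
        time.sleep(sampling_time)
    # Naming convention: {name_of_the_PLC}-{ip}-{port}@{timestamp}.json
    out = f"historian/{name}-{ip}-{port}@{int(start)}.json"
    with open(out, "w") as f:
        json.dump(log, f)
    return out

# One Ray task per PLC reads all devices concurrently.
plcs = [("PLC1", "192.168.0.11", 502), ("PLC2", "192.168.0.12", 502)]
files = ray.get([poll_plc.remote(n, ip, p, 60, 0.5) for n, ip, p in plcs])
print(files)
```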
In parallel with main.py, TShark has to be started.
To start capturing packets, a capture interface has to be specified. TShark treats the first interface as the default and captures from it when none is given. In other words, tshark is equivalent to
tshark -i 1
To list all the interfaces available to TShark and select another one:
tshark -D
Run the capture:
tshark -i 1 -w modbusPackets.pcap-ng
While running, the total number of captured packets is displayed on the console. TShark generates a pcap-ng file that contains all the information about the captured packets. Once the pcap-ng file is created, it can be translated into a CSV file by running:
tshark -r modbusPackets.pcap-ng -T fields -E occurrence=f \
  -e frame.number -e frame.time -e ip.src -e ip.dst -e _ws.col.Protocol -e frame.len \
  -e modbus.func_code -e modbus.bitval -e text -e modbus.regval_uint16 -e mbtcp.trans_id \
  -e _ws.col.Info > modbusPackets.csv
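To sanity-check the export before the data processing step, the CSV can be loaded with pandas; the column names below are illustrative and simply mirror the field order of the command above:

```python
import pandas as pd

cols = ["no", "time", "src", "dst", "protocol", "length",
        "func_code", "bitval", "text", "regval_uint16", "trans_id", "info"]
# tshark's -T fields output is tab-separated by default.
df = pd.read_csv("modbusPackets.csv", sep="\t", names=cols, header=None)
print(df.head())
```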
The goal of the data processing is to convert the files resulting from the information gathering into datasets accepted by the invariant detection and business process mining tools.
Execute the script convertoCSV.py, specifying an integer value for the variable numberofPLCs, which indicates the number of PLCs controlling the CPS model.
Execute mergeDatasets.py to convert the JSON files into CSV datasets.
The columns hold the values of the registers for each PLC, with the following naming convention: {name_of_the_PLC}_{name_of_the_Register}.
The outputs are two CSV files saved in the directories PLC_CSV and process-mining/data.
python3 convertoCSV.py numberofPLCs
python3 mergeDatasets.py
The file saved in process-mining/data is a timestamped dataset; it will be used for the business process mining.
The file saved in PLC_CSV is an enriched dataset with a partial bounded history of registers and additional information such as stable states, slopes of measurements, and relative setpoints. This dataset will be used for the invariant detection.
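Conceptually, the conversion performed by these scripts boils down to the following sketch (the JSON layout, column names, and output path are assumptions based on the conventions above):

```python
import glob
import json
import pandas as pd

frames = []
for path in glob.glob("historian/*.json"):
    # File name convention: {name_of_the_PLC}-{ip}-{port}@{timestamp}.json
    plc_name = path.split("/")[-1].split("-")[0]
    with open(path) as f:
        log = json.load(f)
    df = pd.DataFrame(log)
    # Column convention: {name_of_the_PLC}_{name_of_the_Register}
    regs = pd.DataFrame(df["registers"].tolist()).add_prefix(f"{plc_name}_reg")
    frames.append(pd.concat([df[["time"]], regs], axis=1))

# Stack the per-PLC logs and order them by timestamp.
merged = pd.concat(frames).sort_values("time")
merged.to_csv("process-mining/data/timestamped_dataset.csv", index=False)
```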
Execute the script runChartPlots.py:
python3 runChartPlots.py var1 var2 .... varn
The outputs of this execution are run-sequence plots of the specified variables as a function of the simulation time.
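In essence, the script does something like the following (the dataset path and the time column name are assumptions):

```python
import sys
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("PLC_CSV/dataset.csv")  # assumed location of the enriched dataset
for var in sys.argv[1:]:
    # Run-sequence plot: variable value against simulation time.
    plt.plot(df["time"], df[var], label=var)
plt.xlabel("simulation time (s)")
plt.ylabel("register value")
plt.legend()
plt.show()
```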
Execute the script histPlots_Stats.py:
python3 histPlots_Stats.py var
The outputs of this execution are a histogram and statistical information about the variable var.
This information includes:
- The mean, median, standard deviation, and the maximum and minimum values.
- Two tests of the statistical distribution: a Chi-squared test for uniformity and a Shapiro-Wilk test for normality.
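The same statistics can be reproduced with pandas and scipy; the sketch below assumes the dataset location and uses 10 histogram bins:

```python
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

df = pd.read_csv("PLC_CSV/dataset.csv")  # assumed dataset location
x = df[sys.argv[1]].dropna()

# Basic descriptive statistics.
print("mean:", x.mean(), "median:", x.median(), "std:", x.std())
print("min:", x.min(), "max:", x.max())

# Chi-squared test for uniformity: compare observed bin counts against the
# equal counts expected under a uniform distribution (scipy's default f_exp).
observed, _ = np.histogram(x, bins=10)
chi2, p_uniform = stats.chisquare(observed)
print("chi-squared (uniformity) p-value:", p_uniform)

# Shapiro-Wilk test for normality.
w, p_normal = stats.shapiro(x)
print("Shapiro-Wilk (normality) p-value:", p_normal)

x.plot.hist(bins=10)
plt.show()
```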
The invariant generation is done using Daikon's front-end tool for CSV datasets. To install Daikon, follow the guide.
Execute the bash script runDaikon.sh to generate the invariants.
./runDaikon.sh
This script offers a query system to target specific invariants and to specify conditional invariants.
Users can enter a variable name to display the invariants associated with it.
Users can customize the splitter info file Daikon_Invariants/Inv_conditions.spinfo by specifying the conditions that Daikon should use to create conditional invariants.
Spinfo file example:
PPT_NAME aprogram.point:::POINT
VAR1 > VAR2
VAR1 == VAR3 && VAR1 != VAR4
The results of the invariant analysis will be saved in Daikon_Invariants/daikon_results.txt.
The conditional invariants will be saved in Daikon_Invariants/daikon_results_cond.txt.
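For reference, a simplified Python equivalent of what runDaikon.sh drives is sketched below; the file names are assumptions, and it presumes that daikon.jar is on the CLASSPATH and that Daikon's CSV front end convertcsv.pl (a Perl script shipped with Daikon, hence the Perl 5 requirement) is available:

```python
import subprocess

# Convert the CSV dataset into Daikon's .decls/.dtrace input formats
# using Daikon's CSV front end.
subprocess.run(["perl", "convertcsv.pl", "Daikon_Invariants/dataset.csv"],
               check=True)

# Run Daikon on the converted files; passing the .spinfo file enables
# the conditional invariants.
with open("Daikon_Invariants/daikon_results.txt", "w") as out:
    subprocess.run(
        ["java", "daikon.Daikon",
         "Daikon_Invariants/Inv_conditions.spinfo",
         "Daikon_Invariants/dataset.decls",
         "Daikon_Invariants/dataset.dtrace"],
        stdout=out, check=True,
    )
```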
This step relies on Disco to generate graphs representing the business process.
Disco takes as input a CSV file containing the exchanged messages between the PLCs of the CPS model and the values of the PLCs registers.
To create this CSV file, we use a Java program to convert the pcap files and the CSV dataset generated in the previous steps.
The first step is to compile our Java program. Within the directory process-mining, run the command:
./gradlew build
The second step is to convert the pcap file and the CSV dataset into a format admissible by Disco:
./gradlew runMessages
./gradlew runReadings
The final step is to combine the resulting files into a single file from which the business process graphs are generated:
./gradlew Merge
The output files are saved in the directory process-mining/data.
To generate the business process graphs:
Launch Disco > Open File > Select the file MergeEvents.csv > Define each column role > Click Start Import
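Conceptually, the merge performed by the Merge task amounts to stacking the two event logs and ordering them by timestamp, as in this pandas sketch (the input file names and the timestamp column are assumptions; MergeEvents.csv is the file imported into Disco):

```python
import pandas as pd

# Event logs produced by runMessages (network messages) and runReadings
# (register readings); file names are assumed for illustration.
messages = pd.read_csv("process-mining/data/messages.csv")
readings = pd.read_csv("process-mining/data/readings.csv")

# Stack both logs and order them chronologically so Disco imports
# a single event stream.
merged = pd.concat([messages, readings]).sort_values("timestamp")
merged.to_csv("process-mining/data/MergeEvents.csv", index=False)
```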
- PLC register captures (JSON): extract the JSON files to the directory historian/.
- Timestamped dataset of register values (CSV): place the CSV file in the directory process-mining/data.
- Dataset of register values (CSV): place the CSV file in the directory daikon/Daikon_Invariants.
- Network capture (CSV): place the CSV file in the directory process-mining/data.
- Network capture (PCAPNG): convert the pcap file to CSV using the tshark commands above.