TFE-GNN: A Temporal Fusion Encoder Using Graph Neural Networks for Fine-grained Encrypted Traffic Classification
Official implementation of the WWW'23 research paper: TFE-GNN: A Temporal Fusion Encoder Using Graph Neural Networks for Fine-grained Encrypted Traffic Classification. [ACM]
🔥 [2024-12] Our latest work MH-Net was accepted by AAAI 2025, and the code will be released in the next few days. We hope our work can bring some novel insights to the community, empowering network traffic identification with graph representation learning.
🌟 [2024-01] CLE-TFE, an improved version of TFE-GNN, is now open source.
# python==3.8
pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu113/torch_stable.html
pip install dgl==1.0.0+cu113 -f https://data.dgl.ai/wheels/cu113/repo.html
pip install scikit-learn
pip install scapy
You can also prepare your own datasets.
We use SplitCap to split the ISCX-VPN, ISCX-NonVPN, ISCX-TOR, and ISCX-NonTOR datasets into bidirectional flows. Please refer to SplitCap for usage details.
You may run into pcap file format problems, so we provide a simple script, pcapng2pcap.py, to convert .pcapng files to .pcap files.
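Capture files are sometimes saved as pcapng under a .pcap extension, so it can help to check the file contents before converting. The following is a minimal sketch that inspects the first four bytes of each file. It is not part of this repository, and the function names `capture_format` and `files_needing_conversion` are hypothetical.

```python
from pathlib import Path

# Magic bytes exactly as they appear at the start of the file.
PCAPNG_MAGIC = b"\x0a\x0d\x0d\x0a"  # pcapng Section Header Block type
PCAP_MAGICS = {
    b"\xa1\xb2\xc3\xd4",  # classic pcap, big-endian, microsecond timestamps
    b"\xd4\xc3\xb2\xa1",  # classic pcap, little-endian, microsecond timestamps
    b"\xa1\xb2\x3c\x4d",  # classic pcap, big-endian, nanosecond timestamps
    b"\x4d\x3c\xb2\xa1",  # classic pcap, little-endian, nanosecond timestamps
}


def capture_format(path):
    """Return 'pcapng', 'pcap', or 'unknown' based on the first four bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == PCAPNG_MAGIC:
        return "pcapng"
    if magic in PCAP_MAGICS:
        return "pcap"
    return "unknown"


def files_needing_conversion(root):
    """List files under `root` that are pcapng, regardless of extension."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and capture_format(p) == "pcapng"
    )
```

Files reported by `files_needing_conversion` are the ones to pass through pcapng2pcap.py before running the later steps.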
Note: We only use TCP pcap files in our work.
For the specific categories of each dataset, please refer to the CATE folder.
To facilitate subsequent processing, we extract the information from each .pcap file into an .npz file.
You may refer to config.py and customize your own .pcap path in DIR_PATH_DICT. Then, run the following commands to start converting.
# ISCX-VPN
python pcap2npy.py --dataset iscx-vpn
# ISCX-NonVPN
python pcap2npy.py --dataset iscx-nonvpn
# ISCX-TOR
python pcap2npy.py --dataset iscx-tor
# ISCX-NonTOR
python pcap2npy.py --dataset iscx-nontor
Before constructing the graphs, refer to config.py and customize all of your file paths. Then run the following commands to start construction.
# ISCX-VPN
python preprocess.py --dataset iscx-vpn
# ISCX-NonVPN
python preprocess.py --dataset iscx-nonvpn
# ISCX-TOR
python preprocess.py --dataset iscx-tor
# ISCX-NonTOR
python preprocess.py --dataset iscx-nontor
This script saves the byte-level traffic graphs of the specified dataset to the path you specify.
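As a rough illustration of what a byte-level traffic graph can look like, the sketch below connects byte values whose point-wise mutual information (PMI) within a sliding window is positive. It is a simplified example and not the code in preprocess.py: the function name `byte_pmi_edges`, the window size, and the edge representation are all assumptions.

```python
import math
from collections import Counter


def byte_pmi_edges(payload, window=5):
    """Return {(a, b): pmi} for byte values a < b with positive PMI.

    Counts are taken over sliding windows of `window` consecutive bytes.
    """
    n_windows = max(len(payload) - window + 1, 1)
    single = Counter()  # number of windows containing byte value a
    pair = Counter()    # number of windows containing both a and b
    for start in range(n_windows):
        values = sorted(set(payload[start:start + window]))
        single.update(values)
        for i, a in enumerate(values):
            for b in values[i + 1:]:
                pair[(a, b)] += 1

    edges = {}
    for (a, b), n_ab in pair.items():
        # PMI = log( p(a, b) / (p(a) * p(b)) ), with probabilities over windows.
        pmi = math.log(n_ab * n_windows / (single[a] * single[b]))
        if pmi > 0:
            edges[(a, b)] = pmi
    return edges
```

Each distinct byte value becomes a node and each returned pair becomes an undirected edge, so the graph has at most 256 nodes regardless of packet length.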
After pre-processing, run the following commands to start training.
# ISCX-VPN
python train.py --dataset iscx-vpn --cuda 0
# ISCX-NonVPN
python train.py --dataset iscx-nonvpn --cuda 0
# ISCX-TOR
python train.py --dataset iscx-tor --cuda 0
# ISCX-NonTOR
python train.py --dataset iscx-nontor --cuda 0
After training, run the following commands to start evaluation.
# ISCX-VPN
python test.py --dataset iscx-vpn --cuda 0
# ISCX-NonVPN
python test.py --dataset iscx-nonvpn --cuda 0
# ISCX-TOR
python test.py --dataset iscx-tor --cuda 0
# ISCX-NonTOR
python test.py --dataset iscx-nontor --cuda 0
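The four stages above (conversion, graph construction, training, evaluation) can be chained for every dataset with a small driver. This is a convenience sketch rather than a script shipped with the repository. `build_commands` and `run_all` are hypothetical names, and it assumes the commands are run from the repository root with paths already set in config.py.

```python
import subprocess
import sys

DATASETS = ["iscx-vpn", "iscx-nonvpn", "iscx-tor", "iscx-nontor"]
# (script, takes --cuda) in the order the stages are described above.
STAGES = [
    ("pcap2npy.py", False),
    ("preprocess.py", False),
    ("train.py", True),
    ("test.py", True),
]


def build_commands(datasets=DATASETS, cuda=0):
    """Return the list of command lines, one per dataset and stage."""
    commands = []
    for dataset in datasets:
        for script, takes_cuda in STAGES:
            cmd = [sys.executable, script, "--dataset", dataset]
            if takes_cuda:
                cmd += ["--cuda", str(cuda)]
            commands.append(cmd)
    return commands


def run_all(datasets=DATASETS, cuda=0):
    """Run every stage in order and stop at the first failure."""
    for cmd in build_commands(datasets, cuda):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_all(["iscx-vpn"], cuda=0)` runs the full pipeline for a single dataset. `check=True` makes a failing stage abort the run instead of training on incomplete data.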
- remove() function in utils.py
  - The location of the header in the packet may change, so check this when using other datasets.
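One way to check header locations on a new dataset is to compute them from the packet itself instead of relying on fixed offsets. The sketch below does this for plain Ethernet II + IPv4 + TCP frames. It is not the implementation of remove() in utils.py, and `tcp_payload_offset` is a hypothetical helper.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_TCP = 6


def tcp_payload_offset(frame):
    """Return the byte offset of the TCP payload in an Ethernet/IPv4 frame.

    Returns None if the frame is not plain Ethernet II + IPv4 + TCP
    (for example VLAN-tagged, IPv6, or a different link-layer type).
    """
    if len(frame) < 14:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip_start = 14
    if len(frame) < ip_start + 20:
        return None
    ihl = (frame[ip_start] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or frame[ip_start + 9] != IPPROTO_TCP:
        return None
    tcp_start = ip_start + ihl
    if len(frame) < tcp_start + 20:
        return None
    data_offset = (frame[tcp_start + 12] >> 4) * 4  # TCP header length in bytes
    if data_offset < 20:
        return None
    return tcp_start + data_offset
```

If this returns None for most packets in a capture, the link-layer type probably differs from Ethernet (for example raw IP or Linux cooked capture), and any fixed header offsets need to be adjusted.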
Reproduction results may differ for the following reasons.
- System environment (including GPU driver version, etc.). (Verified)
- Data partition.
  - Most current network traffic datasets have no standard way to partition the training and test sets, so differences in results are normal.
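To keep your own runs comparable with each other, it helps to fix the partition with a seed and split per class. The following is a minimal sketch of a seeded stratified split. It is not the partition used in our experiments; `stratified_split` and the default `test_ratio` are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.1, seed=0):
    """Return (train_indices, test_indices), split per class with a fixed seed."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Keep at least one test sample per class when the class has more than one.
        n_test = max(1, round(len(indices) * test_ratio)) if len(indices) > 1 else 0
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)
```

Reusing the same seed and label order reproduces the same split, so changes in accuracy between runs can be attributed to the model or hyperparameters rather than to the partition.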
Given these potential differences, we recommend tuning the hyperparameters for your own data and environment to achieve the best results.
@Inproceedings{TFE-GNN,
author={Haozhen Zhang and Le Yu and Xi Xiao* and Qing Li* and Francesco Mercaldo and Xiapu Luo and Qixu Liu},
year="2023",
title="TFE-GNN: A Temporal Fusion Encoder Using Graph Neural Networks for Fine-grained Encrypted Traffic Classification",
booktitle="The Web Conference",
pages="2066--2075",
}