generated from usnistgov/opensource-repo
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 7 changed files with 1,328 additions and 2 deletions.
@@ -0,0 +1,8 @@
/Datasets/CIFAR/
/Datasets/FashionMNIST/
/Datasets/MNIST/
/Datasets/Noise/
/Datasets/OFDM/
/experiment_results/
.idea
/IQGAN_project_archived/
@@ -1,2 +1,76 @@
# opensource-repo
This repository is the recommended template repository for NIST opensource contributions.
# <u> **Software for Modeling OFDM Communication Signals with Generative Adversarial Networks** </u>

This repository contains Python code to generate results for experiments on generative modeling of radio frequency (RF) communication signals, specifically synthetic orthogonal frequency-division multiplexing (OFDM) signals. This code implements two novel generative adversarial network (GAN)
models, a 1D and a 2D convolutional model, named **PSK-GAN** and **STFT-GAN**, respectively, as well as the **WaveGAN** model architecture as
a baseline for comparison. All three GAN models are trained on synthetic datasets over a range of OFDM parameters and conditions, and
their performance is evaluated with simulated datasets.

## Software Implementation
The software enables automated testing of many model configurations across different datasets. Model creation and training are implemented
using the PyTorch library. This repository contains files for initializing the experiment test runs (`main.py`), training GAN models (`gan_train.py`), loading target distributions (`data_loading.py`), and evaluating generated distributions (`gan_evaluation.py`). The `/utils` directory contains
supporting modules for target dataset creation and model evaluation. The `models/` directory contains modules that create the **PSK-GAN**, **STFT-GAN**,
and **WaveGAN** architectures.

Running `main.py` runs the default GAN configuration specified by the configuration dictionary in `./experiment_resources/training_specs_dict.py`.
Descriptions of the fields in `./experiment_resources/training_specs_dict.py` are located in
`./experiment_resources/configuration_dictionary_description.csv`. Additionally, a set of model configurations can be run in an automated fashion
by passing a configuration table (CSV file) as an argument to the main Python module (e.g., `main.py --configs path_to_config_table.csv`). Column labels
of a configuration table should correspond to the keys in the GAN configuration dictionary that are to be changed across runs.
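
As an illustration of how a configuration table could override a default dictionary, here is a minimal sketch. The keys, default values, and helper names (`DEFAULT_SPECS`, `expand_configs`) are hypothetical and are not taken from `training_specs_dict.py`; the sketch also assumes a flat dictionary, which may not match the real structure.

```python
import copy
import csv
import io

# Hypothetical defaults; the real keys live in
# ./experiment_resources/training_specs_dict.py and are not reproduced here.
DEFAULT_SPECS = {"model_type": "stftgan", "batch_size": 128, "learning_rate": 1e-4}


def _coerce(text, default):
    """Cast a CSV cell to the type of the default value it overrides."""
    if isinstance(default, bool):  # check bool first: bool is a subclass of int
        return text.strip().lower() in ("1", "true", "yes")
    if isinstance(default, (int, float)):
        return type(default)(text)
    return text


def expand_configs(table_text, defaults):
    """Return one configuration dictionary per row of the CSV table."""
    configs = []
    for row in csv.DictReader(io.StringIO(table_text)):
        specs = copy.deepcopy(defaults)  # never mutate the shared defaults
        for key, cell in row.items():
            if key not in specs:
                raise KeyError(f"unknown configuration key: {key}")
            specs[key] = _coerce(cell, specs[key])
        configs.append(specs)
    return configs


# Two runs that change only the columns present in the table.
table = "model_type,batch_size\nwavegan,64\npskgan,256\n"
runs = expand_configs(table, DEFAULT_SPECS)
```

Columns missing from the table keep their default values, so each row only needs to list the settings that vary between runs.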

The training and test target datasets used in this study were synthesized using the script
`scripts/target_data_synth.py` and are provided in a separate gzip-compressed archive, `target_distributions.tar.gz`. To execute experiments, first extract this archive and place its contents in a subdirectory named `Data/`. When running the models, experimental results are saved in `experiment_results/`, which is divided into sub-folders corresponding to each experiment. Each experiment folder contains sub-folders with results from three trials of each test configuration. Each test-trial folder contains saved GAN models, training metadata, and evaluations of the generated distributions. Previously computed experimental results are saved in `experiment_results.tar.gz`.

## <u>Requirements</u>
We use a `conda` virtual environment to manage the project library dependencies.
Run the following commands to install the requirements into a new conda environment:
```setup
conda create --name <env> --file ./experiment_resources/requirements.txt
conda activate <env>
pip install -r ./experiment_resources/pip_requirements.txt
```

## <u>Running Experiments</u>
This code executes three experiments: (1) a data complexity experiment, (2) a modulation order experiment, and (3) a fading channel experiment. To reproduce the results of the three experiments, run
```angular2html
main.py --configs ./experiment_resources/test_configs_complexity_PSKGAN.csv
main.py --configs ./experiment_resources/test_configs_complexity_WaveGAN.csv
main.py --configs ./experiment_resources/test_configs_complexity_STFTGAN.csv
main.py --configs ./experiment_resources/test_configs_modulation_STFTGAN.csv
main.py --configs ./experiment_resources/test_configs_channel_STFTGAN.csv
```
Aggregated plots across model runs are created using the script `./scripts/plotting_script.py`.

## <u>Implementation Notes</u>
Single-process multi-GPU training is done using PyTorch's `DataParallel` wrapper to increase training speed.
Multi-process multi-GPU training (`DistributedDataParallel`) is not compatible with the gradient penalty operation (`autograd.grad`)
and is not recommended when using the Wasserstein-GP loss.
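
For reference, the gradient penalty mentioned above enters the critic loss as a term that depends on the gradient of the critic output with respect to its input, which is the quantity `autograd.grad` evaluates. The following is a sketch of the standard Wasserstein-GP critic loss. The penalty weight $\lambda$ and the interpolation scheme are the usual textbook choices and are assumptions here, not values taken from this repository.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\left[D(\tilde{x})\right]
    - \mathbb{E}_{x \sim P_r}\left[D(x)\right]
    + \lambda \, \mathbb{E}_{\hat{x}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right],
\qquad \hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

Here $P_r$ is the target distribution, $P_g$ is the generator distribution, and $\hat{x}$ is sampled on straight lines between real and generated examples.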

## <u>Authors</u>
Jack Sklar (jack.sklar@nist.gov) and Adam Wunderlich (adam.wunderlich@nist.gov) \
Communications Technology Laboratory \
National Institute of Standards and Technology \
Boulder, Colorado

## <u>Acknowledgements</u>
The authors thank Ian Wilkins and Sumeet Batra for their contributions to an early version of this software.

## <u>Licensing Statement</u>
This software was developed by employees of the National Institute of Standards and Technology (NIST), an
agency of the Federal Government and is being made available as a public service. Pursuant to title 17 United
States Code Section 105, works of NIST employees are not subject to copyright protection in the United States.
This software may be subject to foreign copyright. Permission in the United States and in foreign countries,
to the extent that NIST may hold copyright, to use, copy, modify, create derivative works, and distribute this
software and its documentation without fee is hereby granted on a non-exclusive basis, provided that this
notice and disclaimer of warranty appears in all copies.

THE SOFTWARE IS PROVIDED 'AS IS' WITHOUT ANY WARRANTY OF ANY KIND, EITHER EXPRESSED, IMPLIED, OR STATUTORY,
INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY THAT THE SOFTWARE WILL CONFORM TO SPECIFICATIONS, ANY IMPLIED
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND FREEDOM FROM INFRINGEMENT, AND ANY WARRANTY
THAT THE DOCUMENTATION WILL CONFORM TO THE SOFTWARE, OR ANY WARRANTY THAT THE SOFTWARE WILL BE ERROR FREE. IN
NO EVENT SHALL NIST BE LIABLE FOR ANY DAMAGES, INCLUDING, BUT NOT LIMITED TO, DIRECT, INDIRECT, SPECIAL OR
CONSEQUENTIAL DAMAGES, ARISING OUT OF, RESULTING FROM, OR IN ANY WAY CONNECTED WITH THIS SOFTWARE, WHETHER OR NOT
BASED UPON WARRANTY, CONTRACT, TORT, OR OTHERWISE, WHETHER OR NOT INJURY WAS SUSTAINED BY PERSONS OR PROPERTY OR
OTHERWISE, AND WHETHER OR NOT LOSS WAS SUSTAINED FROM, OR AROSE OUT OF THE RESULTS OF, OR USE OF, THE SOFTWARE
OR SERVICES PROVIDED HEREUNDER.

@@ -0,0 +1,274 @@
"""
This module holds pytorch related data loading methods and data
preprocessing methods related to target data representation.
This file can also be imported as a module and contains the following functions:
    * unpack_complex - unpack complex waveform to two-channel real-valued waveform
    * pack_to_complex - pack two-channel real-valued waveform to complex waveform
    * scale_dataset - scale target distribution to range [-1, 1]
    * load_target_distribution - load target training distribution from files
    * TargetDataset - wrapper for dataset for ease of loading into pytorch framework
    * build_DataLoader - create PyTorch DataLoader object
    * get_latent_vectors - load batch of latent vectors for input to generator
    * stft_to_waveform - convert complex STFT representation to complex waveform
    * waveform_to_stft - convert complex waveform to complex STFT
    * pad_signal_to_power_of_2 - zero-pad waveform to next power of 2
    * unpad_signal - remove zero-padding from waveform
"""

import h5py
import json
import torch
import numpy as np
from scipy import signal
from numpy.fft import fftshift
from sklearn import preprocessing
from scipy.stats import truncnorm
from torch.utils.data import Dataset


def unpack_complex(iq_data):
    """
    Convert complex 2D matrix to 3D matrix with 2 channels for real and imaginary dimensions
    :param iq_data: numpy complex matrix (2D)
    :return: numpy floating point matrix (3D)
    """
    iq_real = iq_data.real
    iq_imaginary = iq_data.imag
    iq_real = np.expand_dims(iq_real, axis=1)  # Make dataset 3-dimensional to work with framework
    iq_imaginary = np.expand_dims(iq_imaginary, axis=1)  # Make dataset 3-dimensional to work with framework
    unpacked_data = np.concatenate((iq_real, iq_imaginary), 1)
    return unpacked_data


def pack_to_complex(iq_data):
    """
    Convert 3D matrix with 2 channels for real and imaginary dimensions to 2D complex representation
    :param iq_data: numpy floating point matrix (3D)
    :return: numpy complex matrix (2D)
    """
    num_dims = len(iq_data.shape)
    if num_dims == 2:
        complex_data = 1j * iq_data[:, 1] + iq_data[:, 0]
    elif num_dims == 3:
        complex_data = 1j * iq_data[:, 1, :] + iq_data[:, 0, :]
    else:
        complex_data = 1j * iq_data[:, 1, :, :] + iq_data[:, 0, :, :]
    return complex_data


def scale_dataset(data, data_set, data_scaler):
    """
    Scale target distribution's range to [-1, 1] with multiple scaling options
    :param data: Target distribution
    :param data_set: dataset name
    :param data_scaler: data-scaler setting
    :return: tuple of (scaled target distribution, fitted scikit-learn scaler or None)
    """
    if data_scaler == "activation_scaler":
        return data, None

    # Feature-based data scaling: fit one scale factor per flattened feature
    if "feature" in data_scaler:
        print(f"feature Based Scaling: {data_scaler}")
        data_shape = data.shape
        data = data.reshape(data_shape[0], -1)
        transformer = preprocessing.MaxAbsScaler() if data_scaler == "feature_max_abs" \
            else preprocessing.MinMaxScaler(feature_range=(-1, 1))
        transformer = transformer.fit(data)
        data = transformer.transform(data)
        data = data.reshape(data_shape)
        return data, transformer

    # Global dataset scaling: use precomputed extrema stored alongside the dataset
    elif "global" in data_scaler:
        transformer = None
        with open(rf'./Datasets/{data_set}/scale_factors.json', 'r') as F:
            channel_scale_factors = json.loads(F.read())
        channel_max = channel_scale_factors["max"]
        channel_min = channel_scale_factors["min"]
        if data_scaler == "global_min_max":
            feature_max, feature_min = 1, -1
            data = (data - channel_min) / (channel_max - channel_min)
            data = data * (feature_max - feature_min) + feature_min
        else:
            data = data / np.max(np.abs([channel_max, channel_min]))
        return data, transformer

    # Fail loudly instead of implicitly returning None for an unrecognized option
    raise ValueError(f"Unknown data_scaler option: {data_scaler!r}")


def load_target_distribution(data_set, data_scaler, pad_signal, num_samples, stft, nperseg, fft_shift):
    """
    Load in target distribution, scale data to [-1, 1], and unpack any labels from the data
    :param data_set: Name of dataset
    :param data_scaler: Name of scaling function option
    :param pad_signal: Whether to zero-pad waveforms to a power-of-2 length
    :param num_samples: Number of samples to load from the target distribution
    :param stft: Convert complex waveform to STFT
    :param nperseg: STFT FFT window length
    :param fft_shift: Shift STFT to be zero-frequency centered
    :return: data tensor, class-label tensor, input length, pad length, fitted scaler (or None)
    """
    d_type = complex
    # The context manager closes the HDF5 file even if reading fails
    with h5py.File(rf"./Datasets/{data_set}/train.h5", 'r') as h5f:
        real_dataset = h5f['train'][:]
    print("Dataset_length: ", len(real_dataset))
    data = np.array(real_dataset[:, 1:]).astype(d_type)
    # np.int was removed in NumPy 1.24; the builtin int is the drop-in replacement
    class_labels = np.real(real_dataset[:, 0]).astype(int)

    num_samples = int(num_samples)  # slicing requires an integer
    if num_samples > 64:  # smaller values leave the dataset untruncated
        data = data[:num_samples]
        class_labels = class_labels[:num_samples]

    input_length = len(data[0, :])
    pad_length = None
    if pad_signal and not stft:
        # WaveGAN uses strides of 4 so waveforms are padded to be powers of 2
        data, pad_length = pad_signal_to_power_of_2(data)
        input_length = pad_length + input_length
    if stft:
        data, pad_length = pad_signal_to_power_of_2(data)
        data, f, t = waveform_to_stft(data, 2, nperseg)
        if fft_shift:
            data = np.fft.fftshift(data, axes=(1,))

        input_length = (nperseg, data.shape[-1])
        data = data.reshape(data.shape[0], nperseg, -1)
    data = data.view(complex)
    data = unpack_complex(data).view(float)  # Unpacking complex-representation to 2-channel representation

    data = np.expand_dims(data, axis=1) if len(data.shape) < 3 else data
    data, transformer = scale_dataset(data, data_set, data_scaler)
    data = torch.from_numpy(data).float()
    class_labels = torch.from_numpy(class_labels).float()
    return data, class_labels, input_length, pad_length, transformer


class TargetDataset(Dataset):
    """
    Wrapper for dataset that can be easily loaded and used for training through PyTorch's framework.
    Pairs a training example with its label in the format (training example, label)
    """
    def __init__(self, data_set, data_scaler, pad_signal, num_samples, stft=False, nperseg=0, fft_shift=False, **kwargs):
        self.dataset, self.labels, self.input_length, self.pad_length, self.transformer = \
            load_target_distribution(data_set, data_scaler, pad_signal, num_samples, stft, nperseg, fft_shift)

    def __len__(self):
        return self.dataset.shape[0]

    def __getitem__(self, idx):
        return self.dataset[idx], self.labels[idx]


def build_DataLoader(dataset_specs):
    """
    Creates the TargetDataset and sampler from which a DataLoader can be built.
    Data factors are specified by the dataset_specs dictionary.
    :param dataset_specs: dictionary defining data-specific settings
    :return: tuple of (TargetDataset, sampler); the sampler is currently always None
    """
    dataset = TargetDataset(**dataset_specs)
    sampler = None
    return dataset, sampler


def get_latent_vectors(batch_size, latent_size, latent_type="gaussian", device="cuda:0"):
    """
    Sample a batch of latent vectors used as input to the Generator
    :param batch_size: number of latent vectors in the batch
    :param latent_size: dimension of each latent vector
    :param latent_type: "gaussian", "uniform", or any other value for a truncated Gaussian
    :param device: device on which to place the tensor
    :return: latent tensor of shape (batch_size, latent_size, 1)
    """
    if latent_type == "gaussian":
        z = torch.randn(batch_size, latent_size, 1, device=device)
    elif latent_type == "uniform":
        z = torch.from_numpy(np.random.uniform(low=-1.0, high=1.0, size=(batch_size, latent_size, 1))).float().to(device)
    else:
        # Standard normal truncated to [-1, 1]
        truncate = 1.0
        lower_trunc_val = -1 * truncate
        z = []  # assume no correlation between multivariate dimensions
        for _ in range(latent_size):
            z.append(truncnorm.rvs(lower_trunc_val, truncate, size=batch_size))
        z = np.transpose(z)  # (latent_size, batch_size) -> (batch_size, latent_size)
        z = torch.from_numpy(z).unsqueeze(2).float().to(device)
    return z


def stft_to_waveform(dataset, fs=2, nperseg=64):
    """
    Transform Short-Time-Fourier-Transform (STFT) representation to complex waveform
    :param dataset: STFT Dataset
    :param fs: Sampling frequency (Hz)
    :param nperseg: N-Per-Segment Window length
    :return: Complex waveform dataset
    """
    waveform_dataset = []
    print("Mapping STFT dataset to timeseries:", end=" ")
    for i, spectrogram in enumerate(dataset):
        if i % 10000 == 0:
            print(i)
        t, x = signal.istft(spectrogram, fs, nperseg=nperseg, noverlap=int(nperseg * 0.75), input_onesided=False)
        waveform_dataset.append(x)
    waveform_dataset = np.array(waveform_dataset, dtype=complex)
    return waveform_dataset


def waveform_to_stft(dataset, fs=2, nperseg=64):
    """
    Convert complex waveform representation to Short-Time-Fourier-Transform (STFT) representation
    :param dataset: Complex waveform dataset
    :param fs: sampling frequency (Hz)
    :param nperseg: N-per-segment window length
    :return: STFT dataset, frequency bins, and segment times of the last waveform processed
    """
    stft_dataset = []
    print("Mapping timeseries dataset to stft")
    for i, x in enumerate(dataset):
        if i % 10000 == 0:
            print(i)
        f, t, spectrogram = signal.stft(x, fs=fs, nperseg=nperseg, noverlap=int(nperseg * 0.75),
                                        return_onesided=False, boundary="even")
        stft_dataset.append(spectrogram)
    stft_dataset = np.array(stft_dataset, dtype=complex)
    return stft_dataset, f, t


def pad_signal_to_power_of_2(waveform_dataset):
    """
    Pad signal symmetrically up to the next power of 2 in length
    :param waveform_dataset: Target Distribution
    :return: padded target distribution, total padding length
    """
    waveform_length = waveform_dataset.shape[-1]
    d_type = complex
    # Step upward until only one bit is set, i.e. the length is a power of 2
    next_power_of_2 = waveform_length
    while next_power_of_2 & (next_power_of_2 - 1) != 0:
        next_power_of_2 += 1
    pad_length = next_power_of_2 - waveform_length
    # Split an odd pad length as floor/ceil so the padded length is exactly a power
    # of 2; this matches the slice bounds used by unpad_signal
    left_pad = pad_length // 2
    right_pad = pad_length - left_pad
    # Padding values are 1e-8 rather than exactly zero
    padding_array_1 = np.full((len(waveform_dataset), left_pad), 1e-8, dtype=d_type)
    padding_array_2 = np.full((len(waveform_dataset), right_pad), 1e-8, dtype=d_type)
    waveform_dataset = np.hstack((padding_array_1, waveform_dataset, padding_array_2))
    return waveform_dataset, pad_length


def unpad_signal(waveform_dataset, pad_length):
    """
    Remove zero-padding of signal
    :param waveform_dataset: zero-padded dataset
    :param pad_length: length of zero-padding
    :return: waveform dataset
    """
    if pad_length > 0:
        waveform_dataset = waveform_dataset[:, :, pad_length // 2: - pad_length // 2]
        return waveform_dataset
    else:
        return waveform_dataset
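
The padding and unpadding helpers above are meant to be inverses of each other. The sketch below illustrates that round trip on plain Python lists instead of NumPy arrays. The helper names (`next_power_of_2`, `pad`, `unpad`) are hypothetical, and the floor/ceil split of an odd pad length is an assumption chosen so that the round trip is exact.

```python
def next_power_of_2(n):
    """Smallest power of 2 that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def pad(samples, fill=0.0):
    """Center the samples in a power-of-2-length list; return it with the total pad length."""
    pad_length = next_power_of_2(len(samples)) - len(samples)
    left = pad_length // 2        # floor half on the left
    right = pad_length - left     # remaining (ceil) half on the right
    return [fill] * left + list(samples) + [fill] * right, pad_length


def unpad(samples, pad_length):
    """Strip the padding added by pad()."""
    if pad_length <= 0:
        return list(samples)
    left = pad_length // 2
    right = pad_length - left
    return list(samples[left:len(samples) - right])


waveform = [1.0, 2.0, 3.0, 4.0, 5.0]      # length 5, so 3 padding samples are needed
padded, pad_length = pad(waveform)
restored = unpad(padded, pad_length)
```

Keeping the total pad length as the only bookkeeping value works because both helpers derive the left and right split from it in the same way.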