Neural network for classifying rock clasts in computed tomography (CT) scans of sediment cores.
Launch in Binder (an interactive JupyterLab environment in the cloud).
To help out with development, start by cloning this repository:
git clone <repo-url>
Then I recommend using conda to install the dependencies. This also creates a conda virtual environment with Python and JupyterLab installed.
cd ctcorenet
conda env create --file=environment.yml --solver=libmamba
Next, activate the virtual environment:
conda activate ctcorenet
Finally, double-check that the libraries have been installed:
conda list
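As an extra check, you can confirm from Python that the deep learning stack imports cleanly. A minimal sketch, assuming the environment provides PyTorch and PyTorch Lightning (as the trainer flags further below suggest):

```python
# check_env.py - hypothetical sanity check, not part of the repo
import torch
import pytorch_lightning as pl

print("torch:", torch.__version__)
print("pytorch_lightning:", pl.__version__)
print("CUDA available:", torch.cuda.is_available())
```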
Images are manually labeled by drawing polygons around objects of interest, e.g. ice rafted debris (IRD) rock clasts. The polygons are drawn using the labelme GUI program, which is launched from the command line using:
labelme
Follow the single image annotation tutorial at https://github.com/wkentaro/labelme/tree/master/examples/tutorial to draw the polygons. The polygons are stored in a JSON file and can be converted to image masks using:
labelme_json_to_dataset data/CT_data/CORE_42.json -o data/train/CORE_42
This will produce a folder named CORE_42 containing 4 files:
- img.png
- label.png
- label_names.txt
- label_viz.png
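When there are many annotated cores, converting each JSON file by hand gets tedious, so a small helper loop can batch the conversion. A minimal sketch, assuming the labelme tools are on your PATH and the annotations live in data/CT_data (the file layout mirrors the example above):

```python
# batch_convert.py - hypothetical helper, not part of the repo
import subprocess
from pathlib import Path

for json_file in sorted(Path("data/CT_data").glob("*.json")):
    # e.g. data/CT_data/CORE_42.json -> data/train/CORE_42
    out_dir = Path("data/train") / json_file.stem
    out_dir.mkdir(parents=True, exist_ok=True)
    # Delegate to labelme's own converter, one file at a time
    subprocess.run(
        ["labelme_json_to_dataset", str(json_file), "-o", str(out_dir)],
        check=True,
    )
```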
The model can be trained using the following command:
python ctcorenet/ctcoreunet.py
This will load the image data stored in data/train, perform the training (minimizing the loss between img.png and label.png), and produce some outputs.
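For orientation, the training follows the usual semantic segmentation pattern: the network predicts a mask from img.png, and a pixel-wise loss against label.png is minimized. A minimal sketch of such a training step as a PyTorch Lightning module (the single-layer network here is an illustrative stand-in, not the repo's actual U-Net):

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl

class ToySegmenter(pl.LightningModule):
    """Illustrative stand-in for the real U-Net in ctcoreunet.py."""

    def __init__(self):
        super().__init__()
        # One conv layer in place of a full encoder/decoder stack
        self.net = torch.nn.Conv2d(in_channels=1, out_channels=2,
                                   kernel_size=3, padding=1)

    def training_step(self, batch, batch_idx):
        img, mask = batch  # tensors built from img.png and label.png
        logits = self.net(img)  # per-pixel class scores
        # Pixel-wise loss between prediction and the drawn polygon mask
        loss = F.cross_entropy(logits, mask)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```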
More advanced users can customize the training, e.g. making it more deterministic, running for only a few epochs, or training on a CUDA GPU with 16-bit precision, like so:
python ctcorenet/ctcoreunet.py --deterministic=True --max_epochs=3 --accelerator=gpu --devices=1 --precision=16
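Under the hood, these flags correspond to PyTorch Lightning Trainer arguments; a rough sketch of the equivalent setup in code, assuming the script forwards its command-line flags straight to the Trainer:

```python
import pytorch_lightning as pl

trainer = pl.Trainer(
    deterministic=True,  # reproducible runs
    max_epochs=3,        # stop after 3 epochs
    accelerator="gpu",   # train on a CUDA GPU
    devices=1,           # use a single device
    precision=16,        # 16-bit mixed precision
)
# trainer.fit(model, datamodule) would then launch the training
```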
More options to customize the training can be found by running:
python ctcorenet/ctcoreunet.py --help
To easily manage the whole machine learning workflow, this project uses the Data Version Control (DVC) library, which records all the commands and input/intermediate/output data assets used. This makes it easy to reproduce the entire pipeline with just two commands:
dvc pull
dvc repro
These commands will pull the input data assets and re-run all the data preparation and model training steps. For more information, see https://dvc.org/doc/start/data-pipelines.
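For reference, pipeline stages are typically registered with dvc stage add (or written directly into dvc.yaml); a hedged sketch of how the training stage might be declared, with the stage name, dependencies, and output chosen purely for illustration:

```bash
# Hypothetical stage declaration; names and paths are illustrative
dvc stage add -n train \
    -d data/train \
    -d ctcorenet/ctcoreunet.py \
    -o model.ckpt \
    python ctcorenet/ctcoreunet.py
```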