Requirements, installation, and contribution guidelines can be found below. Our full usage and API documentation can be found at: corebreakout.readthedocs.io
corebreakout is a Python package built around matterport/Mask_RCNN for the segmentation and depth-alignment of geological core sample images. It provides utilities and an API to enable the workflow depicted in the figure below, as well as a CoreColumn data structure to manage and manipulate the resulting depth-registered image data:
We are currently using this package to enable research on Lithology Prediction of Slabbed Core Photos Using Machine Learning Models, and are working on getting a DOI for the project through the Journal of Open Source Software.
This package was developed on Linux (Ubuntu, PopOS), and has also been tested on OS X. It may work on other platforms, but we make no guarantees.
In addition to Python>=3.6, the packages listed in requirements.txt are required. Notable exceptions to the list are:
- 1.3<=tensorflow-gpu<=1.14 (or possibly just tensorflow)
- mrcnn via submodule: matterport/Mask_RCNN
The TensorFlow requirement is not explicitly listed in requirements.txt due to the ambiguity between tensorflow and tensorflow-gpu in versions <=1.14. The latter is almost certainly required for training new models, although it may be possible to perform inference with saved models on CPU, and use of the CoreColumn data structure does not require a GPU.
Note that TensorFlow GPU capabilities are implemented with CUDA, which requires a supported NVIDIA GPU.
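As a quick sanity check before training, the version bounds above can be tested programmatically. The following is a minimal sketch and not part of corebreakout: the helper name tf_version_supported is hypothetical, and it compares only the major and minor version numbers against the 1.3 to 1.14 range stated above.

```python
def tf_version_supported(version, lower=(1, 3), upper=(1, 14)):
    """Return True if a TensorFlow version string falls within [lower, upper].

    Only the major and minor components are compared, so "1.14.0" is
    treated the same as "1.14". This helper is illustrative and is not
    part of corebreakout.
    """
    parts = version.split(".")
    major_minor = (int(parts[0]), int(parts[1]))
    return lower <= major_minor <= upper


if __name__ == "__main__":
    try:
        import tensorflow as tf  # optional: only needed for the live check
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print(tf.__version__, "supported:", tf_version_supported(tf.__version__))
```

Run inside the target environment, this reports whether the installed TensorFlow falls in the supported range. It does not check GPU or CUDA availability.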
Optionally, jupyter is required to run demo and test notebooks, and pytest is required to run unit tests. Both of these should be manually installed if you plan to modify or contribute to the package source code.
We also provide a script for extraction of top/base depths from core image text using pytesseract. After installing the Tesseract OCR Engine on your machine, you can install the pytesseract package with conda or pip.
$ git clone --recurse-submodules https://github.com/rgmyr/corebreakout.git
$ cd corebreakout
To make use of the provided dataset and model, or to train a new model starting from the pretrained COCO weights, you will need to download the assets.zip folder from the v0.2 Release.
Unzip and place this folder in the root directory of the repository (its contents will be ignored by git; see the .gitignore). If you would like to place it elsewhere, you should modify the paths in corebreakout/defaults.py to point to your preferred location.
The current version of assets/data has JSON annotation files which include an imageData field representing the associated images as strings. For now you can delete this field and reduce the size of the data with scripts/prune_imageData.py:
$ python scripts/prune_imageData.py assets/
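Conceptually, the pruning step amounts to removing one key from each annotation file. The sketch below illustrates the idea under stated assumptions; it is not the source of scripts/prune_imageData.py. The function name prune_image_data is hypothetical, and it assumes the annotations are .json files holding a JSON object with a top-level imageData key.

```python
import json
from pathlib import Path


def prune_image_data(root):
    """Remove the top-level 'imageData' field from every .json file under root.

    Returns the number of files that were modified. Files that do not hold
    a JSON object, or that lack the field, are left untouched.
    """
    n_pruned = 0
    for path in Path(root).rglob("*.json"):
        with open(path) as f:
            annotation = json.load(f)
        if isinstance(annotation, dict) and "imageData" in annotation:
            del annotation["imageData"]
            # Rewrite the file in place without the embedded image string
            with open(path, "w") as f:
                json.dump(annotation, f)
            n_pruned += 1
    return n_pruned
```

Because files without the field are skipped, running the function a second time on the same directory modifies nothing.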
We recommend installing corebreakout and its dependencies in an isolated environment, and further recommend the use of conda. See Conda: Managing environments.
To create a new conda environment called corebreakout-env and activate it:
$ conda create -n corebreakout-env python=3.6 tensorflow-gpu=1.14
$ conda activate corebreakout-env
Note: If you want to try a CPU-only installation, then replace tensorflow-gpu with tensorflow. You may also lower the version number if you are on a machine with CUDA<10.0 (CUDA 10.0 is required for TensorFlow>=1.13). See TensorFlow GPU requirements for more compatibility details.
Then install the rest of the required packages into the environment:
$ conda install --file requirements.txt
Finally, install mrcnn and corebreakout using pip. Develop mode installation (-e) is recommended (but not required) for corebreakout, since many users will want to change some of the default parameters to suit their own data without having to reinstall afterward:
$ pip install ./Mask_RCNN
$ pip install -e .
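Once both installs finish, it can be useful to confirm that the packages resolve in the active environment. The following is a minimal sketch using only the standard library. The helper name find_missing is hypothetical, while tensorflow, mrcnn, and corebreakout are the import names of the packages installed above.

```python
import importlib.util


def find_missing(module_names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(["tensorflow", "mrcnn", "corebreakout"])
    print("Missing packages:", missing if missing else "none")
```

This only checks that each package can be located. It does not import them or verify versions.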
Please refer to our readthedocs page for full documentation!
- Navigate to the repository's issue tab
- Search for existing related issues
- If necessary, create and submit a new issue
- Please see CONTRIBUTING.md and the Code of Conduct for how to contribute to the project
- Most corebreakout functionality not requiring trained model weights can be verified with pytest:
$ cd <root_directory>
$ pytest .
- Model usage via the CoreSegmenter class can be verified by running tests/notebooks/test_inference.ipynb (requires saved model weights)
- Plotting of CoreColumns can be verified by running tests/notebooks/test_plotting.ipynb