
Releases: DigitalGrainSize/SediNet

v1.3

24 Jul 19:43
Pre-release

Major changes to the code and options. This release also marks the version used to re-analyze SandSnap imagery.

  1. fixed a generator error by adding an exception and a print statement to get_data_generator_Nvars_siso_simo, get_data_generator_Nvars_miso_mimo, and get_data_generator_1image
  2. new optimised defaults.py values (image size = 768, batch_size = 4, shallow=True)
  3. added BASE_CAT and BASE_CONT options to defaults.py
  4. added image size and model depth to output file names
  5. added CAT_DENSE_UNITS and CONT_DENSE_UNITS options to defaults.py
  6. added CONT_LOSS and CAT_LOSS options to defaults.py, with defaults from tensorflow_addons (the conda env yml has been updated). The loss for categorical models can now be focal (default) or categorical_crossentropy, and the loss for models of continuous variables can now be pinball (default); see the loss sketch after this list
  7. all global variables are now capitalized for readability
  8. general tidy up of code for readability
  9. fixed a bug in categorical model training (batch labels must be an ndarray, not a list)
  10. fixed a bug in categorical model plotting
  11. added LICENSE
  12. can now take multiple batch sizes and build an ensemble model. This generally results in higher accuracy, but more models mean more model training time
  13. response variables can optionally be scaled using a robust scaler. Use scale=True in a config file to enable scaling (see the scaler sketch after this list)
  14. now checks for an existing estimated-weights path in both the root and res_folder directories and, if present, uses it. This can be used to add batch-size combinations sequentially
  15. optionally, training imagery is now augmented if DO_AUG=True (in the defaults file). This doubles the training set by augmenting each image (a random horizontal shift, followed by a vertical flip); see the augmentation sketch after this list
  16. file names are shorter (the number of variables is given, rather than each variable being listed)
  17. improved/rewritten README
  18. a more consistent and descriptive file-naming convention
  19. simpler structure: train scripts only do training (no prediction). Use the predict scripts to evaluate both the train and test sets. This also allows defaulting to the CPU for prediction, to avoid the OOM errors that are more likely when using the GPU
  20. no separate config file for prediction. One config file for both training and prediction
  21. fixed many bugs, including one that was using 3-band greyscale imagery (doh!)
  22. uses an exponentially decreasing learning rate scheduler rather than an adaptive one (because validation loss can be erratic)
  23. uses depthwise separable 2D convolutions rather than traditional 2D convolutions
  24. variables in defaults.py are based on consideration of accuracy across many datasets, both included and not included as part of the SediNet package
  25. categorical models also have the shallow=True/False option
  26. predict_all.sh is a fully worked example of using the framework to predict on all continuous datasets
  27. simplified the conda env yml, and added a requirements.txt
  28. added sedinet_predict1image.py for making predictions on a single image
  29. added sedinet_predictfolder.py for making predictions on a folder of images
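
A minimal sketch of the new loss options in item 6, assuming the tensorflow_addons package from the updated conda env. The CAT_LOSS and CONT_LOSS names mirror the defaults.py options above; the exact wiring inside SediNet's model-building code may differ.

```python
import tensorflow_addons as tfa

CAT_LOSS = "focal"     # or "categorical_crossentropy"
CONT_LOSS = "pinball"  # default for continuous variables

# Map the option strings to loss objects (illustrative, not SediNet's exact code)
if CAT_LOSS == "focal":
    cat_loss = tfa.losses.SigmoidFocalCrossEntropy()
else:
    cat_loss = "categorical_crossentropy"

if CONT_LOSS == "pinball":
    cont_loss = tfa.losses.PinballLoss(tau=0.5)
else:
    cont_loss = "mse"

# These would then be passed to model.compile(loss=...)
```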
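
A minimal sketch of the robust scaling in item 13, using scikit-learn's RobustScaler, which matches the description (SediNet's internal handling may differ). The response values here are hypothetical; predictions are inverse-transformed back to original units.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

y_train = np.array([[0.2], [0.5], [0.9], [12.0]])  # hypothetical response values

scaler = RobustScaler()                    # centers/scales using median and IQR
y_scaled = scaler.fit_transform(y_train)   # train the model on y_scaled ...

y_pred_scaled = y_scaled[:2]               # stand-in for model predictions
y_pred = scaler.inverse_transform(y_pred_scaled)  # back to original units
```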
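
And a minimal sketch of the DO_AUG augmentation in item 15: each training image gets a random horizontal shift followed by a vertical flip, doubling the training set. The shift magnitude is an assumption for illustration, not SediNet's actual value.

```python
import numpy as np

def augment_image(im, rng=np.random.default_rng()):
    """Random horizontal shift of an (H, W, C) image, then a vertical flip."""
    max_shift = im.shape[1] // 10             # assumed: up to 10% of image width
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(im, shift, axis=1)      # horizontal shift (wraps around)
    return np.flipud(shifted)                 # vertical flip

def augment_training_set(images, labels):
    """Return the original set plus one augmented copy of each image."""
    aug = np.stack([augment_image(im) for im in images])
    return np.concatenate([images, aug]), np.concatenate([labels, labels])
```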

The most important changes are (see the sketches after this list):

  • depthwise separable convolution layers
  • exponentially decreasing learning rate scheduler
  • pinball loss for continuous variables
  • focal loss and "shallow=False" for categorical variables
  • training and prediction using model ensembles trained with up to 4 different batch sizes
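
A hedged sketch of the first two changes above: depthwise separable 2D convolutions (Keras SeparableConv2D) in place of traditional Conv2D layers, and an exponentially decreasing learning rate rather than an adaptive, validation-loss-driven one. The layer sizes are illustrative, not SediNet's actual architecture; only IM_SIZE comes from the new defaults.

```python
import tensorflow as tf

IM_SIZE = 768  # new default image size from defaults.py

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(IM_SIZE, IM_SIZE, 3)),
    tf.keras.layers.SeparableConv2D(32, 3, activation="relu"),  # depthwise separable
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.SeparableConv2D(64, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),  # one continuous output
])

# Exponential decay does not react to an erratic validation loss
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=1000, decay_rate=0.9)
model.compile(optimizer=tf.keras.optimizers.Adam(lr_schedule), loss="mse")
```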
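
And a hedged sketch of the last change, ensembling over batch sizes: one model is trained per batch size and their predictions are averaged. The tiny model and synthetic data are placeholders, not SediNet's pipeline.

```python
import numpy as np
import tensorflow as tf

BATCH_SIZES = [4, 8]  # up to 4 different sizes, per the notes

X_train = np.random.rand(32, 16).astype("float32")  # synthetic stand-in data
y_train = np.random.rand(32, 1).astype("float32")

def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(16,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

models = []
for bs in BATCH_SIZES:
    m = build_model()
    m.compile(optimizer="adam", loss="mse")
    m.fit(X_train, y_train, batch_size=bs, epochs=2, verbose=0)
    models.append(m)

# Ensemble prediction: mean of the per-model predictions
y_pred = np.mean([m.predict(X_train, verbose=0) for m in models], axis=0)
```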

v1.0: initial submission of the SediNet paper to a journal

30 Sep 20:26
270cbc8