
Streaming Conv 2D and Batchnorm 2D implementations #89

Merged
merged 21 commits on Mar 1, 2023

Conversation

@DamRsn DamRsn (Collaborator) commented Feb 26, 2023

Hi Jatin,

I have implemented two new layers, 2D streaming convolution and 2D batch normalization, for a project I am currently working on for the Neural Audio Plugin Competition. These layers are designed specifically for neural networks that process frequency-domain data or similar. When these layers are used, the model runs on every frame (each containing a certain number of frequency bins) rather than on every individual sample.

At the moment, I have only implemented these layers using the Eigen backend.

Please find below some details of the implementation. Let me know what you think!

  • Implementation of streaming 2D convolutions with Eigen (templated and non-templated)
    • In TensorFlow, the dimensions should follow N x T x F x C (Batch, Time, Feature, Channel)
    • Support for "same" and "valid" padding (works the same way as TensorFlow)
    • Support for stride on feature axis
    • Support for dilation on time axis

The chosen streaming approach performs all of the calculations that involve a frame as soon as that frame is acquired, and stores the results in the layer's states. These states are then used to produce the correct output (see the sketch after this list).

  • Implementation of Batchnorm 2D with Eigen. It works very similarly to batchnorm1d, but each channel now consists of more than one value, while the number of weights stays the same. It only works when axis=-1 in TF/Keras.

  • Minimal changes to original API:

    • in_size and out_size are still used, but are set to num_filters_in * num_features_in and num_filters_out * num_features_out respectively.
  • Test coverage for the new layers, similar to the other tests, with some minor modifications to deal with frame-alignment issues across the different kinds of padding.
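
To make the streaming approach more concrete, here is a minimal, hypothetical sketch (names, layout, and the causal handling are illustrative only, not the actual RTNeural code) of how each incoming frame's contribution can be accumulated into a ring of states whose length equals the receptive field, 1 + (kernel_size_time - 1) * dilation_rate:

#include <algorithm>
#include <vector>

// Hypothetical sketch of the streaming idea: each incoming frame updates the
// partial outputs of the frames that depend on it, and the output of the
// current frame (now complete) is emitted.
struct StreamingConv2DSketch
{
    int kernel_size_time;
    int dilation_rate;
    int receptive_field; // 1 + (kernel_size_time - 1) * dilation_rate
    int state_pos = 0;
    std::vector<std::vector<float>> state; // one partial output per frame in the receptive field

    StreamingConv2DSketch(int kt, int dilation, int out_size)
        : kernel_size_time(kt)
        , dilation_rate(dilation)
        , receptive_field(1 + (kt - 1) * dilation)
        , state(receptive_field, std::vector<float>(out_size, 0.0f))
    {
    }

    std::vector<float> forwardFrame(const std::vector<float>& frame)
    {
        // Accumulate this frame's contribution into every output that needs it.
        for(int k = 0; k < kernel_size_time; ++k)
        {
            auto& partial = state[(state_pos + k * dilation_rate) % receptive_field];
            accumulateTimeTap(frame, k, partial); // hypothetical helper: apply the k-th time tap
        }

        // The output for the current frame is complete; emit it and recycle its slot.
        std::vector<float> out = state[state_pos];
        std::fill(state[state_pos].begin(), state[state_pos].end(), 0.0f);
        state_pos = (state_pos + 1) % receptive_field;
        return out;
    }

    // Stand-in for the feature-axis convolution with the k-th kernel tap.
    void accumulateTimeTap(const std::vector<float>&, int, std::vector<float>&) { }
};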

DamRsn and others added 17 commits January 29, 2023 12:01
… yet. Static conv2d model in progress. Python utilities to convert conv2d layers from tf to rtneural. Implemented test for conv2d model but probably still some errors
…however. Lots of comment for debugging to be removed
…t. Modified json to read the layer size as the product of the last two dimensions
… option in Conv1DStateless. Test to adjust to deal with same padding in time dimension to align outputs
…rules. Non-templated seems to work for all combination of stride dilation & padding
Conv2D + Batchnorm 2D Implementation
Fix conflicts with base repo
@codecov-commenter commented Feb 26, 2023

Codecov Report

Attention: Patch coverage is 91.20879% with 32 lines in your changes missing coverage. Please review.

Project coverage is 95.55%. Comparing base (c521243) to head (5ece924).
Report is 49 commits behind head on main.

Files with missing lines Patch % Lines
RTNeural/model_loader.h 72.94% 23 Missing ⚠️
RTNeural/batchnorm/batchnorm2d_eigen.h 86.95% 3 Missing ⚠️
RTNeural/conv1d_stateless/conv1d_stateless_eigen.h 96.15% 2 Missing ⚠️
...Neural/conv1d_stateless/conv1d_stateless_eigen.tpp 94.28% 2 Missing ⚠️
RTNeural/ModelT.h 96.15% 1 Missing ⚠️
RTNeural/conv2d/conv2d_eigen.h 97.67% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
- Coverage   96.06%   95.55%   -0.52%     
==========================================
  Files          36       42       +6     
  Lines        2872     3237     +365     
==========================================
+ Hits         2759     3093     +334     
- Misses        113      144      +31     


@jatinchowdhury18 jatinchowdhury18 (Owner) commented Feb 26, 2023

Thanks for the PR, at a glance these changes look great! Since there are a lot of changes here, it's going to take me a minute to review them thoroughly, but I just wanted to let you know that I am looking at it.

Supporting only an Eigen implementation is fine for now, but in order to stay compatible with the other backends, it would probably make sense to add some #if RTNEURAL_USE_EIGEN guards to the "main" headers (e.g. batchnorm2d.h), and for the relevant tests. That should also help the CI jobs pass.
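
For reference, a minimal sketch of what such a guard could look like in one of the "main" headers (the exact file layout and include name are assumptions based on the existing backend structure):

// Sketch of a guarded batchnorm2d.h:
#pragma once

// Only the Eigen backend implements the 2D layers for now; guarding the include
// keeps the other backends compiling until they gain their own implementations.
#if RTNEURAL_USE_EIGEN
#include "batchnorm2d_eigen.h"
#endif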

…ltiple definition of TestType: only set in util_tests.hpp now
@DamRsn DamRsn (Collaborator, Author) commented Feb 27, 2023

Cool thanks!

I've just added a few #if RTNEURAL_USE_EIGEN guards so that the other backends can compile.

@jatinchowdhury18 jatinchowdhury18 (Owner) left a comment

Cool, I had a couple of little comments, but things are looking great overall!

One thing that might be worth thinking about is the best way to document the input/output data layout for the 2D layers. The test case that you wrote is a good example, but I'm just trying to think of how to make it most obvious to the user. Probably the right answer is something like:

  • A section in the README or other text file describing the input/output layout
  • Have the doc-strings for the relevant layers point to the README/text file
  • Have a dedicated example for 2D networks

Anyway, the documentation stuff can be cleaned up after merging this PR if that's more convenient, just to avoid blocking ourselves.

Looks like the CI jobs are mostly passing as well... For the "Auto-Format" job, running clang-tidy on your changes would probably fix that one, but no worries if you aren't able to. The code coverage stuff looks okay as well (most of the new code paths that aren't being tested are some of the safety checks in model_loader.h), so don't worry that those are still showing up red.

RTNeural/Model.h (resolved)
#if RTNEURAL_USE_EIGEN
else if(type == "batchnorm2d")
{
model->getNextInSize();
jatinchowdhury18 (Owner):

It doesn't seem like this call is being used?

DamRsn (Collaborator, Author):

Removed. I thought this function was incrementing something inside the model, but it is const.

const auto num_filters_out = l.at("num_filters_out").back().get<int>();
const bool valid_pad = l.at("padding").get<std::string>() == "valid";

model->getNextInSize();
jatinchowdhury18 (Owner):

Also possibly unused?

DamRsn (Collaborator, Author):

Removed as well

{
if(type != "batchnorm2d")
{
debug_print("Wrong layer type! Expected: BatchNorm", debug);
jatinchowdhury18 (Owner):

Should be "Expected: BatchNorm2D"?

DamRsn (Collaborator, Author):

Changed in all debug prints of this function.

RTNeural/CMakeLists.txt (resolved)
@@ -21,6 +21,8 @@ class BatchNorm1DLayer final : public Layer<T>
{
inVec = Eigen::Map<const Eigen::Matrix<T, Eigen::Dynamic, 1>, RTNeuralEigenAlignment>(
input, Layer<T>::in_size, 1);

// TODO: Why not make outVec a map of out? Would avoid the copy
jatinchowdhury18 (Owner):

Yeah, that's a good call! Feel free to change this and get rid of outVec if you like, or I can get to it later.

@DamRsn DamRsn (Collaborator, Author) Mar 1, 2023:

Done. Also removed inVec, using a local map there as well.
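
As a small illustration of the change being discussed (a simplified stand-in, not the exact RTNeural code), mapping the raw pointers with Eigen::Map lets the result be written straight into the caller's buffer, so no intermediate member vectors are needed:

#include <Eigen/Dense>

// Simplified batch-norm-style forward pass: map the input and output buffers
// directly instead of copying through member vectors (inVec / outVec).
template <typename T>
void forwardMapped(const T* input, T* output, int size,
                   const Eigen::Vector<T, Eigen::Dynamic>& running_mean,
                   const Eigen::Vector<T, Eigen::Dynamic>& multiplier)
{
    const auto in_map = Eigen::Map<const Eigen::Vector<T, Eigen::Dynamic>>(input, size);
    auto out_map = Eigen::Map<Eigen::Vector<T, Eigen::Dynamic>>(output, size);

    // (x - mean) * scale, with gamma / sqrt(var + eps) folded into `multiplier`.
    out_map = (in_map - running_mean).cwiseProduct(multiplier);
}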

, dilation_rate(in_dilation_rate)
, stride(in_stride)
, num_features_out(Conv1DStateless<T>::computeNumFeaturesOut(in_num_features_in, in_kernel_size_feature, in_stride, in_valid_pad))
, receptive_field(1 + (in_kernel_size_time - 1) * in_dilation_rate)
jatinchowdhury18 (Owner):

Would it be possible to put a link or something showing where this number comes from? That was super helpful in Conv1DStateless.

DamRsn (Collaborator, Author):

Done
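
For reference, this is the usual receptive field of a dilated kernel along the time axis: with kernel_size_time = 5 and dilation_rate = 2 (the settings of the first Conv2D layer in the test model quoted further down), one output frame depends on 1 + (5 - 1) * 2 = 9 input frames.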

{
bias = Eigen::Vector<T, Eigen::Dynamic>::Zero(num_filters_out);

for(int i = 0; i < receptive_field; i++)
jatinchowdhury18 (Owner):

Would it be possible to use state.resize() here instead?

DamRsn (Collaborator, Author):

Yes, done.

template <typename T>
void Conv2D<T>::setWeights(const std::vector<std::vector<std::vector<std::vector<T>>>>& inWeights)
{
conv1dLayers.resize(kernel_size_time, Conv1DStateless<T>(num_filters_in, num_features_in, num_filters_out, kernel_size_feature, stride, valid_pad));
jatinchowdhury18 (Owner):

Would it be possible to move this to the layer constructor? It's not really a strict requirement, but I've been trying to keep all of the layer's memory allocation restricted to just the constructor.

DamRsn (Collaborator, Author):

Done. Also did it for Conv1DStateless.
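
A self-contained sketch of the requested pattern (stand-in types and names, not the actual classes): the per-tap sub-layers are allocated once in the constructor, and setWeights only fills in values, so all of the layer's memory allocation stays in the constructor:

#include <vector>

// Hypothetical, simplified stand-ins that only illustrate the allocation pattern.
struct Conv1DStatelessSketch
{
    Conv1DStatelessSketch(int filters_in, int features_in, int filters_out,
                          int kernel_size_feature, int stride, bool valid_pad) { }
};

struct Conv2DSketch
{
    std::vector<Conv1DStatelessSketch> conv1dLayers;

    // All allocation happens here, once.
    Conv2DSketch(int filters_in, int features_in, int filters_out, int kernel_size_time,
                 int kernel_size_feature, int stride, bool valid_pad)
        : conv1dLayers(kernel_size_time,
                       Conv1DStatelessSketch(filters_in, features_in, filters_out,
                                             kernel_size_feature, stride, valid_pad))
    {
    }

    // setWeights only copies values into the pre-allocated layers; no resize here.
    void setWeights(const std::vector<std::vector<std::vector<std::vector<float>>>>& inWeights)
    {
        // ... copy weights into conv1dLayers ...
    }
};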

model.add(keras.layers.InputLayer(input_shape=(n_frames, n_features, 1)))
model.add(keras.layers.Conv2D(2, (5, 5), dilation_rate=(2, 1), strides=(1, 1), padding='valid', kernel_initializer='random_normal', bias_initializer='random_normal', activation='relu'))
model.add(keras.layers.Conv2D(3, (4, 3), dilation_rate=(1, 1), strides=(1, 2), padding='same', kernel_initializer='random_normal', bias_initializer='random_normal'))
model.add(keras.layers.BatchNormalization(momentum=0.0, epsilon=0.01, beta_initializer='random_normal', gamma_initializer='glorot_uniform', moving_mean_initializer="random_normal", moving_variance_initializer="ones"))
jatinchowdhury18 (Owner):

Would it be possible to add another 2D BatchNorm layer, just to test the "affine" code path?

@DamRsn DamRsn (Collaborator, Author) Mar 1, 2023:

I think it's the other way around: here the affine == true path is tested, and it's the non-affine path that still needs a test.

In python/conv.py, the first BN is affine and the second is not (center=False, scale=False).

DamRsn (Collaborator, Author):

Done

{
model.reset();

TestType input alignas(RTNEURAL_DEFAULT_ALIGNMENT)[num_features_in];
@DamRsn DamRsn (Collaborator, Author) Feb 27, 2023:

Use a std::vector here instead, since num_features_in is not const (this seems to be the reason for the compilation error on the Windows CI).

@DamRsn DamRsn (Collaborator, Author) Mar 1, 2023:

I now have only one processModel function, used by both the templated and non-templated models. num_features_in is now a template argument of the function, so we can allocate input with the correct alignment on the stack. There is no need for a std::vector with a custom aligned allocator if the size is known at compile time.
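
A rough sketch of that pattern (the helper below is a stand-in rather than the actual test code, and the fallback define is only so the snippet stands alone):

#include <algorithm>
#include <vector>

#ifndef RTNEURAL_DEFAULT_ALIGNMENT
#define RTNEURAL_DEFAULT_ALIGNMENT 16 // provided by RTNeural in the real tests
#endif

// num_features_in is a compile-time template argument, so the per-frame input
// buffer can be a stack array with explicit alignment; no aligned allocator needed.
template <int num_features_in, typename ModelType>
void processModel(ModelType& model, const std::vector<std::vector<float>>& frames,
                  std::vector<float>& outputs)
{
    model.reset();

    alignas(RTNEURAL_DEFAULT_ALIGNMENT) float input[num_features_in];

    for(const auto& frame : frames)
    {
        std::copy(frame.begin(), frame.end(), input);
        outputs.push_back(model.forward(input)); // forward() assumed to return one value here
    }
}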

…nged batchnorm to batchnorm2d in debug_print, added files to cmakelists, remove copy in batchnorm_eigen by using a map, all memory allocations in constructors, use resize instead of push back. Single processModel function in conv2d_model.h
@jatinchowdhury18 jatinchowdhury18 merged commit edc90c7 into jatinchowdhury18:main Mar 1, 2023