Streaming Conv 2D and Batchnorm 2D implementations #89
Conversation
… yet. Static conv2d model in progress. Python utilities to convert conv2d layers from tf to rtneural. Implemented test for conv2d model, but probably still some errors
…however. Lots of comments for debugging to be removed
…t. Modified json to read the layer size as the product of the last two dimensions
… option in Conv1DStateless. Test to adjust to deal with "same" padding in the time dimension to align outputs
…rules. Non-templated seems to work for all combinations of stride, dilation & padding
Conv2D + Batchnorm 2D Implementation
Fix conflicts with base repo
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files

@@            Coverage Diff             @@
##             main      #89      +/-   ##
==========================================
- Coverage   96.06%   95.55%   -0.52%
==========================================
  Files          36       42       +6
  Lines        2872     3237     +365
==========================================
+ Hits         2759     3093     +334
- Misses        113      144      +31

☔ View full report in Codecov by Sentry.
Thanks for the PR, at a glance these changes look great! Since there's a lot of changes here, it's going to take me a minute to review them thoroughly, but I just wanted to let you know that I am looking at it. Supporting only an Eigen implementation is fine for now, but in order to stay compatible with the other backends, it would probably make sense to add some #if RTNEURAL_USE_EIGEN guards.
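(For reference, a minimal sketch of the kind of backend guard being suggested here; the function and layer handling are illustrative only, not the actual model_loader.h code.)

```cpp
#include <iostream>
#include <string>

// Eigen-only layers are compiled only when the Eigen backend is enabled,
// so the other backends (XSIMD, STL) still build without the 2D layers.
void createLayer(const std::string& type)
{
#if RTNEURAL_USE_EIGEN
    if(type == "batchnorm2d")
    {
        std::cout << "Creating Eigen-only BatchNorm2D layer\n";
        return;
    }
#endif
    std::cout << "Layer type not available for this backend: " << type << '\n';
}
```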
…ltiple definition of TestType: only set in util_tests.hpp now
Cool, thanks! I've just added a few #if RTNEURAL_USE_EIGEN guards.
Cool, I had a couple of little comments, but things are looking great overall!
One thing that might be worth thinking about is the best way to document the input/output data layout for the 2D layers. The test case that you wrote is a good example, but I'm just trying to think of how to make it most obvious to the user. Probably the right answer is something like:
- A section in the README or other text file describing the input/output layout
- Have the doc-strings for the relevant layers point to the README/text file
- Have a dedicated example for 2D networks
Anyway, the documentation stuff can be cleaned up after merging this PR if that's more convenient, just to avoid blocking ourselves.
Looks like the CI jobs are mostly passing as well... For the "Auto-Format" job, running clang-tidy on your changes would probably fix that one, but no worries if you aren't able to. The code coverage stuff looks okay as well (most of the new code paths that aren't being tested are some of the safety checks in model_loader.h), so don't worry that those are still showing up red.
RTNeural/model_loader.h
Outdated
#if RTNEURAL_USE_EIGEN
    else if(type == "batchnorm2d")
    {
        model->getNextInSize();
It doesn't seem like this call is being used?
Removed. I thought this function was incrementing something inside of model, but no, it is const.
RTNeural/model_loader.h
Outdated
const auto num_filters_out = l.at("num_filters_out").back().get<int>();
const bool valid_pad = l.at("padding").get<std::string>() == "valid";

model->getNextInSize();
Also possibly unused?
Removed as well
RTNeural/model_loader.h
Outdated
{
    if(type != "batchnorm2d")
    {
        debug_print("Wrong layer type! Expected: BatchNorm", debug);
Should be "Expected: BatchNorm2D"?
Changed in all debug prints of this function.
RTNeural/batchnorm/batchnorm_eigen.h
Outdated
@@ -21,6 +21,8 @@ class BatchNorm1DLayer final : public Layer<T>
    {
        inVec = Eigen::Map<const Eigen::Matrix<T, Eigen::Dynamic, 1>, RTNeuralEigenAlignment>(
            input, Layer<T>::in_size, 1);

        // TODO: Why not make outVec a map of out? Would avoid the copy
Yeah, that's a good call! Feel free to change this and get rid of outVec if you like, or I can get to it later.
Done. Also removed inVec, making a local map as well.
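(For illustration, a minimal sketch, not the actual batchnorm_eigen.h code, of mapping the raw input and output pointers directly so nothing is copied into member vectors; the parameter names are assumptions.)

```cpp
#include <Eigen/Dense>

// Apply a per-element scale and offset by writing through a map of the
// output pointer; no intermediate member vectors are needed.
template <typename T>
void applyScaleOffset(const T* input, T* output, int size, const T* gamma, const T* beta)
{
    using Vec = Eigen::Matrix<T, Eigen::Dynamic, 1>;
    Eigen::Map<const Vec> inVec(input, size);
    Eigen::Map<const Vec> g(gamma, size);
    Eigen::Map<const Vec> b(beta, size);
    Eigen::Map<Vec> outVec(output, size);

    // Results land directly in `output` via the map.
    outVec = inVec.cwiseProduct(g) + b;
}
```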
RTNeural/conv2d/conv2d_eigen.tpp
Outdated
, dilation_rate(in_dilation_rate)
, stride(in_stride)
, num_features_out(Conv1DStateless<T>::computeNumFeaturesOut(in_num_features_in, in_kernel_size_feature, in_stride, in_valid_pad))
, receptive_field(1 + (in_kernel_size_time - 1) * in_dilation_rate)
Would it be possible to put a link or something showing where this number comes from? That was super helpful in Conv1DStateless.
Done
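(As a quick reference for where that number comes from: it is the standard receptive-field relation for a dilated kernel along the time axis; the values below are illustrative only.)

```cpp
// Effective extent of a dilated kernel along the time axis:
//   receptive_field = 1 + (kernel_size_time - 1) * dilation_rate
constexpr int receptiveField(int kernel_size_time, int dilation_rate)
{
    return 1 + (kernel_size_time - 1) * dilation_rate;
}

static_assert(receptiveField(5, 1) == 5, "no dilation: kernel spans 5 frames");
static_assert(receptiveField(5, 2) == 9, "dilation 2: kernel spans 9 frames");
```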
RTNeural/conv2d/conv2d_eigen.tpp
Outdated
{
    bias = Eigen::Vector<T, Eigen::Dynamic>::Zero(num_filters_out);

    for(int i = 0; i < receptive_field; i++)
Would it be possible to use state.resize() here instead?
Yes, done.
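(A tiny illustration of the suggestion; the container and frame types here are assumptions, not the actual Conv2D members.)

```cpp
#include <Eigen/Dense>
#include <vector>

int main()
{
    const int receptive_field = 9;
    const int frame_size = 16;

    // Size the whole state in one call, each entry zero-initialised,
    // instead of push_back-ing inside a loop.
    std::vector<Eigen::VectorXf> state;
    state.resize(receptive_field, Eigen::VectorXf::Zero(frame_size));
    return 0;
}
```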
RTNeural/conv2d/conv2d_eigen.tpp
Outdated
template <typename T>
void Conv2D<T>::setWeights(const std::vector<std::vector<std::vector<std::vector<T>>>>& inWeights)
{
    conv1dLayers.resize(kernel_size_time, Conv1DStateless<T>(num_filters_in, num_features_in, num_filters_out, kernel_size_feature, stride, valid_pad));
Would it be possible to move this to the layer constructor? It's not really a strict requirement, but I've been trying to keep all of the layer's memory allocation restricted to just the constructor.
Done. Also did it for conv1dstateless.
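(A hypothetical sketch of the pattern being asked for, not the actual Conv2D code: all storage is sized in the constructor, so setWeights only copies values and never allocates.)

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

class ExampleLayer
{
public:
    // All memory is allocated once here...
    ExampleLayer(int kernel_size_time, int weights_per_kernel)
        : kernels(kernel_size_time, std::vector<float>(weights_per_kernel, 0.0f))
    {
    }

    // ...so setting the weights later only copies into pre-sized storage.
    void setWeights(const std::vector<std::vector<float>>& inWeights)
    {
        for(std::size_t k = 0; k < kernels.size(); ++k)
            std::copy(inWeights[k].begin(), inWeights[k].end(), kernels[k].begin());
    }

private:
    std::vector<std::vector<float>> kernels;
};
```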
model.add(keras.layers.InputLayer(input_shape=(n_frames, n_features, 1)))
model.add(keras.layers.Conv2D(2, (5, 5), dilation_rate=(2, 1), strides=(1, 1), padding='valid', kernel_initializer='random_normal', bias_initializer='random_normal', activation='relu'))
model.add(keras.layers.Conv2D(3, (4, 3), dilation_rate=(1, 1), strides=(1, 2), padding='same', kernel_initializer='random_normal', bias_initializer='random_normal'))
model.add(keras.layers.BatchNormalization(momentum=0.0, epsilon=0.01, beta_initializer='random_normal', gamma_initializer='glorot_uniform', moving_mean_initializer="random_normal", moving_variance_initializer="ones"))
Would it be possible to add another 2D BatchNorm layer, just to test the "affine" code path?
I think it's the other way around. Here the affine == true path is tested, and it's the non-affine path that needs to be tested. In python/conv.py, the first BN is affine and the second is not (center=False, scale=False).
Done
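(For context, a small sketch of the difference between the two code paths discussed here; the epsilon value and parameter names are assumptions, not RTNeural's.)

```cpp
#include <cmath>

// Inference-time batch normalization of a single value.
float batchNorm(float x, float mean, float var, float gamma, float beta, bool affine)
{
    const float eps = 0.01f;
    const float normalized = (x - mean) / std::sqrt(var + eps);
    // The affine path applies the learned scale (gamma) and shift (beta);
    // the non-affine path (Keras center=False, scale=False) skips them.
    return affine ? gamma * normalized + beta : normalized;
}
```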
tests/conv2d_model.h
Outdated
{
    model.reset();

    TestType input alignas(RTNEURAL_DEFAULT_ALIGNMENT)[num_features_in];
Use a std::vector here instead, as num_features_in is not const (this seems to be the reason for the compilation error on the Windows CI).
I now only have one processModel function, used by both the templated and non-templated models. num_features_in is now a template argument of the function, so we can allocate input with the correct alignment on the stack. There's no need to use a std::vector with a custom aligned allocator if the size is known at compile time.
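(A small sketch of the idea described above; the function and buffer names are illustrative, not the actual test code. RTNEURAL_DEFAULT_ALIGNMENT is given a fallback value purely so the sketch compiles standalone.)

```cpp
#include <cstdio>

#ifndef RTNEURAL_DEFAULT_ALIGNMENT
#define RTNEURAL_DEFAULT_ALIGNMENT 16 // assumption for this sketch only
#endif

// With the size as a compile-time template parameter, the input buffer can
// be a stack array with guaranteed alignment; a runtime size would instead
// require heap storage such as a std::vector with an aligned allocator.
template <typename T, int num_features_in>
void processFrame()
{
    alignas(RTNEURAL_DEFAULT_ALIGNMENT) T input[num_features_in] {};
    std::printf("frame of %d features at %p\n", num_features_in, static_cast<void*>(input));
}

int main()
{
    processFrame<float, 32>(); // size known at compile time
    return 0;
}
```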
…nged batchnorm to batchnorm2d in debug_print, added files to cmakelists, removed copy in batchnorm_eigen by using a map, all memory allocations in constructors, use resize instead of push_back. Single processModel function in conv2d_model.h
Hi Jatin,
I have implemented two new layers, 2D streaming convolution and 2D batch normalization, for a project that I am currently working on for the Neural Audio Plugin Competition. These layers are designed specifically for neural networks that process frequency-domain data or similar. When those layers are used, the model runs on every frame, which contains a certain number of frequency bins, instead of on every individual sample.
At the moment, I have only implemented these layers using the Eigen backend.
Please find below some details of the implementation. Let me know what you think!
The chosen streaming approach works by performing all the necessary calculations involving a frame as soon as it is acquired, and then storing the results in the layer's states. These states are then used to produce the correct output.
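(To make the idea concrete, here is a simplified sketch of such a streaming scheme, reduced to a single channel with one scalar tap per time step. It illustrates the approach only and is not the actual RTNeural Conv2D implementation.)

```cpp
#include <vector>
#include <Eigen/Dense>

// Simplified streaming convolution over the time axis. Each new frame
// contributes to the next kernel.size() outputs; the oldest state slot has
// then received all of its contributions and is emitted for this frame.
class StreamingTimeConv
{
public:
    StreamingTimeConv(int frame_size, std::vector<float> time_kernel)
        : kernel(std::move(time_kernel))
        , state(kernel.size(), Eigen::VectorXf::Zero(frame_size))
    {
    }

    Eigen::VectorXf process(const Eigen::VectorXf& frame)
    {
        // Add this frame's contribution to every pending output slot.
        for(std::size_t k = 0; k < kernel.size(); ++k)
            state[(pos + k) % state.size()] += kernel[k] * frame;

        // The slot at `pos` is now complete: emit it and recycle it.
        Eigen::VectorXf out = state[pos];
        state[pos].setZero();
        pos = (pos + 1) % state.size();
        return out;
    }

private:
    std::vector<float> kernel;          // one scalar tap per time step (simplified)
    std::vector<Eigen::VectorXf> state; // pending partial outputs, one per tap
    std::size_t pos = 0;
};
```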
Implementation of Batchnorm 2D with Eigen. It works very similarly to batchnorm1d, but now each channel consists of more than one value, while the number of weights stays the same. It only works when axis=-1 in TF/Keras.

Minimal changes to the original API: in_size and out_size are still used, but are set to num_filters_in * num_features_in and num_filters_out * num_features_out respectively (see the small sketch at the end of this description).

Test coverage for the new layers, similar to the other tests, with some minor modifications to deal with frame-alignment issues with the different kinds of padding.
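(A tiny sketch of the size bookkeeping described above, with made-up dimensions: in_size covers the whole flattened frame, while the batchnorm parameters remain one set per channel.)

```cpp
#include <cstdio>

int main()
{
    // Illustrative dimensions only.
    const int num_filters_in = 3;  // channels
    const int num_features_in = 8; // e.g. frequency bins per channel

    // The flattened frame size reported through the existing in_size API:
    const int in_size = num_filters_in * num_features_in; // 24

    // BatchNorm2D still keeps one (gamma, beta, mean, var) set per channel,
    // shared across that channel's features, so the weight count matches
    // the 1D case for the same number of channels.
    const int num_weight_sets = num_filters_in; // 3, not 24

    std::printf("in_size = %d, weight sets = %d\n", in_size, num_weight_sets);
    return 0;
}
```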