Add gain peak #200

Open
wants to merge 16 commits into
base: main

Conversation

atamazian
Contributor

Attempt to implement #181.

@iver56 iver56 self-requested a review June 10, 2022 07:33
@iver56
Owner

iver56 commented Jun 10, 2022

This needs default parameters, so I can do my_transform = GainPeak() and start using it just like that

Also, I tried to illustrate some examples of gain curves (green) that I imagine this class could generate and apply in the future (when it is in a more finished state):

[image: example gain curves (green) drawn over the original audio (blue)]

One thing to note from this illustration is that I imagine that it could use an offset, so that the peak can actually appear anywhere, even before the start or after the end of the audio
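A minimal sketch of what default parameters plus a random offset could look like. The class name `GainPeakSketch`, its parameter names and the default ranges are all hypothetical and are not taken from this pull request. The peak position is expressed as a fraction of the audio duration, so values below 0 or above 1 put the peak before the start or after the end of the audio.

```python
import random


class GainPeakSketch:
    """Hypothetical stand-in, not the class from this pull request."""

    def __init__(
        self,
        min_gain_diff_db=3.0,      # assumed default, smallest peak height in dB
        max_gain_diff_db=12.0,     # assumed default, largest peak height in dB
        min_peak_position=-0.25,   # fraction of duration; < 0 means before the start
        max_peak_position=1.25,    # fraction of duration; > 1 means after the end
    ):
        if min_gain_diff_db > max_gain_diff_db:
            raise ValueError("min_gain_diff_db must not exceed max_gain_diff_db")
        if min_peak_position > max_peak_position:
            raise ValueError("min_peak_position must not exceed max_peak_position")
        self.min_gain_diff_db = min_gain_diff_db
        self.max_gain_diff_db = max_gain_diff_db
        self.min_peak_position = min_peak_position
        self.max_peak_position = max_peak_position

    def sample_parameters(self, rng=random):
        """Draw one random peak height and one random peak position."""
        return {
            "gain_diff_db": rng.uniform(self.min_gain_diff_db, self.max_gain_diff_db),
            "peak_position": rng.uniform(self.min_peak_position, self.max_peak_position),
        }


# Works without any arguments, as requested above.
my_transform = GainPeakSketch()
params = my_transform.sample_parameters()
```

Because every argument has a default, `GainPeakSketch()` can be used immediately, and widening the position range beyond 0 to 1 is what lets the peak land outside the audio.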

By the way, I'm curious, what was your motivation for starting to make this class? Are you working on some audio AI application that could benefit from this transform?

@atamazian
Contributor Author

Judging by your pics, you suggest using min_gain = 1 in all cases, since the amplified signal (green) has the same level as the original (blue) at the beginning. As for the offset: should it be selected randomly?

As for motivation: I think this class can be beneficial for some audio AI tasks, like recognition of badly received signals (for example, a received signal that has some louder parts and some quieter parts).

@iver56
Owner

iver56 commented Jun 10, 2022

Yeah, take the gain curve with a grain of salt, especially the min gain part :P I guess min gain isn't the most important feature here; what matters more is the difference between min gain and max gain.

@iver56
Owner

iver56 commented Jun 10, 2022

Yeah, the offset can be selected randomly

@atamazian
Contributor Author

OK, I'll modify the code accordingly.

@iver56
Owner

iver56 commented Jun 10, 2022

Sweet, thanks 👍

@atamazian
Contributor Author

As for offset, I think that one can additionally use Shift augmentation if necessary. What do you think?

@iver56
Owner

iver56 commented Jun 25, 2022

Good question! Let's consider the possibilities:

  1. Shift and then GainPeak. In this case the peak will always be centered, so that's maybe not so realistic, and an ML model may learn that gain peaks always occur in the middle.
  2. GainPeak and then Shift. In this case the gain peak will always be applied to the same part of the input audio (the part that was in the middle before the shift). This is maybe also not so realistic, as in reality it's more like any part of the audio can have a peak.

So in order to offer more variability in the results, I still think it's better that the gain peak itself gets an offset, instead of relying on Shift
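The two orderings can be illustrated with a small sketch. It assumes a gain curve whose maximum is fixed at the middle sample, as in the two scenarios above, and it uses a circular shift; `centered_gain` and `roll` are hypothetical helpers written for this illustration, not functions from the library. Instead of audio, the code tracks source sample indices so that it is visible which input sample receives the peak gain.

```python
def centered_gain(n, peak_gain=2.0):
    """Triangular gain curve with its maximum at the middle sample (needs n >= 3)."""
    mid = n // 2
    return [1.0 + (peak_gain - 1.0) * (1.0 - abs(i - mid) / mid) for i in range(n)]


def roll(values, shift):
    """Circular shift to the right by `shift` samples."""
    shift %= len(values)
    return values[-shift:] + values[:-shift] if shift else list(values)


n = 9
source_index = list(range(n))  # stands in for the audio: value i is input sample i
gain = centered_gain(n)


def shift_then_gain(shift):
    """Return (output position of the peak, input sample that gets the peak)."""
    shifted_sources = roll(source_index, shift)
    out_pos = max(range(n), key=gain.__getitem__)  # gain curve stays fixed in time
    return out_pos, shifted_sources[out_pos]


def gain_then_shift(shift):
    """Return (output position of the peak, input sample that gets the peak)."""
    shifted_sources = roll(source_index, shift)
    shifted_gain = roll(gain, shift)  # the gain travels along with the samples
    out_pos = max(range(n), key=shifted_gain.__getitem__)
    return out_pos, shifted_sources[out_pos]


for s in range(n):
    print(s, shift_then_gain(s), gain_then_shift(s))
```

Under these assumptions, the first ordering always puts the peak at the same output position, and the second always boosts the same input sample. If the peak position inside the transform is itself random, neither restriction applies.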

@atamazian
Contributor Author

  1. Actually, that won't be the case, since the peak position in GainPeak is chosen randomly (it's not always the center)
  2. Also won't be the case (see 1)

@atamazian
Contributor Author

Can you reply? I'll modify my code more if necessary.

@iver56
Owner

iver56 commented Mar 6, 2023

Hi :) Thanks for the effort so far, and thanks for the patience. I've been in crunch mode at work for the past few days.

I saw that there are at least these two things that I would like to check before merge:

  1. Peak position (offset) - it can be anywhere, even before or after the time of the sound
  2. Difference between min gain and max gain

I'll try to give it a look at some point in the coming days

@atamazian
Contributor Author

Nice! And thanks for your reply.

@iver56
Owner

iver56 commented Mar 13, 2023

Thanks for the patience. This is still on my TODO list 🙈

@iver56
Owner

iver56 commented Mar 14, 2023

I had another look at this now, and gathered some thoughts. Here's what I imagine would be good to have in/for this transform:

  • The peak (offset) can be anywhere, even before start or after end of the given input audio, as I illustrated above
  • Various fade curves, not just linear. Fades in "decibel domain" are probably a good idea.
  • Variable fade-in and fade-out durations, independent of each other
  • Gain diff in decibels
  • It should be able to have regions of the sound with constant gain, e.g. before the fade in, at the peak (hold the peak gain for some time) or after the fade out
  • Demo (demo.py)
  • Documentation

I imagine that it would also be nice if it could invert its behavior, so it'll essentially be a gain dip instead of a gain peak. Maybe this can be achieved with the gain diff parameter.
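Several of the items above can be sketched as a single gain envelope. This is only an illustration under stated assumptions: "fade in the decibel domain" is interpreted here as linear interpolation of the gain in dB, and the function names `gain_envelope_db` and `apply_gain` and all parameter names are hypothetical. The sketch uses plain Python lists to stay dependency-free; a real transform would more likely work on NumPy arrays.

```python
def gain_envelope_db(num_samples, sample_rate, peak_time, gain_diff_db,
                     fade_in=0.5, hold=0.1, fade_out=0.5):
    """Per-sample gain in dB: 0 dB, fade in, hold at gain_diff_db, fade out, 0 dB.

    All times are in seconds. peak_time is the center of the hold region and
    may be negative or larger than the audio duration. A negative
    gain_diff_db turns the peak into a dip.
    """
    hold_start = peak_time - hold / 2
    hold_end = peak_time + hold / 2
    envelope = []
    for i in range(num_samples):
        t = i / sample_rate
        if t < hold_start - fade_in or t > hold_end + fade_out:
            frac = 0.0  # constant-gain region outside the fades
        elif t < hold_start:
            frac = (t - (hold_start - fade_in)) / fade_in  # fade in
        elif t <= hold_end:
            frac = 1.0  # hold the peak gain
        else:
            frac = 1.0 - (t - hold_end) / fade_out  # fade out
        envelope.append(gain_diff_db * frac)  # linear in dB
    return envelope


def apply_gain(samples, envelope_db):
    """Convert each dB value to an amplitude factor and scale the samples."""
    return [s * 10 ** (g / 20) for s, g in zip(samples, envelope_db)]


# Two seconds of constant "audio" at 10 Hz, with a +6 dB peak centered at 1.0 s.
audio = [1.0] * 20
envelope = gain_envelope_db(len(audio), 10, peak_time=1.0, gain_diff_db=6.0,
                            fade_in=0.5, hold=0.2, fade_out=0.5)
louder = apply_gain(audio, envelope)
```

Fade-in and fade-out durations are independent, a zero-length fade gives a hard step, and passing a negative `gain_diff_db` produces the gain dip mentioned above. Other fade shapes could be obtained by replacing the linear `frac` with a different curve.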

At the moment there's quite a gap between what I imagine/desire and what is coded thus far in this pull request. Here's what I propose: I make a GainPeak some time later. When I start doing that, I'll close this PR and make my own branch where I cherry-pick your commits into it, so you'll be listed as a contributor, and then I'll try to implement all the features I suggested above.

@atamazian
Contributor Author

OK, let's do as you suggest.

Repository owner deleted a comment from Soumya6Tiwari Feb 23, 2024