Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Envelope triggered NaN #3408

Closed
zonkmachine opened this issue Mar 6, 2017 · 13 comments
Closed

Envelope triggered NaN #3408

zonkmachine opened this issue Mar 6, 2017 · 13 comments
Assignees
Milestone

Comments

@zonkmachine
Copy link
Member

zonkmachine commented Mar 6, 2017

Edit: Brief intro

  • We produce NaN.
  • NaN is triggered in instrument envelopes. Specifically by changing release time while playing.
  • This may only be during the first loop.
  • NaN introduced here. 9e1cdd0
  • An uninitialized variable?

To reproduce.

  • Load test project:envelopeNaN.mmp.zip

  • Play until loop. Sound disappears? If yes, proof. If no, reload project and try again.


Verbose intro

The sound went out, NaN style, when I tweaked the decay on an instrument while the song was playing. I set out to see if I could pinpoint the source and I eventually managed to find a way to reproduce it reliably enought to be able to bisect this and the bug seem to have been introduced in 9e1cdd0

commit 9e1cdd0441bca0697f582be3b85d8e2d4c152c72
Author: Vesa <contact.diizy@nbl.fi>
Date:   Sun Aug 3 14:49:45 2014 +0300

    Start work on replacing/removing global locks

The test project is a simple melody with a bitinvader synt having the volume envelopes release knob controlled by an automation track. There is a reverb on the end for detection purposes as having a reverb seem to enormously aggravate the issue. The NaN is however introduced before the effect chain.
If you play this track the sound may go silent somewhere during the first loop but if it makes it to the second you're safe and need to reopen the project and repeat 'play' until it eventually crashes. For me it will go silent at least one time out of four. Since the issue seem to clear itself up after the first run, I suspect an uninitialised variable somewhere but so far I haven't managed to fix this. We divide with zero in the envelope which I set out to 'fix' in issue #3381 but fixing that doesn't affect this issue at all. I have only seen this with the release button and I think only on the first loop.
If you introduce an assert( ! isnanf( some buffer ) ); in the input of the FX chain you will see that the NaN is detected on either:

  • The first loop cycle on a project launched from the command line (or automatically from last session).
  • Immediately on pressing play on a project launched from the gui.

This is related to #1048 and #3313 . I think this one should be fixed first as it's introduced first in the sound chain.

Test project: envelopeNaN.mmp.zip

@zonkmachine zonkmachine added this to the 1.2.0 milestone Mar 6, 2017
@michaelgregorius
Copy link
Contributor

@zonkmachine, I am not able to reproduce the problem. This might be the case because I don't have the zita reverb.

However, you could try to trap the critical floating point exceptions to find out the location where the NaN is introduced. Here's some example code that I put together from various source on the net:

#include <fenv.h> // For feenableexcept
#include <execinfo.h> // For backtrace and backtrace_symbols_fd
#include <unistd.h> // For STDERR_FILENO
#include <csignal> // To register the signal handler

#include <iostream>


void signalHandler( int signum ) {
   
    std::cout << "Interrupt signal (" << signum << ") received.\n";
    
    // Get a back trace
    void *array[10];
    size_t size;

    // get void*'s for all entries on the stack
    size = backtrace(array, 10);
    
    backtrace_symbols_fd(array, size, STDERR_FILENO);

    // cleanup and close up stuff here  
    // terminate program  

    exit(signum);
}

void triggerFloatingPointException()
{
    float a = 1., b = 0.;
    float c = a/b;
}

int main(void) {
    // Enable exceptions for certain floating point results
    feenableexcept(FE_INVALID   | 
                   FE_DIVBYZERO | 
                   FE_OVERFLOW  | 
                   FE_UNDERFLOW);
    
    // Install the trap handler
    // register signal SIGINT and signal handler  
    signal(SIGFPE, signalHandler);
    
    // Trigger an exception
    triggerFloatingPointException();
    
    return 0;
}

I think it should be clear from the comments what happens where. If I understand correctly the settings for the exceptions are inherited by threads so you should enable them very early in LMMS' main function.

Also for the example above the backtrace only contained addresses on my machine but I guess it might suffice to simply set a breakpoint in the signal handler (haven't tried if this really works though).

Good luck! :)

@zonkmachine
Copy link
Member Author

Thanks for the debugging code!

@zonkmachine, I am not able to reproduce the problem. This might be the case because I don't have the zita reverb.

Oh, zita. A mistake that one. I used it as the other ones changed when bisecting. I'll change that to one we ship. Calf reverb and ReverbSC works fine for this! ;)

@zonkmachine
Copy link
Member Author

Oh, zita. A mistake that one ... I'll change that to one we ship.

Fixed!

@michaelgregorius
Copy link
Contributor

@zonkmachine, thanks for the updated file! I am now also able to reproduce the problem. I have applied the code that I have posted above to the main function of LMMS and got some results. As a first step I needed to fix some other places in the code that triggered floating point exceptions in a rather hackish way. However, this was only done to get to the heart of the problem.

The main problem is caused at the end of InstrumentSoundShaping::processAudioBuffer in the code of the following if clause:

if( m_envLfoParameters[Volume]->isUsed() )
{
	float volBuffer [frames];
	m_envLfoParameters[Volume]->fillLevel( volBuffer, envTotalFrames, envReleaseBegin, frames );

	for( fpp_t frame = 0; frame < frames; ++frame )
	{
		float vol_level = volBuffer[frame];
		vol_level = vol_level * vol_level;
		buffer[frame][0] = vol_level * buffer[frame][0];
		buffer[frame][1] = vol_level * buffer[frame][1];
	}
}

After the volume buffer volBuffer is allocated it is filled by the call to fillLevel in the next line. However, that method fills the complete buffer with the value -2.71678e+37 which in turn leads to problems when both these values are squared in the following line:

vol_level = vol_level * vol_level;

Multiplying these two large negative numbers will likely result in a large positive number that exceeds the maximum for float thus leading to NaNs.

At first I thought it was a problem that volBuffer in not initialized after allocation but the problem still persists if it is filled with zeroes right after the allocation. So the culprit should be m_envLfoParameters.

@zonkmachine
Copy link
Member Author

@zonkmachine, thanks for the updated file! I am now also able to reproduce the problem. I have applied the code that I have posted above to the main function of LMMS and got some results.

Me to, it's brutal! :D
Leads me directly to #3381 and the divide by 0 issue for a starter.

@zonkmachine
Copy link
Member Author

@michaelgregorius If you have time and energy to take this on I suggest you assign this issue to yourself and then I should probably close #3381. I put #3381 on the RC3 roadmap https://github.com/LMMS/lmms/projects/2 but it looks like it's going to take far more work than just assign some new minimum values to the knobs.

@michaelgregorius
Copy link
Contributor

@zonkmachine, I think this is a potentially never ending story because I assume that all of the plugins contained with LMMS' might also cause floating point exceptions. So while everything might be nice with one project and set of plugins another project with another set might cause problems.

What do you think about adding a CMake option that conditionally adds and compiles the code above so that LMMS developers just have to flip a switch if they want to hunt down potential problems? It could perhaps also be used to hunt down the elusive problem described in #1048.

@zonkmachine
Copy link
Member Author

@zonkmachine, I think this is a potentially never ending story because I assume that all of the plugins contained with LMMS' might also cause floating point exceptions. So while everything might be nice with one project and set of plugins another project with another set might cause problems.

Yes, but I'm focused on the envelope/soundshaping part here.

What do you think about adding a CMake option that conditionally adds and compiles the code above so that LMMS developers just have to flip a switch if they want to hunt down potential problems? It could perhaps also be used to hunt down the elusive problem described in #1048.

Yes to this. I had a similar plan with 28b30d6

@zonkmachine zonkmachine self-assigned this Mar 15, 2017
@PaulBatchelor
Copy link
Contributor

@zonkmachine hows this one coming? Is there anything I can do here to help out?

@zonkmachine
Copy link
Member Author

hows this one coming?

This is a complex one and we've fixed some related issues:
#3428
#3425

I think this can be bumped to 1.3 but I will probably not have time to work on this.

@zonkmachine zonkmachine removed their assignment Apr 11, 2017
@zonkmachine zonkmachine modified the milestones: 1.3.0, 1.2.0 Apr 12, 2017
@michaelgregorius
Copy link
Contributor

I have added the pull request #3687 to provide an option to debug floating point exception. Please refer to the pull request for more details.

@zonkmachine zonkmachine self-assigned this Jul 19, 2017
@PhysSong PhysSong modified the milestones: 1.2.0, 1.3.0 Aug 10, 2017
@PhysSong PhysSong self-assigned this Aug 10, 2017
@PhysSong
Copy link
Member

The problem is: void EnvelopeAndLfoParameters::updateSampleVars() have been thread unsafe since 9e1cdd0. It can be fixed by introducing a mutex to guarantee the thread safety.
Bring it back to 1.2.0 because the producing of NaN is a serious bug.

@PhysSong
Copy link
Member

Should be fixed via #3761.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants