Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug when multithreading with macOS since 0.3.4 #23

Closed
Natooz opened this issue Jan 30, 2024 · 14 comments
Closed

Bug when multithreading with macOS since 0.3.4 #23

Natooz opened this issue Jan 30, 2024 · 14 comments

Comments

@Natooz
Copy link
Contributor

Natooz commented Jan 30, 2024

Hi there 👋,

I spotted a bug causing Python to crash (Fatal Python error: Segmentation fault) when running multithreading operations (pytest-xdist) with the latest version of symusic on macOS (intel).

You can find logs of the issue here Natooz/MidiTok#142
I don't know what can be causing the issue exactly, I didn't check what changes have been made to the code.
It seems to only happen on "What a Fool Believes.mid".
I was able to reproduce locally, even though only one worker out of 8 crashed.

@Yikai-Liao
Copy link
Owner

Sorry, but I don't find the log there. It seems all the tests have been passed.

@Natooz
Copy link
Contributor Author

Natooz commented Jan 31, 2024

Are the logs of the failed actions inaccessible ? https://github.com/Natooz/MidiTok/actions/runs/7700019008/job/20983922442
EDIT: my bad, I thought they stayed accessible.
Here are the the raw logs

Otherwise you can reproduce it easily on an intel Mac:

# Make sure to have loaded a virtual python environment

git clone https://github.com/Natooz/MidiTok && cd MidiTok
pip install ".[tests]"
pytest -n logical -v tests/test_tokenize.py::test_multitrack_midi_to_tokens_to_midi

@Yikai-Liao
Copy link
Owner

Yikai-Liao commented Jan 31, 2024

Well, I did find something wrong with multiprocessing. The __setstate__ function was wrong. I have fixed it, but it is not related to this issue (Nothing changed after fixing it).

I didn't change the parsing midi code from 0.3.3 to 0.3.4. But checkout to v0.3.3 just works. And I have tried installing symusic from source, which would give the same error.

And this bug only appears on Mac. Quite a strange bug.

@Natooz
Copy link
Contributor Author

Natooz commented Jan 31, 2024

I first though of an error with pytest itself or pytest-xdist as mentioned in other issues pytest-dev/pytest#3216 but even with pytest v7.4.4 the issue occurs

@Yikai-Liao
Copy link
Owner

I will test the changes from 0.3.3 to 0.3.4 one by one. Fortunately, there are only a few changes

@Yikai-Liao
Copy link
Owner

@Natooz Could you offer me a minimal reproduce example without pytest? The error infomation in pytest is kind of ambiguous.

@Natooz
Copy link
Contributor Author

Natooz commented Jan 31, 2024

Do you have an idea of how this could be achieved? The issue is that it only occurs with xdist, even with one worker.

Here are a few tests I just ran (no code modification):

pytest -n 8 -v tests/test_tokenize.py::test_multitrack_midi_to_tokens_to_midi

-n 1 --> 3 fails
-n 2 --> 6 fails
-n 8 --> 3 fail
-n 20 --> 2 fails

I tested while the test method only loads the MIDI, no error occur, so this doesn't comes from here.

It fails (1 worker out of 8) when the test method is simply:

def _test_tokenize(
    midi_path: str | Path,
    tok_params_set: tuple[str, dict[str, Any]],
) -> None:
    r"""
    Tokenize a MIDI file, decode it back and make sure it is identical to the ogi.

    The decoded MIDI should be identical to the original one after downsampling, and
    potentially notes deduplication.

    :param midi_path: path to the MIDI file to test.
    :param tok_params_set: tokenizer and its parameters to run.
    """
    # Reads the MIDI and add pedal messages to make sure there are some
    try:
        midi = Score(Path(midi_path))
    except MIDI_LOADING_EXCEPTION as e:
        pytest.skip(f"Error when loading {midi_path.name}: {e}")

    # Creates the tokenizer
    tokenization, params = tok_params_set
    tokenizer: miditok.MIDITokenizer = getattr(miditok, tokenization)(
        tokenizer_config=miditok.TokenizerConfig(**params)
    )
    str(tokenizer)  # shouldn't fail
    tokenizer.preprocess_midi(midi)

So the issue occurs when preprocessing the MIDI, that is converting its attributes (notes, pitch bends...) to SOA and SOA back to attributes which are then assigned to the Scoreobject.

**EDIT: the issue occurs without preprocessing the MIDI, but the tokenizer has to be loaded for it to happen.

I'm investigating further

@Natooz
Copy link
Contributor Author

Natooz commented Jan 31, 2024

With the test method above, the tests pass on my machine with certain numbers of workers: 2 workers the tests pass, 4 workers sometimes one fails sometimes all pass, 3 workers (3 fails), 5 workers sometimes one or two fail...

No issue with only one worker.
Could this come from a concurrent operation performed by two workers?

@Natooz
Copy link
Contributor Author

Natooz commented Jan 31, 2024

Minimal failing test method:

def _test_tokenize(
    midi_path: str | Path,
    tok_params_set: tuple[str, dict[str, Any]],
) -> None:
    r"""
    Tokenize a MIDI file, decode it back and make sure it is identical to the ogi.

    The decoded MIDI should be identical to the original one after downsampling, and
    potentially notes deduplication.

    :param midi_path: path to the MIDI file to test.
    :param tok_params_set: tokenizer and its parameters to run.
    """
    # Reads the MIDI and add pedal messages to make sure there are some
    try:
        midi = Score(Path(midi_path))
    except MIDI_LOADING_EXCEPTION as e:
        pytest.skip(f"Error when loading {midi_path.name}: {e}")

    # Creates the tokenizer
    tokenization, params = tok_params_set
    tokenizer: miditok.MIDITokenizer = getattr(miditok, tokenization)(
        tokenizer_config=miditok.TokenizerConfig(**params)
    )

@Yikai-Liao
Copy link
Owner

What does the worker means here?

If they just load and process midi files separately, I think they won't operate the same part of memory. Symusic doesn't use any global variables and static variables. Further, if they do operate the same part of memory, the bug would also appears on windows and linux.

And if the worker means a subprocess, which is used more common in python, there won't be any problem about "thread safe"🤔

Also, I don't change the code of SoA from 0.3.3 to 0.3.4.

@Yikai-Liao
Copy link
Owner

Do you have an idea of how this could be achieved? The issue is that it only occurs with xdist, even with one worker

Do you mean that this bug can't be reproduced in python's built-in multiprocessing and thread library?

I don't know the implementation of xdist.

@Natooz
Copy link
Contributor Author

Natooz commented Jan 31, 2024

I'm not sure of how xdist handles multiprocessing
Indeed this is a strange bug as it occurs only on macOS.

@Yikai-Liao
Copy link
Owner

Yikai-Liao commented Feb 2, 2024

After several tests, I did find what caused the problem. In minimidi, I tried to copy 4 bytes at one time instead of one by one, for a slight performance improvement. Well, maybe this caused some problems about accessing invalid memory.

Now this problem have been fixed. You could also try it locally to make sure there are no other bugs left. @Natooz

@Natooz
Copy link
Contributor Author

Natooz commented Feb 2, 2024

No more errors! 🙌
Thank you for help on this!

@Natooz Natooz closed this as completed Feb 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants