Design of program tokens in MIDILike tokenization #71

Closed
caenopy opened this issue Sep 2, 2023 · 4 comments
Labels
enhancement (New feature or request), stale (Inactive since 30 days or more)

Comments

Contributor

caenopy commented Sep 2, 2023

Hi! This is related to the changes in #68 on the handling of program tokens with MIDILike tokenization.

With use_programs enabled, miditok currently adds a program token before each pitch token. To be consistent with previous MIDI-like tokenizations in Oore et al. and Huang et al., program tokens should only be added where program change messages occur in the original MIDI file. The resulting encoding is much more compact and practical: for a single-track MIDI file with a single specified instrument, the tokenized stream would contain only one program token rather than one per note.
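To make the size difference concrete, here is a minimal sketch comparing the two layouts for a single-instrument track. The token strings (Program_0, NoteOn_60, ...) and both function names are illustrative assumptions, not miditok's exact vocabulary or API, and time-shift and velocity tokens are left out for brevity.

```python
# Illustrative token strings only; these are not miditok's exact vocabulary.
def per_note_layout(pitches, program):
    """Emit a Program token before every note (the current behaviour described above)."""
    tokens = []
    for pitch in pitches:
        tokens += [f"Program_{program}", f"NoteOn_{pitch}", f"NoteOff_{pitch}"]
    return tokens


def single_program_layout(pitches, program):
    """Emit one Program token up front, as proposed for a single-instrument track."""
    tokens = [f"Program_{program}"]
    for pitch in pitches:
        tokens += [f"NoteOn_{pitch}", f"NoteOff_{pitch}"]
    return tokens


pitches = [60, 62, 64, 65]
print(len(per_note_layout(pitches, 0)))        # 12 tokens, 4 of them Program
print(len(single_program_layout(pitches, 0)))  # 9 tokens, 1 of them Program
```

The per-note layout grows by one extra token per note, while the proposed layout adds a constant single token for this kind of file.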

Natooz added the enhancement (New feature or request) label Sep 4, 2023
Owner

Natooz commented Sep 4, 2023

Hi,

This should be doable. But as we rely on MIDIToolkit, we do not keep track of the "real" program change messages. Instead, we can add a ProgramChange token whenever a note is played by an instrument other than the last one, which should give the same result anyway. I can work on this tomorrow.

Also, Program tokens for each note significantly increase the sequence length, but this is greatly mitigated by BPE, to the point where it isn't a big issue.
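The heuristic above (emit a ProgramChange token only when a note comes from a different instrument than the previous note) could be sketched roughly as follows. This is an assumption-laden illustration, not the code that was merged: the Note tuple, the add_program_changes function, the sort order, and the token strings are all hypothetical.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    start: int    # onset time in ticks
    pitch: int
    program: int  # program of the track the note belongs to


def add_program_changes(notes: List[Note]) -> List[str]:
    """Merge notes from all tracks and emit ProgramChange only when the program differs."""
    tokens = []
    last_program = None  # no program announced yet
    # Assumed ordering: by onset, then program, then pitch.
    for note in sorted(notes, key=lambda n: (n.start, n.program, n.pitch)):
        if note.program != last_program:
            tokens.append(f"ProgramChange_{note.program}")
            last_program = note.program
        tokens.append(f"NoteOn_{note.pitch}")
    return tokens


notes = [Note(0, 60, 0), Note(0, 36, 33), Note(480, 62, 0), Note(960, 64, 0)]
print(add_program_changes(notes))
# ['ProgramChange_0', 'NoteOn_60', 'ProgramChange_33', 'NoteOn_36',
#  'ProgramChange_0', 'NoteOn_62', 'NoteOn_64']
```

Note that when several instruments play interleaved notes, this scheme still emits a ProgramChange token at nearly every switch, so the saving is largest for files with long runs of notes from one instrument.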

Owner

Natooz commented Sep 7, 2023

Hi 👋,
The feature has been added to main.
I'll wait some time before releasing the next version, hoping to catch other issues or feature requests, as this one doesn't fix anything major.
You can still try it by installing from git.


github-actions bot commented Oct 8, 2023

This issue is stale because it has been open for 30 days with no activity.

github-actions bot added the stale (Inactive since 30 days or more) label Oct 8, 2023
github-actions bot commented

This issue was closed because it has been inactive for 14 days since being marked as stale.


2 participants