strip output extension data/metadata #85
I modified the code to do what I wanted above. Perhaps these could be made configurable, since different extensions add different metadata inside cells and push their own entries into the global metadata entry?
Do you want to submit a PR with your changes? Making those configurable would be an option. However, I'm leaning more and more towards a whitelist model where we specify which fields to keep and drop all the rest.
Since my post I have added even more fields to the to-remove list, and I'm sure people using other extensions will have others, so a configurable list of fields to remove won't be very efficient maintenance-wise. It would also be surprising if a field that wasn't stripped before suddenly got stripped after someone updates nbstripout. So yes, I totally agree with you that a whitelist model is a much better way to go. Thank you.

It would also be nice to switch to a faster JSON parser: with many notebooks under git, nbstripout used as a git filter now noticeably slows things down (git status, git prompt, etc.). That is a totally different issue; I mention it here because I'm experimenting with a much faster way of doing it with jq, except that it comes with a bunch of dependencies.

Update: I have recoded nbstripout's core functionality using jq and it's about 10-20 times faster now, and I no longer experience slowdowns when working with git.
Perfect. I will close this issue then, as your whitelist ticket references this one already. Thank you.
Would it be possible to have an option to strip out extension data as well?
Different users use different extensions in different ways, and currently nbstripout doesn't strip that data, causing conflicts and/or unnecessary commit noise.
Examples:
a) if I use ToC extension, but others don't, I end up with:
an extension-specific top-level "toc" entry. Perhaps there is a known set of Jupyter core top-level entries that can be kept, with all the extension top-level entries removed during stripout?
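To make that suggestion concrete, here is a minimal sketch in Python. It assumes the notebook has been loaded into a plain dict with the standard `json` module. The `KEEP_TOP_LEVEL` set and the `strip_notebook_metadata` helper are hypothetical names for illustration, not part of nbstripout, and treating `kernelspec` and `language_info` as the entries worth keeping is only an assumption.

```python
import json

# Hypothetical allowlist of notebook-level metadata keys to keep (an assumption,
# not an official list of Jupyter core entries).
KEEP_TOP_LEVEL = {"kernelspec", "language_info"}


def strip_notebook_metadata(nb, keep=KEEP_TOP_LEVEL):
    """Return a shallow copy of the notebook dict whose top-level metadata
    contains only the allowlisted keys; everything else is dropped."""
    stripped = dict(nb)
    meta = nb.get("metadata", {})
    stripped["metadata"] = {k: v for k, v in meta.items() if k in keep}
    return stripped


# Tiny illustrative notebook with an extension-added "toc" entry.
nb = json.loads(
    '{"cells": [], "nbformat": 4, "nbformat_minor": 2,'
    ' "metadata": {"kernelspec": {"name": "python3"}, "toc": {}}}'
)
result = strip_notebook_metadata(nb)
print(sorted(result["metadata"]))
```

With an allowlist like this, entries from extensions nobody anticipated are dropped by default, without having to list each one.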
b) If I use Collapse Headers extension - it adds a bunch of metadata noise:
Perhaps there could be an option to force metadata to be set to {}, which would solve the problem for this particular extension?
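As a rough sketch of what such an option might do, assuming the metadata in question is the per-cell `metadata` dict and the notebook is held as a plain Python dict: the `clear_cell_metadata` helper is a hypothetical name, not an existing nbstripout function, and the `heading_collapsed` key in the sample data is only illustrative.

```python
import copy


def clear_cell_metadata(nb):
    """Return a deep copy of the notebook dict with every cell's metadata
    reset to an empty dict. The input notebook is left untouched."""
    cleaned = copy.deepcopy(nb)
    for cell in cleaned.get("cells", []):
        cell["metadata"] = {}
    return cleaned


# Illustrative notebook with one cell carrying extension-added metadata.
nb = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {"heading_collapsed": True},
            "source": ["# Title"],
        }
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}
cleaned = clear_cell_metadata(nb)
print(cleaned["cells"][0]["metadata"])
```

Resetting everything is blunt: it would also drop any cell metadata a user wants to keep, which is one reason a list of fields to keep may be the better long-term design.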
Thank you!