fix: RNTuple bug-fixing offset array concatenation, adding filter_name #1285
Thank you for catching this @giedrius2020! I think the issue could be solved in a simpler way by adjusting some of the logic in here. I'll think back to what my reasoning was for this piece.
uproot5/src/uproot/models/RNTuple.py Lines 511 to 521 in 012df94
Could you try checking whether changing the last few lines to the following fixes the issue?
    if delta:
        if tracker > 0:
            res[tracker] -= cumsum
        cumsum += numpy.sum(res[tracker:tracker_end])
    tracker = tracker_end
|
Perhaps it would be best to separate the API change to keys() from the bug fix? Otherwise, perhaps beyond this PR, is there some test in uproot intended to check the physics correctness of reading back files? |
@ariostas, I tried your suggestion (while disabling my changes) and it did not help. The result is still the same; the arrays for all clusters look like this:
While it should be like this:
|
Ah okay, thanks
Since the RNTuple stuff isn't stable yet, I wouldn't worry much about separating things
I've added some tests, but the issue was that I didn't have a test file that was large enough to have multiple clusters, so that's why I hadn't seen this bug |
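To illustrate why a bug like this would only show up in files with more than one cluster, here is a minimal sketch. It assumes, based on the PR title, that each cluster stores offsets counted from the start of that cluster, so they have to be shifted when clusters are joined. The function names and sample data are hypothetical, plain Python lists stand in for NumPy arrays, and this is not the code in uproot.

```python
def concat_naive(cluster_offsets):
    """Join per-cluster offset lists without any adjustment (the buggy behaviour)."""
    return [o for cluster in cluster_offsets for o in cluster]


def concat_shifted(cluster_offsets):
    """Join per-cluster offset lists, shifting each cluster by the last
    global offset of the clusters before it (assumed fix, for illustration)."""
    result = []
    shift = 0
    for cluster in cluster_offsets:
        result.extend(o + shift for o in cluster)
        if cluster:
            # The next cluster's offsets continue from the last value so far.
            shift = result[-1]
    return result


# Hypothetical data: three clusters, each with offsets restarting from zero.
clusters = [[2, 3, 6], [1, 4], [2, 2, 5]]

naive = concat_naive(clusters)    # not monotonic across cluster boundaries
fixed = concat_shifted(clusters)  # monotonic, usable as a global offset array

print(naive)
print(fixed)
```

With a single cluster both functions give the same result, which is consistent with the bug going unnoticed in a small test file.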
Can one control the cluster size when writing an RNTuple? Indeed, I guess one doesn't want a few hundred MB of test files around |
I'm not sure, but that would be nice. I'll look into it |
Co-authored-by: Andres Rios Tascon <ariostas@gmail.com>
Thank you @giedrius2020, this looks great. The test that is failing is unrelated.
I'm going to try to generate a small RNTuple with multiple clusters so that we can add a test for this.
I added a new test file in scikit-hep/scikit-hep-testdata#159. Let's add a new test once that gets merged and released. |
That worked! I'm updating again to be sure that we get all of the 3.13 tests. |
Now this should include the pyodide-build, and I'll enable auto-merge because I'm sure it will pass. |
@all-contributors please add @giedrius2020 for code |
I've put up a pull request to add @giedrius2020! 🎉 |
* Fixed __len__ method
* Added a few more useful methods
* Use the right number in arrays method
* Updated to match spec and did some cleanup
* Fixed order of extra type information
* Extract column summary flags
* style: pre-commit fixes
* Fixed conflict resolution
* Fixed test
* Switched to using enums
* Fixed RNTuple anchor
* Updated locator types
* Removed UserMetadata envelope
* Started implementing new real32 types
* Updated sharded cluster to match spec
* Removed user metadata from footer
* Fixed ClusterSummaryReader
* Fix cascadentuple
* Introduced RNTupleField class
* Added test for #1285
* Fixed test
* Fix test (attempt 2)
* Finalized first version of RNTupleField
* Added tests for RNTupleField
* Implemented iterate method

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Changes made in RNTuple model:
def keys()