Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: uproot 5.2.0 integration testing #930

Closed
wants to merge 16 commits into from
Closed

Conversation

lgray
Copy link
Collaborator

@lgray lgray commented Nov 17, 2023

No description provided.

@lgray
Copy link
Collaborator Author

lgray commented Nov 17, 2023

@lobis FYI - looks like things are OK here

@lobis
Copy link
Contributor

lobis commented Nov 17, 2023

@lobis FYI - looks like things are OK here

Glad to hear!

I see you are manually specifing the fsspec handler, which is okay. However this will be the default behaviour from 5.2.0 on so I'm not sure if it's worth it. For instance (I think) explicitly specifying the handler as it's done here would cause it to break when trying to open file-like objects since this is the only case that cannot be handled by fsspec. The logic automatically selecting the appropiate file handler wouldn't be triggered.

Perhaps it would make sense to just update the uproot version to the latest pre-release (currently 5.2.0rc1). I guess in this case you wouldn't be able to merge into main since this would involve generating a new release but atleast it would test the CI (in the pre-release there are other fsspec related tests that would be nice to test).

@lgray
Copy link
Collaborator Author

lgray commented Nov 17, 2023

@lobis that makes sense. I'll just leave this PR open so we can try different RCs as they come out, and make sure everything's good over here before you cut the full release.

@lgray
Copy link
Collaborator Author

lgray commented Nov 17, 2023

This uproot rc appears to break one of the nanoevents schemas (physlite):
https://github.com/CoffeaTeam/coffea/actions/runs/6909362267/job/18800554615?pr=930#step:10:1563

Is this to be expected?

@lgray
Copy link
Collaborator Author

lgray commented Nov 17, 2023

@jpivarski adding you here since something funny happened.

@jpivarski
Copy link
Member

Swapping out the low-level Source (how to supply bytes for later interpretation) shouldn't lead to an error like

FAILED tests/test_nanoevents_physlite.py::test_electron_track_links[False] - ValueError: grouping branches like 'MuonSpectrometerTrackParticlesAuxDyn.truthParticleLink' should not be read directly; instead read the subbranches:

    'MuonSpectrometerTrackParticlesAuxDyn.truthParticleLink.m_persKey', 'MuonSpectrometerTrackParticlesAuxDyn.truthParticleLink.m_persIndex'

which has something to do with the interpretation. Oddly enough, this is the data structure that Seth and I were working on with AwkwardForth last week, but that couldn't be relevant because it's in a draft PR. Also, AwkwardForth would only come up if this branch were not split. The error message above is saying that they're split in this file.

I'm also surprised by the text of the error message: it's saying that a branch containing no data, but with subbranches that do contain data, should not be read directly. What happens in plain Uproot if you attempt to do that is that Uproot reads all of the subbranches and puts them together in an Awkward RecordArray, NumPy dict, or Pandas DataFrame, depending on library (whatever the "grouping" data structure is). Maybe NanoEvents doesn't like that and leads users away from it, which is fine.

But how could something have changed? That part doesn't make sense. @lobis, is this something that you can reproduce if you run Coffea's tests, toggling between Uproot's main and main-fsspec git branches? If so, can you narrow down on which PR or commit caused it? It's enough of a problem that changes intended to only affect the lowest layer—getting bytes from remote sources—seem to be affecting something higher-level, such as this interpretation. That suggests a leaked abstraction somewhere.

@lobis
Copy link
Contributor

lobis commented Nov 20, 2023

@lobis, is this something that you can reproduce if you run Coffea's tests, toggling between Uproot's main and main-fsspec git branches? If so, can you narrow down on which PR or commit caused it? It's enough of a problem that changes intended to only affect the lowest layer—getting bytes from remote sources—seem to be affecting something higher-level, such as this interpretation. That suggests a leaked abstraction somewhere.

Update: This turned out to be a bug in uproot itself.

It is now fixed and the pre-release version 5.2.0rc2 will contain this fix.

pyproject.toml Outdated Show resolved Hide resolved
Co-authored-by: Luis Antonio Obis Aparicio <35803280+lobis@users.noreply.github.com>
@lgray
Copy link
Collaborator Author

lgray commented Nov 28, 2023

@lobis weird typing errors in uproot:

/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/uproot/reading.py:25: in <module>
    path: str | Path | IO | dict[str | Path | IO, str],
E   TypeError: unsupported operand type(s) for |: 'type' and 'type'

@lobis
Copy link
Contributor

lobis commented Nov 28, 2023

@lobis weird typing errors in uproot:

/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/uproot/reading.py:25: in <module>
    path: str | Path | IO | dict[str | Path | IO, str],
E   TypeError: unsupported operand type(s) for |: 'type' and 'type'

Yeah this is on me, I forgot to rebase the main-fsspec branch with main before publishing the pre-release... I don't know how the PR passed the tests though...

Anyway I fixed it and rewrote history to keep it clean. Sorry @jpivarski but looks like you'll have to publish another pre-release from main-fsspec 🙏🏻 (already increased version).

pyproject.toml Outdated Show resolved Hide resolved
Co-authored-by: Luis Antonio Obis Aparicio <35803280+lobis@users.noreply.github.com>
@lgray lgray changed the title feat: fsspec handler for uproot in nanoevents build: uproot 5.2.0 integration testing Nov 28, 2023
@lgray
Copy link
Collaborator Author

lgray commented Nov 28, 2023

@lobis looks good now, thank you!

Let me know when new RCs come out for uproot and I'll bump the version in this integration branch.

@lobis
Copy link
Contributor

lobis commented Nov 28, 2023

@lobis looks good now, thank you!

Let me know when new RCs come out for uproot and I'll bump the version in this integration branch.

I'll make commit suggestions and tag you if any further pre-releases are published!

@lgray
Copy link
Collaborator Author

lgray commented Dec 6, 2023

closing this since it's now in #949 as a re-pin to release candidates.

@lgray lgray closed this Dec 6, 2023
@lgray lgray deleted the switch-fsspec-uproot branch December 14, 2023 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants