Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch static variable to thread_local to fix underflow bug #33

Merged
merged 2 commits into from
May 1, 2024

Conversation

jhiemstrawisc
Copy link
Member

@jhiemstrawisc jhiemstrawisc commented Apr 26, 2024

Multiple threads appear to be running through read_callback with their own data, and they're fighting over the sentSoFar variable. Switching it from static to thread_local makes sure each thread has its own copy to use.

I understand less about how XRootD manages threads, but making this switch appears to fix what was otherwise a guaranteed crash.

Closes #32

Multiple threads appear to be running through `read_callback` with their own data, and
they're fighting over the `sentSoFar` variable. Switching it from static to thread_local
makes sure each thread has its own copy to use.
@jhiemstrawisc jhiemstrawisc requested a review from bbockelm April 26, 2024 14:48
Copy link
Collaborator

@bbockelm bbockelm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this will work. There's a whole project here to overhaul the libcurl interactions along the lines of what's done for https://github.com/PelicanPlatform/xrdcl-pelican ... but this is not the PR where it should be done.

@bbockelm
Copy link
Collaborator

@jhiemstrawisc and I discussed this a bit in person. While this PR's one-liner would fix the issue at hand, we noted that it still relies on the the callback to be invoked until completion to reset the thread-local sentSoFar back to 0 -- if there's a failure mid-transfer, then the variable will stay where it left off.

A better solution is to create a small struct that keeps both the buffer and the current offset in the buffer, then have that live on the heap in the function invoking curl_easy_perform. This way, all the transfer-related state is in the pointer passed to the callback instead of having a mixture of state in a thread-local and the provided reference. Since that's going to still be a minimal change -- probably 5-10 lines -- it seems prudent to do that small change instead of two small changes.

@jhiemstrawisc
Copy link
Member Author

@bbockelm This is important for pelican 7.8 (and should be patched into 7.7), so I'm going to merge without waiting for a second review. Testing PUTS works locally and I think I implemented things the way you outlined, but if you come back and something doesn't look right I'm happy to update.

@jhiemstrawisc jhiemstrawisc merged commit c829527 into PelicanPlatform:main May 1, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

S3 plugin causes SEGFAULT on multi-file PUT using Pelican client
2 participants