Fix segmentation fault when JSON serializing a PeriodIndex #47431
Merged: simonjayhawkins merged 6 commits into pandas-dev:main from roberthdevries:46683-fix-segfault-when-json-serializing-periodindex on Jun 22, 2022
04425d4 Fix segmentation fault when JSON serializing a PeriodIndex (roberthdevries)
bc6d32a Fix cpplint issues (roberthdevries)
60a0910 Add whatsnew entry (roberthdevries)
02f6e81 Address review comment to annotate PeriodIndex as a class (roberthdevries)
581bd4a Reword sentence to clarify (roberthdevries)
521dfeb Add parameter to index (roberthdevries)
Do you know how the memory management works here? I'm not entirely sure that it is safe to reassign to `values` after decrementing. It may be happenstance that this improves the odds of delaying garbage collection. Maybe we can just return `array_values` here directly?

In C there is no automatic Python-style reference counting. I must say that I assume that the Python object returned by `PyObject_CallMethod` has a reference count of at least 1, so that it does not get deallocated. The next line decrements the refcount of the original `values` object, which in this case reduces it to 0, causing it to be freed immediately. This was the problem causing the segfault. This fix delays the destruction of the original `values` object, and as the `array_values` object should have a refcount high enough not to be destroyed, this works.

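The ordering described in the comment above can be modelled loosely at the Python level. This is only a sketch, assuming CPython: `Values`, `to_array`, and `demo_ordering` are hypothetical names invented for illustration, not pandas or C-API identifiers, and `del` stands in for the manual decrement that the C code has to perform.

```python
import weakref


class Values:
    """Hypothetical stand-in for the original `values` object."""

    def __init__(self, items):
        self.items = items

    def to_array(self):
        # Returns a brand-new object that the caller holds a reference to.
        return list(self.items)


def demo_ordering():
    values = Values([1, 2, 3])
    # A weak reference observes the object without keeping it alive.
    tracker = weakref.ref(values)

    # 1. Obtain the new object while the original is still alive.
    array_values = values.to_array()
    # 2. Only then drop the last reference to the original.
    del values

    # In CPython the original is gone as soon as its last reference is dropped.
    original_freed = tracker() is None
    return original_freed, array_values
```

The result stays usable because it holds its own reference; the original can be released once nothing reads from it any more.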
Also, if `array_values` were NULL, the if statement on line 252 (new situation) would not be executed if `array_values` were returned directly.

Yep, typically calls to `PyObject_*` functions will give you ownership of the reference to an object, which is pretty similar to +1 on a refcount. Could be wrong, but I think it gets freed on the next garbage collector run, not necessarily immediately when the count reaches zero. It would definitely be safer here to return rather than re-using the variable if we can.

Not for this PR, but this part of the thread re-ups my desire to move this logic out of C.

The Python documentation on reference counting is definitely a good resource. Worth a read:
https://docs.python.org/3/c-api/intro.html#reference-count-details
I think this code would be very difficult to port to Cython, maybe not even worth it, but of course anything is possible with time and effort.

When the reference count reaches zero, a Python object gets freed (see `Py_DECREF`). Garbage collection is used to free objects that are held in circular references and are no longer referenced from any other objects.
This was also the cause of the segmentation fault, as a freed object was being used (a use-after-free bug).
There is nothing to be gained by returning immediately with respect to influencing the reference count.

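The distinction drawn above between reference counting and the cycle collector can be observed from Python. This is a minimal sketch, assuming CPython: `Node` and `demo_refcount_vs_gc` are hypothetical names made up for the example, and the automatic collector is switched off temporarily so that only the explicit `gc.collect()` call can reclaim the cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical helper; instances can point at each other to form a cycle."""

    def __init__(self):
        self.other = None


def demo_refcount_vs_gc():
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running on its own
    try:
        # Case 1: no cycle. Dropping the last reference frees the object at once.
        plain = Node()
        plain_ref = weakref.ref(plain)
        del plain
        freed_without_gc = plain_ref() is None

        # Case 2: a reference cycle. Each object keeps the other's count above zero.
        a, b = Node(), Node()
        a.other, b.other = b, a
        cycle_ref = weakref.ref(a)
        del a, b
        alive_before_gc = cycle_ref() is not None

        gc.collect()  # the cycle collector finds and frees the unreachable pair
        freed_after_gc = cycle_ref() is None
    finally:
        if was_enabled:
            gc.enable()
    return freed_without_gc, alive_before_gc, freed_after_gc
```

Only the second case depends on the garbage collector, which matches the point that an object whose count reaches zero does not wait for a collection run.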
Makes sense. Thanks for clarifying