Locale patch breaks dist/threads/t/kill.t by triggering attribute.pm XS version mismatch errors. #20155
Where I customarily test blead I am unable to reproduce these failures.
@jkeenan Notice that your Configure args are different: I have "usethreads", not "useithreads".
Unless you explicitly specify both, they are inherently the same: the default for
@Tux ok, thanks. @khwilliamson I have narrowed this down a bit. If I add
No difference here.
It seems it is possible that we load the attributes XS code without loading the .pm code first. This causes threads/t/kill.t to die with an error about XS version mismatch. Explicitly loading the attributes module from threads::shared seems to fix the problem. This resolves #20155.
FWIW, it seems that #20157 fixes the issue, although it is still not clear to me whether it should be required, or why your patch exposed this issue in this way. It does seem plausible that the old locale code loaded attributes.pm implicitly, so things just worked that now don't. But why it fails only here and not elsewhere is still an open question.
Number parsing fails after a thread was created. A reduced test case (which could be used as a starting point for an extra test):
Running it:
and
For those unaware:
One final observation: adding a
Running:
This is a bug only on boxes running POSIX 2008 threaded locales, and has to do with the thread not being switched soon enough from the global locale. I had commits already in my pipeline to fix this; I'll try to move them up. In the meantime, just don't set LC_NUMERIC to a non-dot radix locale in your environment at perl initiation.
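To make the "non-dot radix" point concrete, here is a minimal C sketch (not perl's code) of how libc number parsing depends on the locale the calling thread is currently using. The helper name `parse_under_numeric_locale` is hypothetical; the libc calls are the standard POSIX 2008 ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper, not part of perl: parse `text` with strtod() while
 * the calling thread temporarily uses the LC_NUMERIC locale `name`.
 * Returns 0 and stores the value and the number of characters consumed,
 * or -1 if the locale cannot be created (e.g. it is not installed). */
int parse_under_numeric_locale(const char *name, const char *text,
                               double *value, size_t *consumed)
{
    locale_t loc = newlocale(LC_NUMERIC_MASK, name, (locale_t)0);
    if (loc == (locale_t)0)
        return -1;

    /* uselocale() changes the locale of the calling thread only. */
    locale_t previous = uselocale(loc);

    char *end = NULL;
    *value = strtod(text, &end);
    *consumed = (size_t)(end - text);

    /* Stop using the object before freeing it. */
    uselocale(previous);
    freelocale(loc);
    return 0;
}
```

With `"C"` the whole of `"3.14"` is consumed. On a system where a comma-radix locale such as `de_DE.UTF-8` happens to be installed, the same call would be expected to stop at the dot, which is the kind of silent misparse described above.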
Do you have an estimate on that? Right now it silently fails at basic math once a thread has been created (and LC_NUMERIC is set to a non-dot radix locale).
and
Running:
The POSIX 2008 locale API introduces per-thread locales. But the previous global locale system is retained, probably for backward compatibility. Prior to this commit, there was a bug in which, when a thread terminates, the master thread was switched into the global locale. That meant that the master thread was no longer thread-safe with regard to locales.
This bug stems from the fact that perl assumes that all you need to do to switch between threads (or embedded interpreters) is to change out aTHX. Indeed, much effort was expended in crafting perl to make this the case. But it breaks down in the case of some alien library that keeps per-thread information. That library needs to be informed of the switch. In this case it is libc keeping per-thread locale information. We change the thread context, but the library still retains the old thread's locale.
One cannot be using a given locale object and successfully free it. Therefore the code switches to the global locale (which isn't deletable) before freeing. There was no apparent need to do more switching, as the thread is in the process of dying. What I was unaware of is that it is the parent thread pretending to be the dying one for the purposes of destruction. So switching to the global locale affected the parent, leaving it there.
The parent thread called the locale.c thread locale termination function, and then called the perl.c perl_destruct() on the thread. This commit moves all the code for thread destruction from perl.c into the locale.c code, and calls it. Thus the thread initiation and termination is moved into locale.c. The thread termination is also called from thread.c. This cleans up a dying thread. The perl.c call is needed for thread0 and non-multiplicity builds. A check is done to prevent duplicate work.
This commit adds a new per-interpreter variable which maps aTHX to its locale. This is used to get the terminating thread's locale instead of the master's. And the master locale is switched back to at the end.
This commit is incomplete. Something similar needs to be done for Windows, where the libc knows the per-thread locale. I'm unsure whether this is the full correct approach. It only works for thread termination. Perhaps a better solution would be to change the locale every time aTHX is changed. PERL_SET_INTERP, PERL_SET_CONTEXT, and PERL_SET_THX all seem to do the aTHX change, and I can't figure out when you would prefer one over the other. But maybe one of them should then arrange also to change the locale when aTHX is changed. Perhaps you can think of other libraries and functions that have a similar problem and would also need something like this.
This commit causes Perl#20155 to go away. The triggering failure is merely a symptom of the deeper problem. A proper test will need to be done in XS.
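The mismatch described in this commit message can be modelled in a few lines of C. This is only a sketch under stated assumptions: `struct interp`, `current_interp` (standing in for aTHX), `set_context_only` and `destroy_child_buggy` are hypothetical names, not perl internals. Only `newlocale()`, `uselocale()` and `freelocale()` are real libc calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stddef.h>

/* Hypothetical stand-in for a perl interpreter; only the locale matters. */
struct interp {
    locale_t loc;   /* POSIX 2008 locale object this interpreter expects */
};

/* Stand-in for aTHX: the interpreter the running code believes it is. */
static struct interp *current_interp = NULL;

/* Swaps the context pointer only; libc is not told about the change. */
void set_context_only(struct interp *next)
{
    current_interp = next;
}

/* Returns 1 if libc's locale for the calling thread is the one the
 * current context expects, 0 otherwise. */
int libc_matches_context(void)
{
    return current_interp != NULL
        && uselocale((locale_t)0) == current_interp->loc;
}

/* Models the bug: the parent thread pretends to be the dying child,
 * switches to the global locale so the child's object can be freed,
 * and then restores only the context pointer. */
void destroy_child_buggy(struct interp *parent, struct interp *child)
{
    set_context_only(child);
    uselocale(LC_GLOBAL_LOCALE);    /* acts on the calling (parent) thread */
    if (child->loc != (locale_t)0)
        freelocale(child->loc);
    child->loc = (locale_t)0;
    set_context_only(parent);       /* pointer restored, libc locale is not */
}
```

After `destroy_child_buggy()` returns, the context pointer says "parent" again, but `uselocale((locale_t)0)` reports `LC_GLOBAL_LOCALE`, so the parent is left in the global locale exactly as described.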
#20178 fixes this, but needs more work, and advice.
@khwilliamson The release of 5.37.4 is getting closer. Have you been able to make progress on this (or on a better fix than PR #20178)?
See Perl#20155. The root cause of that problem is that under POSIX 2008, when a thread terminates, it causes thread 0 (the controller) to change to the global locale. Commit a7ff7ac caused perl to pay attention to the environment variables in effect at startup for setting the global locale when using the POSIX 2008 locale API. (Previously only the initial per-thread locale was affected.) This causes problems when the initial setting was for a locale that uses a comma as the radix character, but thread 0 is set to a locale that is expecting a dot as the radix character. Whenever another thread terminated, thread 0 was silently changed to using the global locale, and hence a comma. This caused parse errors. The real solution is to fix thread 0 to remain in its chosen locale. But that fix is not ready in time for 5.37.4, and it is deemed important to get something working for this monthly development release. This commit changes the initial global LC_NUMERIC locale to always be C, and hence to use a dot radix. The vast majority of code is expecting a dot. This is not the ultimate fix, but it works around the immediate problem at hand. The test case is courtesy of @bram-perl.
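The workaround amounts to something like the following C sketch. It is an illustration only, not the actual change in locale.c, and the function name `init_global_locale_dot_radix` is hypothetical. It initialises the global locale from the environment and then forces the global LC_NUMERIC back to "C".

```c
#include <locale.h>
#include <string.h>

/* Hypothetical startup helper, not perl's code: take the global locale
 * from the environment, then pin the global LC_NUMERIC to "C" so that the
 * global locale always uses a dot radix.
 * Returns 1 if the radix is "." afterwards, 0 otherwise. */
int init_global_locale_dot_radix(void)
{
    /* May return NULL if the environment names an unknown locale; in that
     * case the global locale is simply left unchanged. */
    setlocale(LC_ALL, "");

    /* The "C" locale is always available. */
    if (setlocale(LC_NUMERIC, "C") == NULL)
        return 0;

    /* Assumes the calling thread is in the global locale, as at startup. */
    return strcmp(localeconv()->decimal_point, ".") == 0;
}
```

With this in place, a thread that is wrongly dropped into the global locale still parses numbers with a dot, whatever LC_NUMERIC was in the environment. It hides the symptom without fixing the switch itself.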
This is a step in solving #20155. The POSIX 2008 locale API introduces per-thread locales. But the previous global locale system is retained, probably for backward compatibility. The POSIX 2008 interface causes memory to be malloc'd that needs to be freed. In order to do this, the caller must first stop using that memory, by switching to another locale. perl accomplishes this during termination by switching to the global locale, which is always available and doesn't need to be freed.
Perl has long assumed that all that was needed to switch threads was to change out tTHX. That's because that structure was intended to hold all the information for a given thread. But it turns out that this doesn't work when some library independently holds information about the thread's state. And there are now some libraries that do that.
What was happening in this case was that perl thought that it was sufficient to switch tTHX to change to a different thread in order to do the freeing of memory, and then used the POSIX 2008 function to change to the global locale so that the memory could be safely freed. But the POSIX 2008 function doesn't care about tTHX, and actually was typically operating on a different thread, and so changed that thread to the global locale instead of the intended thread. Often that was the top-level thread, thread 0. That caused whatever thread it was to no longer be in the expected locale, and to no longer be thread-safe with regard to locales.
This commit causes locale_term(), which has always been called from the actual terminating thread that POSIX 2008 knows about, to change to the global locale and free the memory. It also creates a new per-interpreter variable that effectively maps the tTHX thread to the associated POSIX 2008 memory. During perl_destruct(), it frees the memory this variable points to. This catches anything missed by locale_term().
This fixes the symptoms associated with #20155, but doesn't solve the whole problem. In general, a library that has independent thread status needs to be updated to the new thread when Perl changes threads using tTHX. Future commits will do this.
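As a rough illustration of the termination step, here is a C sketch. It is not the code in locale.c. The struct, its field `cur_locale_obj` and the function `interp_free_locale` are hypothetical names chosen for this example. The idea is to free the object recorded for the interpreter, and to switch the calling thread away from it only when libc says that thread is really using it.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>

/* Hypothetical per-interpreter record mapping an interpreter to the
 * POSIX 2008 locale object that was allocated for it. */
struct interp {
    locale_t cur_locale_obj;
};

/* Frees the interpreter's locale object. Safe to call more than once, so
 * it can run both from the terminating thread and again from a later
 * destructor that catches anything missed. */
void interp_free_locale(struct interp *it)
{
    locale_t obj = it->cur_locale_obj;

    if (obj == (locale_t)0 || obj == LC_GLOBAL_LOCALE)
        return;                         /* nothing to free */

    /* A locale object must not be in use when it is freed. Ask libc what
     * the calling thread is really using rather than trusting the
     * context pointer. */
    if (uselocale((locale_t)0) == obj)
        uselocale(LC_GLOBAL_LOCALE);

    freelocale(obj);
    it->cur_locale_obj = (locale_t)0;   /* prevents duplicate work */
}
```

The key design choice is that the decision to switch is based on `uselocale((locale_t)0)`, which is what libc knows, and not on which interpreter perl currently believes it is running.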
This is a step in solving #20155. The POSIX 2008 locale API introduces per-thread locales. But the previous global locale system is retained, probably for backward compatibility. The POSIX 2008 interface causes memory to be malloc'd that needs to be freed. In order to do this, the caller must first stop using that memory, by switching to another locale. perl accomplishes this during termination by switching to the global locale, which is always available and doesn't need to be freed.
Perl has long assumed that all that was needed to switch threads was to change out tTHX. That's because that structure was intended to hold all the information for a given thread. But it turns out that this doesn't work when some library independently holds information about the thread's state. And there are now some libraries that do that.
What was happening in this case was that perl thought that it was sufficient to switch tTHX to change to a different thread in order to do the freeing of memory, and then used the POSIX 2008 function to change to the global locale so that the memory could be safely freed. But the POSIX 2008 function doesn't care about tTHX, and actually was typically operating on a different thread, and so changed that thread to the global locale instead of the intended thread. Often that was the top-level thread, thread 0. That caused whatever thread it was to no longer be in the expected locale, and to no longer be thread-safe with regard to locales.
This commit causes locale_term(), which has always been called from the actual terminating thread that POSIX 2008 knows about, to change to the global locale and free the memory. It also creates a new per-interpreter variable that effectively maps the tTHX thread to the associated POSIX 2008 memory. During perl_destruct(), it frees the memory this variable points to, instead of blindly assuming the memory to free is the current tTHX thread's.
This fixes the symptoms associated with #20155, but doesn't solve the whole problem. In general, a library that has independent thread status needs to be updated to the new thread when Perl changes threads using tTHX. Future commits will do this.
As noted in the previous commit, some library functions now keep per-thread state. So far the only ones we care about are libc locale-changing ones. When perl changes threads by swapping out tTHX, those library functions need to be called with the new value so that they remain in sync with what perl thinks the locale should be. This commit creates a function to do this, and changes the thread-changing macros to also call this as part of the change. The commit also creates a mechanism to skip this during thread destruction. A thread in its death throes doesn't need to have accurate locale information, and the information needed to map from thread to what libc needs to know gets destroyed as part of those throes, while relics of the thread remain. I couldn't find a way to accurately know if we are dealing with a relic or not, so the solution I adopted was to just not switch during destruction. This commit completes fixing #20155, EXCEPT for Windows boxes. That comes in the next commit.
As noted in the previous commit, some library functions now keep per-thread state. So far the only ones we care about are libc locale-changing ones. When perl changes threads by swapping out tTHX, those library functions need to be informed about the new value so that they remain in sync with what perl thinks the locale should be. This commit creates a function to do this, and changes the thread-changing macros to also call this as part of the change. For POSIX 2008, the function just calls uselocale() using the per-interpreter object introduced previously. For Windows, this commit adds a per-interpreter string of the current LC_ALL, and the function calls setlocale on that. We keep the same string for POSIX 2008 implementations that lack querylocale(), so this commit just enables that variable on Windows as well. The code is already in place to free the memory the string occupies when done. The commit also creates a mechanism to skip this during thread destruction. A thread in its death throes doesn't need to have accurate locale information, and the information needed to map from thread to what libc needs to know gets destroyed as part of those throes, while relics of the thread remain. I couldn't find a way to accurately know if we are dealing with a relic or not, so the solution I adopted was to just not switch during destruction. This commit completes fixing #20155.
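For the POSIX 2008 case, the approach described here could look roughly like the C sketch below. All names (`struct interp`, `cur_locale_obj`, `in_destruct`, `switch_locale_context`, `set_context`) are hypothetical stand-ins and not necessarily what the commit uses. The Windows variant, which would call setlocale() on a saved LC_ALL string, is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stddef.h>

/* Hypothetical per-interpreter state for this sketch. */
struct interp {
    locale_t cur_locale_obj;  /* locale object libc should use for it */
    int      in_destruct;     /* non-zero while being torn down */
};

/* Stand-in for the thread context pointer. */
static struct interp *current_interp = NULL;

/* Tells libc which locale the calling thread should now be in. Skipped
 * for an interpreter under destruction, whose locale bookkeeping may
 * already have been freed. */
void switch_locale_context(struct interp *next)
{
    if (next == NULL || next->in_destruct)
        return;
    if (next->cur_locale_obj != (locale_t)0)
        uselocale(next->cur_locale_obj);
}

/* Stand-in for the thread-changing macros: swap the context pointer and
 * keep libc in sync as part of the same operation. */
void set_context(struct interp *next)
{
    current_interp = next;
    switch_locale_context(next);
}
```

Folding the locale switch into the context change means callers cannot change the context pointer and forget to tell libc. The destruction flag covers the case where a relic of a dying thread is still being switched to.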
This is a step in solving #20155 The POSIX 2008 locale API introduces per-thread locales. But the previous global locale system is retained, probably for backward compatibility. The POSIX 2008 interface causes memory to be malloc'd that needs to be freed. In order to do this, the caller must first stop using that memory, by switching to another locale. perl accomplishes this during termination by switching to the global locale, which is always available and doesn't need to be freed. Perl has long assumed that all that was needed to switch threads was to change out tTHX. That's because that structure was intended to hold all the information for a given thread. But it turns out that this doesn't work when some library independently holds information about the thread's state. And there are now some libraries that do that. What was happening in this case was that perl thought that it was sufficient to switch tTHX to change to a different thread in order to do the freeing of memory, and then used the POSIX 2008 function to change to the global locale so that the memory could be safely freed. But the POSIX 2008 function doesn't care about tTHX, and actually was typically operating on a different thread, and so changed that thread to the global locale instead of the intended thread. Often that was the top-level thread, thread 0. That caused whatever thread it was to no longer be in the expected locale, and to no longer be thread-safe with regards to localess, This commit causes locale_term(), which has always been called from the actual terminating thread that POSIX 2008 knows about, to change to the global thread and free the memory. It also creates a new per-interpreter variable that effectively maps the tTHX thread to the associated POSIX 2008 memory. During perl_destruct(), it frees the memory this variable points to, instead of blindly assuming the memory to free is the current tTHX thread's. 
This fixes the symptoms associtated with #20155, but doesn't solve the whole problem. In general, a library that has independent thread status needs to be updated to the new thread when Perl changes threads using tTHX. Future commits will do this.
As noted in the previous commit, some library functions now keep per-thread state. So far the only ones we care about are libc locale-changing ones. When perl changes threads by swapping out tTHX, those library functions need to be informed about the new value so that they remain in sync with what perl thinks the locale should be. This commit creates a function to do this, and changes the thread-changing macros to also call this as part of the change. For POSIX 2008, the function just calls uselocale() using the per-interpreter object introduced previously. For Windows, this commit adds a per-interpreter string of the current LC_ALL, and the function calls setlocale on that. We keep the same string for POSIX 2008 implementations that lack querylocale(), so this commit just enables that variable on Windows as well. The code is already in place to free the memory the string occupies when done. The commit also creates a mechanism to skip this during thread destruction. A thread in its death throes doesn't need to have accurate locale information, and the information needed to map from thread to what libc needs to know gets destroyed as part of those throes, while relics of the thread remain. I couldn't find a way to accurately know if we are dealing with a relic or not, so the solution I adopted was to just not switch during destruction. This commit completes fixing #20155.
This is a step in solving Perl#20155. The POSIX 2008 locale API introduces per-thread locales. But the previous global locale system is retained, probably for backward compatibility. The POSIX 2008 interface causes memory to be malloc'd that needs to be freed. In order to do this, the caller must first stop using that memory, by switching to another locale. perl accomplishes this during termination by switching to the global locale, which is always available and doesn't need to be freed. Perl has long assumed that all that was needed to switch threads was to change out tTHX. That's because that structure was intended to hold all the information for a given thread. But it turns out that this doesn't work when some library independently holds information about the thread's state. And there are now some libraries that do that. What was happening in this case was that perl thought that it was sufficient to switch tTHX to change to a different thread in order to do the freeing of memory, and then used the POSIX 2008 function to change to the global locale so that the memory could be safely freed. But the POSIX 2008 function doesn't care about tTHX, and actually was typically operating on a different thread, and so changed that thread to the global locale instead of the intended thread. Often that was the top-level thread, thread 0. That caused whatever thread it was to no longer be in the expected locale, and to no longer be thread-safe with regard to locales. This commit causes locale_term(), which has always been called from the actual terminating thread that POSIX 2008 knows about, to change to the global locale and free the memory. It also creates a new per-interpreter variable that effectively maps the tTHX thread to the associated POSIX 2008 memory. During perl_destruct(), it frees the memory this variable points to, instead of blindly assuming the memory to free is the current tTHX thread's.
This fixes the symptoms associated with Perl#20155, but doesn't solve the whole problem. In general, a library that has independent thread status needs to be updated to the new thread when Perl changes threads using tTHX. Future commits will do this.
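The teardown ordering described in this commit message (stop using the locale object on the thread that actually uses it, then free it) might look something like the sketch below. It assumes a POSIX 2008 libc; `struct interp`, `interp_locale_init()`, and `interp_locale_term()` are hypothetical names made up for illustration and do not correspond to perl's real code. The `newlocale()`, `uselocale()`, and `freelocale()` calls are the standard POSIX 2008 functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-interpreter structure. */
struct interp {
    locale_t locale_obj;  /* locale object owned by this interpreter, or 0 */
};

/* Create a locale object for the interpreter and make the CALLING thread
 * use it. Returns 0 on success, -1 on failure. */
int interp_locale_init(struct interp *it, const char *name)
{
    it->locale_obj = newlocale(LC_ALL_MASK, name, (locale_t) 0);
    if (it->locale_obj == (locale_t) 0)
        return -1;
    uselocale(it->locale_obj);  /* per-thread: other threads are unaffected */
    return 0;
}

/* Teardown. This is meant to run on the thread that is using the object,
 * because uselocale() only ever acts on the calling thread. */
void interp_locale_term(struct interp *it)
{
    if (it->locale_obj == (locale_t) 0)
        return;

    /* The object must not be in use when freed, so switch this thread to
     * the global locale first. The global locale never needs freeing. */
    if (uselocale((locale_t) 0) == it->locale_obj)
        uselocale(LC_GLOBAL_LOCALE);

    freelocale(it->locale_obj);
    it->locale_obj = (locale_t) 0;
}
```

Keeping the object in the per-interpreter structure is what lets the cleanup free the right memory instead of assuming it belongs to whichever thread happens to be running.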
See Perl#20155. The root cause of that problem is that under POSIX 2008, when a thread terminates, it causes thread 0 (the controller) to change to the global locale. Commit a7ff7ac caused perl to pay attention to the environment variables in effect at startup for setting the global locale when using the POSIX 2008 locale API. (Previously only the initial per-thread locale was affected.) This causes problems when the initial setting was for a locale that uses a comma as the radix character, but thread 0 is set to a locale that is expecting a dot as a radix character. Whenever another thread terminates, thread 0 was silently changed to using the global locale, and hence a comma. This caused parse errors. The real solution is to fix thread 0 to remain in its chosen locale. But that fix is not ready in time for 5.37.4, and it is deemed important to get something working for this monthly development release. This commit changes the initial global LC_NUMERIC locale to always be C, hence uses a dot radix. The vast majority of code is expecting a dot. This is not the ultimate fix, but it works around the immediate problem at hand. The test case is courtesy of @bram-perl.
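To make the radix problem concrete, here is a small sketch of why pinning the global LC_NUMERIC to C gives a dot radix for number parsing. It is not the code from the commit; `force_dot_radix()` and `parses_fully()` are hypothetical helpers written for this illustration, built only on the standard `setlocale()`, `localeconv()`, and `strtod()` functions.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Set the global LC_NUMERIC to "C" and confirm the radix is now a dot.
 * Returns 0 on success, -1 otherwise. */
int force_dot_radix(void)
{
    if (setlocale(LC_NUMERIC, "C") == NULL)
        return -1;
    return strcmp(localeconv()->decimal_point, ".") == 0 ? 0 : -1;
}

/* Returns 1 if the whole string parses as a number under the locale
 * currently in effect for the calling thread, 0 otherwise. */
int parses_fully(const char *text)
{
    char *end;
    strtod(text, &end);
    return end != text && *end == '\0';
}
```

With the global LC_NUMERIC set to C, a string such as "1.5" is consumed completely, while "1,5" stops at the comma. Under a comma-radix locale the situation is reversed, which is the kind of parse failure described above when thread 0 silently falls back to the global locale.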
This reverts commit 9e254b0. Date: Wed Apr 5 12:26:26 2023 -0600. This fixes GH #21040. The reverted commit caused failures on platforms using the musl library, notably Alpine Linux. I came up with a fix for that, which instead broke Windows. In looking at that I realized the original fix is incomplete, and that things are too precarious to try to fix so close to 5.38.0. For example, I spent hours because of a %p format printing 0 for what turned out to be a non-NULL string pointer. I think it has to do with the fact that the failing code is in the middle of transitioning between threads, and the printing got confused as a result. The reverted commit was part of a series fixing #20155 and #20231. But the earlier part of the series succeeded in fixing those without that commit, so reverting it should not cause things to break as a result. This whole issue has to do with locales and threading. Those still don't play well together. I have a series of well over 200 commits that address this situation, for applying in early 5.39. My point is that we are a long way from solving these kinds of issues; and they don't come up that much in the field because they just don't get used. The reverted commit would help if it worked properly, but it's not the only thing wrong by a long shot.
Description
Bisect tells me that ever since a7ff7ac I have been seeing test failures from dist/threads/t/kill.t:
attributes object version 0.35 does not match $attributes::VERSION 0 at lib/XSLoader.pm line 112.
Compilation failed in require at lib/Thread/Queue.pm line 19.
BEGIN failed--compilation aborted at lib/Thread/Queue.pm line 19.
Compilation failed in require at dist/threads/t/kill.t line 31.
BEGIN failed--compilation aborted at dist/threads/t/kill.t line 36.
These are the patch details:
FWIW, I looked at the patch, but beyond maybe an off-by-one error overwriting memory somewhere, it is not clear to me why this patch would lead to this error. It is also not clear why it would fail for me and not in CI. I would be happy to provide any additional details required to debug.
Steps to Reproduce
Expected behavior
The second output, kill.t should not fail.
Perl configuration