-
Notifications
You must be signed in to change notification settings - Fork 576
Add a new %{^HOOK} hash, similar to %SIG, and add support for "require__before" and "require__after" hooks. #20637
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
34e94e1
to
b10eca8
Compare
b10eca8
to
07c5d11
Compare
07c5d11
to
acdd910
Compare
I've always thought that the abuse of It's rather too late to change those now, but can we avoid adding any more? I.e. can this please be spelled |
FWIW, my original implementation worked like that. But it was more work and it felt unperlish to me so i switched to There is another advantage at the implementations level that adding support for validating a new special key in %SIG is nearly trivial, but adding a new var with new magic to front an internal state is rather more complex and invasive. I found it interesting how much simpler the code was when it was based on %SIG. So frankly I'd prefer that we don't use a caret var for this. If PSC decides we must then so be it, it can be done, but I don't think its as good an idea as it sounds at first. And I mean that literally, my unpushed original code for this used |
@leonerd one option might be to introduce a |
Ahyes; that sounds like a better forward plan; a new place to put hooks would be nice. In that case, since we'd want to move WARN and DIE out of the current place anyway, it probably makes little difference whether we add REQUIRE in to SIG for now anyway. Might as well continue on this PR as it is for now and move all of them later on. |
acdd910
to
0d0530c
Compare
I agree it would be cleaner to only have A
This, however, I have a severe allergic reaction to. Please don’t. Code which uses the new interface would not work on older perls, not because older perls lack capabilities to support the feature being used, but simply because the way to access the capability changed name. To put it another way: because fashion. Worse, dealing with these hooks is relevant inside some rather plumbing-level modules, where forcing churn is just going to impose more forced perl version obsolescence on the ecosystem. The same reasoning is going to apply in the future, of course, so adding a |
@ap: to you it is a category error to put REQUIRE with DIE and WARN in %SIG. But i dont agree with you really, I see a "signal" generated when we are about to compile more code to be compatible with the notion of a generic signal, as much as DIE or WARN do anyway. Remember that require is triggered invisibly by use, and internally by using of various functionality or variables. So when someone "uses" code, having a "signal" triggered at the start of that process isn't that weird to me. Certainly to me it is not so obviously wrong that it seems like a "bizarre design decision". To me it seems a design decision well within the general concept of a signal triggered when perl does something invisibly under the hood. Especially as the require hook really is a true /signal/, unlike DIE or WARN, in that it cannot affect the execution of the require other than to throw an exception, both DIE and WARN allow you to modify the behavior of the die/warn functions. So they arent just signal handlers but also mutators. The only reason you seem to be distinguishing DIE and WARN from REQUIRE is that DIE and WARN are part of an exception handling process. To me the thing that the %SIG keys have in common is that they can all be triggered automatically by perl or the system itself when things happen. Perl requires modules behind the scenes, just as it throws exceptions, and thus it seems to me that they have more in common than they differ, and in fact, that REQUIRE, not being a mutator, actually has more in common with other signal handlers than it does with DIE or WARN and that it is really those two which are exceptional. [1] With regard to %{^HOOK} or whatever we call it (if we go ahead with it), i was thinking we would just "tie" the DIE and WARN keys in %SIG to their equivalents in %{^HOOK}, not necessarily removing the keys from %SIG until there has been a very very long deprecation cycle. The only thing that might break back compat is someone using only %{^HOOK} instead of %SIG, but that is the same as someone using defer{}. It would also be possible to create a shim to make %{^HOOK} work properly on older perls if we wanted. But i take your point that introducing a %{^HOOK} variable may just end up making things worse not better, that was part of my reason for saying "the only thing worse than putting this in %SIG while WARN and DIE are there is to have a different hook not in SIG." The precedent has been established, no matter how explicable we find that precedent, and we should just stick with it. [1] the return behavior of the REQUIRE hook is a bit exceptional. but the actual location of the hook to me not so much. |
@ap: Consider this case. opening a scalar file buffer auto-requires PerlIO::scalar.
To me this feels very much like what perl level %SIGnals do. Tell us when something internal has happened that we might want to know about. |
0d0530c
to
72af433
Compare
So are there any objections to this moving forward? |
This feels warnocked. I would like to merge this for the next release. |
But all of the other %SIG entries do change the execution, just by being installed. For POSIX signals, installing a %SIG entry prevents the system default signal handler from being executed.
I don't think that require being used "behind the scenes" is really justification for it being handled differently from hooking into any other The original problem leading to this PR is that wrapping |
I'll add that while I'd like to see a more general solution for this type of hook, I think it's fine if there's an initial implementation that only works for require. But I'd like the shape of it to allow future extension, and I don't think |
Hi @haarg. I have outlined why i dont think a separate variable is a good idea. So while I respect your opinion, if that is the strongest point you have here, then I think we will have to just agree to disagree. IMO The only thing worse than having I actually do agree that it would have been better if Larry had put the DIE and WARN hooks somewhere else, and he had I would have put the REQUIRE hook in the same place, but that boat has sailed. Lets not make the problem worse. Also, I do agree that I would not be pushing REQUIRE at all if it was sane to override require and other core ops properly. But it isn't and it seems likely it wont be for some time, if ever. So I think we should go ahead with REQUIRE as is. |
I don't quite like it in |
@Leont yeah, I understand. I dont like a number of design decisions in Perl. :-) But once they are baked in we have to consider this from the point of view of "least worst" options, not as though we have a clear field. And I think that the least worst option here is just swallow our dislike of this. The other option opens up its own can of portability worms. Specifcally that code that currently exists and does:
wll continue to work as it used to: disabling all signal handlers, OS level and Perl level. If we introduce a new var that is no longer true, and it becomes yet another thing people need to deal with. So the question here isnt "is this the ideal way to do this", it is "what is the least sucky way to do this". And IMO the objections about it being in %SIG are a matter of taste, which is important, but it lacks technical foundation. |
I'm not sure what you mean. @haarg 's objections were entirely technical (and seem reasonable to me). |
Gotta agree that rather than messing more with %SIG, which should be posix signals, the optimal solution here would be a completely separate mechanism for hooking perl core function, which could then also include die and warn and permit future hooking on those without the issues %SIG replacements present, maybe allowing phasing that out in some far future. |
3dcd4b2
to
77e1926
Compare
77e1926
to
d87efd5
Compare
scope.c
Outdated
void | ||
Perl_mortal_destructor_x(pTHX_ DESTRUCTORFUNC_t f, void *p) { | ||
PERL_ARGS_ASSERT_MORTAL_DESTRUCTOR_X; | ||
mortal_destructor_sv((SV*)f,(SV *)p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will treat f
and p
as refcounted, which is probably a bad idea.
Just using a different vtable is probably the easiest solution, this should be the simpler case anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworked the code a little bit to use SVFUNC_t which requires a SV * argument instead of DESTRUCTORFUNC_t which supports a void * argument. Also the code you quoted is/was just broken, the function needed to be wrapped in a SVt_IV and the f argument needed to be converted into an IV with PTR2IV(). But once I did that it seems work as expected and passes the tests I added to XS-APITest.
I am not really sure what we win by creating a new vtable to handle this and it would cost us one of the magic chars. It seems to me with the adjustments I just made it is fine as it is.
Can you review once more please?
f882dcc
to
d848440
Compare
I generally like this. I'm not entirely certain about the interface - having two types of after hooks feels like it might indicate there's a better option. But that isn't a rejection. But is this something that should be experimental? |
I don't see how that would be advantageous. This doesn't get in the way and its not a core part of the language nor does it affect code that already exists. (Like most use of experimental features.)
There aren't two types of after hook. The is a post-before hook and an after hook. They aren't the same thing. I explained why we have both in previous post, sharing state with /just/ an after hook is error prone and unreliable. On the other hand, @ap requested we provide a separate after hook for people who don't want to use a before hook and just want to do something after each require. I just gave him what he requested. |
On Fri, Mar 17, 2023 at 06:37:57AM -0700, Yves Orton wrote:
@demerphq commented on this pull request.
> +#define SAVEMORTALSVFUNC_X(f,sv) \
+ mortal_svfunc_x((SVFUNC_t)(f), sv)
+
+#define SAVEMORTALDESTRUCTOR_SV(coderef,args) \
+ mortal_destructor_sv(coderef,args)
+
The match SAVEMORTALIZESV(), which where the naming comes from.
Yeah, but SAVEMORTALIZESV() still uses the save stack. It's action is to
mortalise the SV in leave_scope() on scope exit
So I don't think SAVE should be in the names.
…--
The Enterprise successfully ferries an alien VIP from one place to another
without serious incident.
-- Things That Never Happen in "Star Trek" #7
|
I want to use these modules in other tests, so changing the name makes sense.
and also SvREFCNT_dec_ret_NULL() which is used to implement SvREFCNT_dec_set_NULL(). The set_NULL() macro is intended to be used to replace code like this: if (sv) { SvREFCNT_dec_NN(sv); sv = NULL; } The function form just facilitates it, and can be used in situations where returning NULL after decrementing a refcount would be reduce code complexity.
https://github.com/Perl/perl5/issues shows the list of open issues, whereas https://github.com/Perl/perl5/issues/new/choose is where someone can create a new ticket.
The function SAVEDESTRUCTOR_X() (save_destructor_x) can be used to execute a C function at the end of the current psuedo-block. Prior to this patch there was no "mortal" equivalent that would execute at the end of the current statement. We offer a collection of functions which are intended to free SV's at either point in time, but only support callbacks at the end of the current pseudo-block. This patch adds two such functions, "mortal_destructor_sv" which can be used to trigger a perl code reference to execute at the end of the current statement, and "mortal_svfunc_x" which can be used to trigger an SVFUNC_t C function at the end of the current statement. Both functions differ from save_destructor_x() in that instead of supporting a void pointer argument they both require their argument to be some sort of SV pointer. The Perl callback function triggered by "mortal_destructor_sv" may be provided no arguments, a single argument or a list of arguments, depending on the type of argument provided to mortal_destructor_sv(): when the argument is a raw AV (with no SV ref wrapping it), then the contents of the AV are passed in as a list of arguments. When the argument is anything else but NULL, the argument is provided as a single argument, and when it is NULL the perl function is called with no arguments. Both functions are implemented on top of a mortal SV (unseen by the user) which has PERL_MAGIC_destruct magic associated with it, which triggers the destructor behavior when the SV is freed. Both functions are provided with macros to match the normal SAVExx() API, with MORTALDESTRUCTOR_SV() wrapping mortal_destructor_sv() and MORTALSVFUNC_X() wrapping mortal_svfunc_x(). The heart of this logic cribbed from Leon Timmermans' Variable-OnDestruct. See the code at: https://metacpan.org/dist/Variable-OnDestruct/source/lib/Variable/OnDestruct.xs#L6-17 I am very grateful to him for his help on this. Any errors or omissions in this code are my fault, not his.
This defines a new magic hash C<%{^HOOK}> which is intended to be used for hooking keywords. It is similar to %SIG in that the values it contains are validated on set, and it is not allowed to store something in C<%{^HOOK}> that isn't supposed to be there. Hooks are expected to be coderefs (people can use currying if they really want to put an object in there, the API is deliberately simple.) The C<%{^HOOK}> hash is documented to have keys of the form "${keyword}__${phase}" where $phase is either "before" or "after" and in this initial release two hooks are supported, "require__before" and "require__after": The C<require__before> hook is called before require is executed, including any @inc hooks that might be fired. It is called with the path of the file being required, just as would be stored in %INC. The hook may alter the filename by writing to $_[0] and it may return a coderef to be executed *after* the require has completed, otherwise the return is ignored. This coderef is also called with the path of the file which was required, and it will be called regardless as to whether the require (or its dependencies) die during execution. This mechanism makes it trivial and safe to share state between the initial hook and the coderef it returns. The C<require__after> hook is similar to the C<require__before> hook however except that it is called after the require completes (successfully or not), and its return is ignored always.
d848440
to
d0b1ad1
Compare
It isn't syntax, but it's still a part of the language. There are plenty of other experimental things that have had warnings but no feature flag, because they didn't interfere with any existing syntax.
Those are both "after" hooks, and I never said they were the same thing. I understand the need for carrying state between the hooks. Having both types feels a bit awkward, but I'm not sure there's a better option. |
I think that is a strong argument for renaming them before the next major release. |
Done already. I noted in reply to @leonerd's original request, which davem was replying to, but because he replied via email it was not logged in the right place. (This is just for the record, you and I discussed this already in IRC.) |
Historically we used to parse out the tests that we ran in t/harness from the MANIFEST file. At some point this changed and we started consulting the disk using globs. However because we do not use a recursive search over the t/ directory it is quite possible that a new directory of tests is added which actually never runs. In Perl#20637 (comment) Tony C noticed that I had added a new test file t/op/hook/require.t which is in a new subdirectory t/op/hook/ which was unknown to t/harness and thus not actually being run by make test_harness. (This patch does NOT add t/op/hoop to the list of directories to scan, I will do that in the PR.) I then took the time to add code to detect if any other test files are not being run, and it turns out that it is also the case for the new t/class/ directory of tests and it is also the case for the tests for test.pl itself, found in the t/test_pl directory. This patch adds logic to detect if this happens and make t/harness die if it finds a test file in the manifest which will not be detected by the custom rules for finding test files that is used in t/harness. It does not die if t/harness finds tests that are not in MANIFEST, that should be detected by a different test. The level of complexity in finding and deciding the tests files that we should run, and the differences between t/TEST and t/harness is fairly high. In the past Nicholas put some effort into unifying the logic, but it seems since then we have drifted apart. Even though t/harness uses t/TEST and the _tests_from_manifest() function, for some time now it has only used it to find which extensions to test, not which test files to run. I have *NOT* dug into whether t/TEST is also missing test files that are in manifest. That can happen in a follow up patch. Long term we should unify all of this logic so that t/TEST and t/harness run the same test files always, and that we will always detect discrepancies between the MANIFEST and the tests we are running. We do not for instance test that they test the same things. :-) :-(
Historically we used to parse out the tests that we ran in t/harness from the MANIFEST file. At some point this changed and we started consulting the disk using globs. However because we do not use a recursive search over the t/ directory it is quite possible that a new directory of tests is added which actually never runs. In Perl#20637 (comment) Tony C noticed that I had added a new test file t/op/hook/require.t which is in a new subdirectory t/op/hook/ which was unknown to t/harness and thus not actually being run by make test_harness. (This patch does NOT add t/op/hoop to the list of directories to scan, I will do that in the PR.) I then took the time to add code to detect if any other test files are not being run, and it turns out that it is also the case for the new t/class/ directory of tests and it is also the case for the tests for test.pl itself, found in the t/test_pl directory. This patch adds logic to detect if this happens and make t/harness die if it finds a test file in the manifest which will not be detected by the custom rules for finding test files that is used in t/harness. It does not die if t/harness finds tests that are not in MANIFEST, that should be detected by a different test. The level of complexity in finding and deciding the tests files that we should run, and the differences between t/TEST and t/harness is fairly high. In the past Nicholas put some effort into unifying the logic, but it seems since then we have drifted apart. Even though t/harness uses t/TEST and the _tests_from_manifest() function, for some time now it has only used it to find which extensions to test, not which test files to run. I have *NOT* dug into whether t/TEST is also missing test files that are in manifest. That can happen in a follow up patch. Long term we should unify all of this logic so that t/TEST and t/harness run the same test files always, and that we will always detect discrepancies between the MANIFEST and the tests we are running. We do not for instance test that they test the same things. :-) :-(
Historically we used to parse out the tests that we ran in t/harness from the MANIFEST file. At some point this changed and we started consulting the disk using globs. However because we do not use a recursive search over the t/ directory it is quite possible that a new directory of tests is added which actually never runs. In Perl#20637 (comment) Tony C noticed that I had added a new test file t/op/hook/require.t which is in a new subdirectory t/op/hook/ which was unknown to t/harness and thus not actually being run by make test_harness. (This patch does NOT add t/op/hoop to the list of directories to scan, I will do that in the PR.) I then took the time to add code to detect if any other test files are not being run, and it turns out that it is also the case for the new t/class/ directory of tests and it is also the case for the tests for test.pl itself, found in the t/test_pl directory. This patch adds logic to detect if this happens and make t/harness die if it finds a test file in the manifest which will not be detected by the custom rules for finding test files that is used in t/harness. It does not die if t/harness finds tests that are not in MANIFEST, that should be detected by a different test. The level of complexity in finding and deciding the tests files that we should run, and the differences between t/TEST and t/harness is fairly high. In the past Nicholas put some effort into unifying the logic, but it seems since then we have drifted apart. Even though t/harness uses t/TEST and the _tests_from_manifest() function, for some time now it has only used it to find which extensions to test, not which test files to run. I have *NOT* dug into whether t/TEST is also missing test files that are in manifest. That can happen in a follow up patch. Long term we should unify all of this logic so that t/TEST and t/harness run the same test files always, and that we will always detect discrepancies between the MANIFEST and the tests we are running. We do not for instance test that they test the same things. :-) :-(
This PR adds a new
%{^HOOK}
hash, similar to%SIG
, but for hooking keywords. The hash is magic in a similar way as%SIG
, in that it validates its contents on store and actually maps to internal variables which store the values instead. The first two members of the hash arerequire__before
andrequire__after
. Members are expected to be of the form$keyword . "__" . $phase
, where$keyword
is the keyword that executes the hook, and$phase
is when the hook is executed (Currently either "before" or "after"). The initial user visible implementation includes support for a hook which is executed before require, and a hook which is executed after require. Therequire__before
hook may return a callback that is also executed after the require, but can be constructed in such a way that it shares state with the before hook that executed. The before hooks execute before nearly any otherrequire
related logic, including checking if the required item was alreadyrequired
, and the after hooks executed after therequire
completed, regardless as to whether the require died or was successful.With the
required__before
hook we could fix Devel::TraceUse to not alter the stack when it is in use. It could also be used to detect circular dependencies, log all used modules, or track the memory and time taken to require a given module (with or without its dependencies).In order to implement this there are some nice goodies from an internal point of view.
mortal_destructor_sv()
andmortal_svfunc_x()
(and their macro wrappersSAVEMORTALDESTRUCTOR_SV()
andSAVEMORTALSVFUNC_X()
) can be used to execute a callback at the end of the current statement (either Perl or C callback respective). Also there is theSvREFCNT_dec_set_NULL()
macro which can be used to decrement anSV*
pointer and set it to NULL in a single statement, and a related functionSvREFCNT_dec_ret_NULL()
which decrements an SV's refcount and returns NULL, which is used by theSvREFCNT_dec_set_NULL()
macro but can also be helpful in cases where a function wants to do this to an argument and then return NULL. There are also various other minor improvements toregen/mg_vtable.pl
(trimming trailing whitespace on generated documentation and renaming of some confusingly named internal variables) and improvements to the documentation of the related functions inpod/perlguts.pod
.Everything has documentation, including the new utility functions and macros.
@haarg @ap @book