-The ref() PP keyword has extremely high usage. Grepping my blead repo shows:
Searched "ref(" 4347 hits in 605 files of 5879 searched
-The strings that the ref() keyword returns are part of the Perl 5 BNF grammar.
This is not up for debate. Changing their spelling, lowercasing them,
i18n-ing them dynamically in real time against glibc.so's current
process-global locale, or wiring inotify/kqueue into the runloop to
monitor /etc or /var, are all not up for debate, just so that this race
condition works as designed in a unit test:
$perl -E "dire('hello')"
Routine indéfinie &cœur::dire aufgerufen bei -e Zeile 1
-sv_reftype() and sv_ref() have very badly designed prototypes, and the
first time a dev new to Perl's C internals reads their source code, they
will think these 2 will cause infinite C stack recursion and a SEGV. Most
automated C static analysis tools will probably complain that these 2
functions recurse infinitely too.
-The 2 functions don't return a string length, forcing all callers to
execute a libc strlen() call on a string that could be 8 bytes or 80 MB.
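To illustrate the strlen() point, here is a minimal C sketch, not Perl core code. The lenstr struct, the type_id enum, and both accessor functions are hypothetical stand-ins; the only claim is that a return value which carries its length (as a HEK* does) saves the caller a scan of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string, standing in for a HEK*. */
typedef struct {
    size_t      len;  /* byte length, computed once at compile time */
    const char *str;  /* NUL-terminated bytes */
} lenstr;

/* sizeof() on a string literal includes the trailing NUL, hence the -1. */
#define LENSTR_LIT(s) { sizeof(s) - 1, s }

/* Hypothetical type ids, not Perl's svtype enum. */
enum type_id { T_SCALAR, T_ARRAY, T_HASH, T_CODE, T_COUNT };

static const lenstr type_names[T_COUNT] = {
    LENSTR_LIT("SCALAR"),
    LENSTR_LIT("ARRAY"),
    LENSTR_LIT("HASH"),
    LENSTR_LIT("CODE"),
};

/* Old style: only a char* comes back, so the caller must strlen() it. */
static const char *type_name_cstr(enum type_id t)
{
    assert((unsigned)t < T_COUNT);
    return type_names[t].str;
}

/* New style: pointer and length travel together, no strlen() needed. */
static const lenstr *type_name_lenstr(enum type_id t)
{
    assert((unsigned)t < T_COUNT);
    return &type_names[t];
}
```

A caller of type_name_lenstr(T_HASH) reads ->len directly, where a caller of type_name_cstr(T_HASH) has to walk the bytes to find the terminator.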
-The 2 functions don't split, parse, cat, or glue multiple strings to
create their output. All NUL-terminated strings that they return are
already sitting in virtual address space: either const HW RO, or
RCed HEK*s from the PL_strtab pool that were found inside something
similar to a GV*/HV*/HE*/CV*/AV*/GP*/OP*/SV* in an OP* (no threads).
-By current policy, COW 255 buffers from Newx() under 9 chars can't COW.
CODE is 4, SCALAR is 6, HASH is 4, ARRAY is 5. But very short SV HEK* COWs
will COW propagate without problems.
-PP code "if(ref($self) eq 'HASH') {}" should never involve all 3-4 calls
Newx()/Realloc()/strlen()/memcpy().
So fix all of this, and make pp_ref()/PP KW ref() closer in speed
to C/C++/Asm style object type checking, which is almost always going to
be 1, 2, or 3 ptr equality tests against C constant &sum_vtbl_sum_class,
or in Microsoft ecosystem SW, it's an equality test of a 16 byte GUID in
memory against a 16 byte SSE literal stored in an SSE opcode (TLDR ver).
Just convert backends sv_ref()/sv_reftype() to HEK* retvals, and convert
the front end pp_*() ops to fetch HEK*s and return SV*s with
POK_on SvPVX() == HEK*. In all likelihood, if the PP code is
"if (ref($self) eq 'HASH') {}", then during the execution of
memcmp(pv1, pv2, len) as part of pp_seq, pv1 and pv2 are the same mem addr.
But I didn't single step the eq operator to verify that yet.
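Why identical addresses matter can be sketched in plain C. This is an illustration under assumptions, not a copy of Perl's string-equality code: str_eq() and the NAME_* constants are hypothetical, and whether the real eq operator takes a pointer shortcut is exactly the part that was not verified above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared type names, standing in for pooled HEK* keys. */
static const char NAME_HASH[]  = "HASH";
static const char NAME_ARRAY[] = "ARRAY";

/*
 * Compare two strings whose lengths are already known.
 * If both sides point at the same shared buffer, the byte-by-byte
 * comparison can be skipped entirely.
 */
static bool str_eq(const char *pv1, size_t len1,
                   const char *pv2, size_t len2)
{
    assert(pv1 && pv2);
    if (len1 != len2)
        return false;              /* different lengths, cannot be equal */
    if (pv1 == pv2)
        return true;               /* same address, trivially equal */
    return memcmp(pv1, pv2, len1) == 0;  /* fall back to comparing bytes */
}
```

When both operands come out of the same string pool, the second test succeeds and no bytes are read. A private copy of "HASH" at a different address still compares equal, just through the slower memcmp() path.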
-Inside PP(pp_reftype), the branch sv_setsv(TARG, &PL_sv_undef); previously
did not fire SMG. After this commit it does. IDK why it wasn't firing
before, or the consequences of SMG firing now on the sv_set_undef(rsv); path.
-I suspect "sv_setsv(TARG, &PL_sv_undef);" and "sv_set_undef(rsv);" are
not perfect behavior copies of each other in extreme/bizarre/user error
and bad CPAN XS code situations, but I haven't found any side effects of
the switch from sv_setsv(TARG, &PL_sv_undef); to sv_set_undef(rsv).
Untested hypothetical cases like:
sv_setsv(gv_star, &PL_sv_undef); sv_setsv(hv_star, &PL_sv_undef);
sv_setsv(svt_regexp_star, &PL_sv_undef);
sv_setsv(svt_invlist_star, &PL_sv_undef);
sv_setsv(svt_object_star, &PL_sv_undef);
sv_setsv(svt_io_star, &PL_sv_undef);
-sv_sethek() has a severe pathological performance problem if args
SV* dsv and HEK* src_hek test true for
if(SvPVX(dsv) == HEK_KEY(src_hek)) {}.
But it's still better than a strlen()/Newx()/memcpy()/push_save_stack()/
delayed_Safefree(); cycle. Any fix for this would be for the future.
-These 2 functions are experimental for now, hence undocumented and not
public API. If they are made public, arg "const int ob" should be removed
because of its confusing faux-infinite recursion (not real life
infinite recursion). The functions are exported so P5P hackers and
CPAN XS devs (unsanctioned by P5P) can benchmark and research these 2 new
functions using Inline::C/EU::PXS.
-future improvements not done here, make sv_reftype() and sv_ref() wrappers
around their HEK* counterparts. Note the HEK* must be RC++ed and stuffed
in a new SV*, or a PAD TARG SV*, before the rpp_replace_1_1_NN(TARG); call
because in artificial situations/fuzzing, strange things can happen during
a SvREFCNT_dec_NN(); call, and the HEK* sitting in a C auto might
get freed during the SvREFCNT_dec_NN();
-another improvement, sv_sethek(rsv, hek); is somewhat heavy, and doesn't
have a shortcut, to RC-- an existing SVPV HEK* COW itself, instead it
uses SV_THINKFIRST_***() and sv_force_normal***() to RC-- an existing
SVPV HEK* COW. If the SV* PAD TARG is being used over and over by the ref()
opcode, it's always going to have a stale HEK* SVPVX() that needs to be
RC--ed.
-another improvement, check if(sv_reftypehek() == SvPVX(targ)) before
calling sv_sethek(rsv, hek);
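The "check before set" idea can be sketched with a toy refcounted key. Everything here is hypothetical (shared_key, target, target_set_key); it models a reused target that already holds the same shared string, not the real sv_sethek() internals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted shared string, standing in for a HEK*. */
typedef struct {
    unsigned    refcnt;
    const char *key;
} shared_key;

/* Hypothetical reusable result slot, standing in for a PAD TARG SV*. */
typedef struct {
    shared_key *pv;   /* currently held shared key, or NULL */
} target;

/*
 * Point the target at key k.
 * Returns 0 when the target already holds k (nothing to do),
 * 1 when the refcounts had to be updated.
 */
static int target_set_key(target *t, shared_key *k)
{
    assert(t && k);
    if (t->pv == k)
        return 0;          /* shortcut: same key, skip all RC traffic */
    k->refcnt++;           /* take the new reference first */
    if (t->pv)
        t->pv->refcnt--;   /* then drop the stale one */
    t->pv = k;
    return 1;
}
```

If the same ref() op keeps seeing the same kind of reference, the shortcut branch is the common case and the refcounts are never touched.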
-another improvement, beyond scope for me, make into 1 OP*/opcode:
if(ref($self) eq 'HASH')
and
if(ref($self) eq 'ARRAY')
-another improvement, don't deref my_perl->Iop/PL_ptr many times in a row.
I didn't do any CPU opcode/instruction stripping in this commit. That's
for a future commit.
-another improvement, investigate if most of large switch() inside
Perl_sv_reftypehek() can be turned into a
const I8 arr_of_PL_sv_consts_idxs[]; with a couple tiny special cases.
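A rough sketch of the switch-to-table idea, under stated assumptions: the sv_kind and name_idx enums and the kind_name() function are hypothetical and much smaller than the real svtype list, int8_t stands in for Perl's I8, and the "tiny special cases" mentioned above are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical value kinds, not Perl's real svtype enum. */
enum sv_kind { K_NULL, K_IV, K_PV, K_AV, K_HV, K_CV, K_GV, K_COUNT };

/* Hypothetical indexes into a table of cached name strings. */
enum name_idx { N_SCALAR, N_ARRAY, N_HASH, N_CODE, N_GLOB, N_COUNT };

static const char *const names[N_COUNT] = {
    "SCALAR", "ARRAY", "HASH", "CODE", "GLOB"
};

/* One small signed byte per kind replaces a large switch(). */
static const int8_t kind_to_name[K_COUNT] = {
    [K_NULL] = N_SCALAR,
    [K_IV]   = N_SCALAR,
    [K_PV]   = N_SCALAR,
    [K_AV]   = N_ARRAY,
    [K_HV]   = N_HASH,
    [K_CV]   = N_CODE,
    [K_GV]   = N_GLOB,
};

/* Table lookup: two array reads, no branching on the kind. */
static const char *kind_name(enum sv_kind k)
{
    assert((unsigned)k < K_COUNT);
    return names[kind_to_name[k]];
}
```

Any kind that needs extra logic would be tested before the table lookup, so the table itself stays a flat array of small integers.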
-todo invert "if (!rsv) {" branch, so the hot path (yes, cached in PL_sv_consts)
comes first in machine code/asm order.