-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OPTIMISATION] Improving argument parsing #2283
Comments
@PierreQuentel I started to rewrite Note: We can't give positional and It seems currently x2 to x5 faster than current implementation. Do you see any issues with this parser (cf below) ? function SimpleParser(fct, args) {
const result = {};
const kargs = args[args.length-2];
const kwargs = args[args.length-1];
const args_names = fct.$infos.arg_names;
const args_defaults = fct.$infos.__defaults__;
const default_offset = args_names.length - args_defaults.length;
const varargs_name = fct.$infos.vararg;
const kwargs_name = fct.$infos.kwarg;
const max = Math.max( args.length-2, args_names.length );
// positional parameters...
let offset = 0;
for( ; offset < max ; ++offset)
result[ args_names[offset] ] = args[offset];
// vararg parameter
if( varargs_name !== null )
result[varargs_name] = args.slice( args_names.length, -2 );
// positionnal only
if( kargs === null && kwargs === null ) {
if( default_offset < offset )
throw new Error('XXX');
// default parameters
for(let i = offset - default_offset;
i < args_defaults.length;
++i)
result[ args_names[offset++] ] = args_defaults[i];
if( kwargs_name !== null )
result[kwargs_name] = __BRYTHON__.obj_dict({});
return result;
}
//TODO: named / **args arguments
//TODO: *arg / **args parameters
return result;
} |
@PierreQuentel Normally I should have a complete version now (?). I didn't fully tested it. Do you see an issue with this implementation ? On the very limited test I made, I am generally x1.7 faster than current implementation. In fact it is even faster as my measures include the time taken by the loop (so in fact I am normally ~x1.85 faster). Some variables can still be precomputed to speed things up, and maybe some little tweaks could increase performances a little more :
There are some changes for function calls. The last arguments is always named arguments, therefore we don't need the strange Here the web page I used to test it : test.html.zip And here my argument parser : function SimpleParser(fct, args) {
const result = {};
//TODO: rename :
// - args = when calling the function.
// - params = when declaring the function.
const args_names = fct.$infos.arg_names;
const nb_pos_params = args_names.length;
const varargs_name = fct.$infos.vararg;
const kwargs_name = fct.$infos.kwarg;
const nb_pos_args = args.length-1;
const min = Math.min( nb_pos_args, nb_pos_params );
const kargss = args[nb_pos_args];
// positional parameters...
let offset = 0;
for( ; offset < min ; ++offset)
result[ args_names[offset] ] = args[offset];
// vararg parameter
//TODO: NOT SURE FOR kargss !!!!
if( varargs_name !== null )
result[varargs_name] = args.slice( nb_pos_params, -1 );
else if( nb_pos_args > nb_pos_params ) {
throw new Error('Too much pos parameters');
}
// positionnal only
if( kargss === null ) {
const args_defaults = fct.$infos.__defaults__;
const nb_defaults = args_defaults.length;
const default_offset= nb_pos_params - nb_defaults;
if( default_offset < offset )
throw new Error('Not enough pos parameters');
// default parameters
for(let i = offset - default_offset;
i < nb_defaults;
++i)
result[ args_names[offset++] ] = args_defaults[i];
if( kwargs_name !== null )
result[kwargs_name] = __BRYTHON__.obj_dict({});
return result;
}
const kargs_names = fct.$infos.karg_names ?? args_names.slice(); // TODO: missing !!!
const keys = kargs_names.slice(offset);
// other structs possibles ???
const extra = {};
for(let id = 0; id < kargss.length; ++id ) {
const
[test.html.zip](https://github.com/brython-dev/brython/files/13060472/test.html.zip)
kargs = kargss[id];
for(let argname in kargs) {
let i = keys.indexOf(argname);
if( i === -1) {
if( kwargs_name === null )
throw new Error('Unfound named parameters or duplicate');
// not quite optimized for *kwargs parameters...
// no need for keys indexOf if *kwargs parameters.
if(argname in result || argname in extra)
throw new Error('Defined many times !');
extra[argname] = kargs[argname];
continue;
}
result[ argname ] = kargs[argname];
//delete keys[i];
keys[i] = null;
}
}
if( kwargs_name !== null )
result[kwargs_name] = __BRYTHON__.obj_dict(extra);
const kargs_defaults= fct.$infos.__kwdefaults__;
// checks default values...
for(let ioffset = 0; ioffset < keys.length; ++ioffset) {
const key = keys[ioffset];
if( key !== null ) {
if( ! (key in kargs_defaults ))
throw new Error('missing values !!');
result[key] = kargs_defaults[key];
}
}
return result;
} |
Hmmm... it seems I can be even faster for named argument processing :
This could be tested. But I think, first, we should determine whether the proposed implementation is correct. |
Still doing some tests (but I'll try to stop now xD).
Sources: parse_args.zip See also : https://jfmengels.net/optimizing-javascript-is-hard/ |
Denis, I'm sorry but I need a Pull Request with a complete replacement for argument parsing, that is, a new version of the functions in Otherwise it's impossible to know if your version correctly covers all the cases, and comparing its speed to that of the current implementation won't be useful. |
Something isn't right. I downloaded the zip from github, remplaced Are the tests relying more on EDIT: I throw an exception in my new function and still pass all the tests. |
Also :
|
I don't know what is wrong here. You don't have to run Can you call /src/py_utils.js in the address bar to see if it is the version you have modified ? The error message in |
It seems the files are cached. I have to go to Now I get some errors :) |
It seems to work quite well (for now I got an error at line 529 of the basic test suite). Got few small errors to fix :
My code may be a little slower as I don't have access to some pre-computed data in the format I'd need, so I have to compute them when parsing arguments. For now, I also use the original So we'll have to discuss a little once I'd pass all tests ;). I also have one question :
|
Another question (sorry I am not quite used to Python) :
|
Yeah I'm stuck at 1043 of the basic test suite because of this >>> class X:
... def foo(self, **args):
... print(args)
...
>>> x = X()
>>> x.foo(self=x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: X.foo() got multiple values for argument 'self'
>>> class X:
... def foo(self, /, **args):
... print(args)
...
>>> x = X()
>>> x.foo(self=x)
{'self': <__main__.X object at 0x7feb070463e0>} EDIT: I think I understand it now. |
Your guess is correct. The reference here is the Language Reference, specifically the section about function definition. |
Thanks. I'm struggling to fix little details like this xD. |
^^ Normal time with the current implementation : ~45 s. However, comparison isn't fair :
|
Currently, before any code cleaning, I am (on the very limited tests I made) :
But as I said before, tests are biased against my method as I test them first. |
Added a little shortcut, now x1.3 to x1.7 faster for positional arguments (ofc on the very limited speed tests I made). Could be faster if I pre-computed stuff upon function creation. EDIT: Also could remove some tests if functions had several |
@PierreQuentel What should be the next step knowing that :
|
Great news Denis ! Can you submit a PR so that I can test what you have done so far ? |
Ok. Tomorrow I'll rename some variables and add some comments, before submitting a PR. |
PR made. |
Improve arguments parsing (see #2283)
To speed up even more performances :
To have cleaner code :
|
Having a different version of def f(x):
pass
try:
f() # here, f() has no defaults
raise Exception('should have raised TypeError')
except TypeError:
pass
f.__defaults__ = (0,)
# f() now has a default value for x, so f() no more raises TypeError
f() |
WTF Python xD. Well, it could be possible if |
|
If we do such check before each call of I think it'd be better to initially set a |
Agreed. In fact, a specific, optimized argument parsing function could be created for each function and updated if |
Thanks. Mmm... I think I might be able to do it. |
Okay, here my road map :
|
What do you mean by "attribute getter" ? If it is attribute resolution (compute the result of |
Sorry, it was a typo. I meant the setter for |
Génial ! I could reproduce the same results, around 25% faster than the current implementation. The next step would be to set the parsing function, after the section in // Set parsing function
// Compute arguments required for generate_args0, based on those defined
// in this function (has_posonlyargs, _defaults, kw_defaults, etc.)
let hasPosOnly = ...,
js += `${name2}.arg_parser = $B.generate_args0(${hasPosOnly}, ...)\n` Setting function defaults is done by |
It doesn't work for the function If I set EDIT: but if I put it in |
I passed all unit test. @PierreQuentel I have several questions :
|
Okay, I think I'm done for the current PR. |
It's dangerous to access the dictionary keys / values from I found another issue with your new implementation. In Python, an instance of a class with methods def f(**kw):
return kw
class D:
def keys(self):
yield 'a'
def __getitem__(self, key):
return 99
d = D()
result = f(**d)
assert result == {'a': 99}, result |
Ok, then I'll pre-compute a
I assume dictionaries also have this It'll be slower when parsing |
Done. Maybe not in the most optimal way to loop over dict elements for I may be wrong, but I don't think you have an unit test for when we give a dict with non-string key as a |
Could you make a PR to fix this issue in the current development version ? |
Wouldn't it be better to directly merge #2316 as it seems faster in all conditions (-5% to -10% in total exec time / -20% when no heavy browser opti) ? I just need to remove "USE_PERSO_ARGS0_EVERYWHERE" condition once you validate it. |
Done. |
|
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Tsk, previous idea (I hid it) won't work as functions can be substituted/redefined in Python... Such a shame, this could have lead to precomputation of part of the result which would make argument parsing almost without any cost in some cases... Python really have rules that prevents lot of optimizations... |
Okay, still, I got a great idea of optimization for calls using named arguments. Such calls is written as : $B.$call(....)(a,b, {$kw: [{}, ...]}) Meaning that, when we do not have a So, inside the parser, instead of having to create a new This can possibly generate a big speed up. Of course, we could even rewrite function calls as : $B.$call(....)(a,b,
2, // pre-compute number of named arguments
{a: 32, b:34}, // named arguments
null // **kargs arguments, can be [...] if there are some
) Sure, it'll create an object even when we do not use named arguments, BUT, we'd be able to recycle it if function doesn't accept
EDIT: not a good idea, modification if forbidden in strict mode. |
… (reduces size of generated Javascript). Adapt $B.make_function_defaults to JS coding style; remove f.$defaults. Related to issue #2283.
In the commit above and the previous ones I have reduced the size of generated Javascript code for functions by delegating the creation of |
Cool. I have currently some work to do + some personal matter so I'm not quite available theses days. But once I'll have a little time I'll add the little opti I though for named arguments ;). |
When publishing 3.12.1 and running the speed test script, I observed that function creation was slower than before, which is logical since the function argument parser is built on function creation, and it takes a little time. In commit 4f5bc28 I have slightly modified the feature : the parser is only built the first time the function is called. With this change, function creation is now faster, and execution is the same. |
There are several things :
Please note that
Hum, you add a condition ( Maybe you can do it like that to remove the condition: function make_args_parser_then_parse(fct, args) {
fct.args_parser = $B.make_args_parser(fct); // replace the initial fct.args_parser by the newly created parser.
return fct.args_parser(fct, args); // you can do it in one line if you want.
}
// create a new function:
function foo() {} // do stuff
foo.args_parser = make_args_parser_then_parse; // called upon first parsing, create the parser then parse.
// consecutive parsing will call directly the parser. In the same way, maybe there is a way to build the function // create a new function:
function foo() {} // do stuff
Object.defineProperty("$infos", {
// some config.
get: function() { const $infos = createInfos(fct); Object.defineProperty("$infos", $infos); return $infos; }
}); Something like that. Then |
Hum Do we need I'll also have to do some benchmarks to see where this extra-cost comes from. |
Made a PR ( #2336 ) for the little opti I though about (x1.11 in parsing of named argument when no |
Current tasklist: #2283 (comment)
See [2275] for other optimizations about function calls.
========================================================
I started working on the optimisation of
$B.args0()
.I will edit this first message in the future to give a link to a github dedicated to this project + to give a summary of the results and the current progression of the project.
I will use the messages below to discuss about this issue.
Some ideas here.
The text was updated successfully, but these errors were encountered: