It is already done for nesting_limit. Each of the options should be transformed into C objects during init so that _traverse_node() uses as few Python objects as possible.
The gumbocy.html file generated after make cythonize is useful for seeing which lines use Python objects.
What would be the most efficient C type for lookups, to replace the Python sets like attributes_whitelist?
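One possible answer, as a sketch rather than a benchmarked recommendation: since the code is already compiled as C++, a `std::unordered_set<std::string>` built once at init gives average O(1) membership tests that accept a raw `char*`. The `Options` struct and the `make_options` / `is_whitelisted` helpers below are hypothetical names for illustration, not existing gumbocy code.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical holder for options converted once at init time.
struct Options {
    std::unordered_set<std::string> attributes_whitelist;
};

// Build the C++ set a single time, e.g. from the Python set passed at init.
Options make_options(const std::vector<std::string>& whitelist) {
    Options opts;
    opts.attributes_whitelist.insert(whitelist.begin(), whitelist.end());
    return opts;
}

// Membership test on a raw C string. The const char* is converted to a
// temporary std::string key; no Python object is involved.
bool is_whitelisted(const Options& opts, const char* attr_name) {
    return opts.attributes_whitelist.count(attr_name) > 0;
}
```

For a whitelist of only a handful of names, a sorted array with a linear or binary search might well be just as fast, so this is worth profiling before committing to it.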
Most of the options have been converted to C variables.
There are probably some more optimizations left in the parsing of CSS class names (split in C instead of using Python's re?), but we should do more profiling first to see where the real bottlenecks are.
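For reference, splitting a class attribute without `re` could look roughly like the sketch below. It assumes the current regex only splits on whitespace, and it uses the ASCII whitespace characters that HTML treats as token separators. `split_class_names` is a hypothetical helper, not an existing function.

```cpp
#include <string>
#include <vector>

// ASCII whitespace that separates tokens in an HTML class attribute.
static bool is_html_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

// Split a NUL-terminated attribute value into class names in one pass.
std::vector<std::string> split_class_names(const char* value) {
    std::vector<std::string> out;
    const char* p = value;
    while (*p) {
        // Skip any run of separators.
        while (*p && is_html_space(*p)) ++p;
        const char* start = p;
        // Advance to the end of the current token.
        while (*p && !is_html_space(*p)) ++p;
        if (p > start) out.emplace_back(start, p);
    }
    return out;
}
```

Whether this beats the regex in practice still depends on the profiling mentioned above.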
From my tests, more than 80% of the time is usually spent in gumbo.parse. I'm not sure what we can do about that on our side, so the largest speedups will probably have to come from upstream.
A huge general speedup was gained thanks to #8, but it also re-introduced a lot of Python objects in the code.
There are a bunch of places where we go through Python strings just to lowercase them, for instance. The attribute values are also stored as a Python dict, but a C++ map would probably be much faster (mostly because it would keep all its values as char*).
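A rough sketch of both ideas, under the assumption that the attribute name and value buffers stay alive for as long as the map is used and that names may be modified in place. `CStrLess`, `AttrMap`, `ascii_lower_inplace` and `get_attr` are hypothetical names. The custom comparator matters here: a plain `std::map<const char*, ...>` would compare pointer addresses, not string contents.

```cpp
#include <cstring>
#include <map>

// Orders C strings by content rather than by pointer value.
struct CStrLess {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// Keys and values are borrowed pointers; their buffers must outlive the map.
using AttrMap = std::map<const char*, const char*, CStrLess>;

// Lowercase ASCII letters in place, without a round-trip through a Python str.
void ascii_lower_inplace(char* s) {
    for (; *s; ++s) {
        if (*s >= 'A' && *s <= 'Z') {
            *s = static_cast<char>(*s - 'A' + 'a');
        }
    }
}

// Return the attribute value, or nullptr when the attribute is absent.
const char* get_attr(const AttrMap& attrs, const char* name) {
    auto it = attrs.find(name);
    return it == attrs.end() ? nullptr : it->second;
}
```

Because nothing is copied, building the map per node costs only the tree insertions, but the borrowed pointers are only valid while the parser output they point into has not been freed.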