Optimize json parsing for faster indexing in elasticsearch #6130
Comments
I'm lacking context here. What are the ideas around this?
I don't currently have any "optimization ideas".
We use the json package from the std lib,
so I'm not sure how we can improve that.
I imagine it's the HTML parsing with pyquery that is the slow part.
I think most of it is because of the HTML parsing, but we can also use a faster
pyquery by default uses and it is quite slow. Edit: html5_parser doesn't seem to work as expected. 😕
I believe we fixed this already 👍
Optimizing the json parsing will result in faster indexing/reindexing.
https://github.com/readthedocs/readthedocs.org/blob/master/readthedocs/search/parse_json.py
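One way to check whether the time goes to JSON decoding or to HTML parsing is to time the two stages separately. The following is only a rough sketch, not the code in parse_json.py: the `body` key, the synthetic payload, and the helper names `extract_text` and `time_stages` are all made up for illustration, and the stdlib `html.parser` stands in for pyquery so that the snippet runs without extra dependencies. Absolute numbers will differ from the real pipeline, but the same pattern can be pointed at the actual parsing function.

```python
import json
import time
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes; a stdlib stand-in for the pyquery step."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(html):
    """Return the concatenated text content of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def time_stages(raw_json, repeats=20):
    """Return (seconds spent in json.loads, seconds spent extracting text)."""
    start = time.perf_counter()
    for _ in range(repeats):
        doc = json.loads(raw_json)
    json_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(repeats):
        extract_text(doc["body"])  # "body" is a hypothetical key
    html_seconds = time.perf_counter() - start
    return json_seconds, html_seconds


# Hypothetical payload: a single "body" field holding rendered HTML.
sample = json.dumps(
    {"title": "Example", "body": "<div><p>Hello <b>world</b></p></div>" * 2000}
)
json_seconds, html_seconds = time_stages(sample)
print(f"json.loads: {json_seconds:.4f}s, HTML extraction: {html_seconds:.4f}s")
```

If the HTML stage dominates on real build output as well, swapping the JSON library is unlikely to change indexing time much.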