perf: reduce the number of traversals through posts #5119

Merged (1 commit) on Nov 29, 2022

stevenjoezhang (Member) commented Nov 28, 2022

What does it do?

Calling post.tags[0].length is slow because Hexo fetches all posts associated with that tag, which is unnecessary.
Tested with 8000 posts (https://github.com/SukkaLab/hexo-many-posts copied 16 times, with https://github.com/hexojs/hexo-starter), this patch reduces the time of hexo g from ~17 min to ~11 min.
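
To illustrate why a per-tag length lookup gets expensive, here is a minimal sketch of the general pattern. It is a simplified model, not Hexo's or warehouse's actual code: `buildPostTags`, `countByScanning`, and `countWithOnePass` are hypothetical names, and the join table is just a plain array. The point is only that scanning every record each time a tag's size is requested costs posts × posts visits, while one pass with a counter map costs a single traversal.

```javascript
// Simplified model, NOT Hexo's real implementation. All names are hypothetical.

// Build a join table linking each post to one tag (post_id, tag_id).
function buildPostTags(postCount, tagCount) {
  const postTags = [];
  for (let postId = 0; postId < postCount; postId++) {
    postTags.push({ post_id: postId, tag_id: postId % tagCount });
  }
  return postTags;
}

// Slow pattern: for every post, scan the whole join table to learn
// how many posts share its tag.
function countByScanning(postTags) {
  let visited = 0;
  const sizes = [];
  for (const { tag_id: tagId } of postTags) {
    let size = 0;
    for (const record of postTags) {
      visited++;
      if (record.tag_id === tagId) size++;
    }
    sizes.push(size);
  }
  return { sizes, visited };
}

// Faster pattern: traverse the join table once and keep a count per tag.
function countWithOnePass(postTags) {
  let visited = 0;
  const sizeByTag = new Map();
  for (const record of postTags) {
    visited++;
    sizeByTag.set(record.tag_id, (sizeByTag.get(record.tag_id) || 0) + 1);
  }
  const sizes = postTags.map(record => sizeByTag.get(record.tag_id));
  return { sizes, visited };
}

const postTags = buildPostTags(1000, 10);
const slow = countByScanning(postTags);
const fast = countWithOnePass(postTags);
console.log(slow.visited, fast.visited); // 1000000 1000
```

Both functions return the same sizes; only the number of records visited differs. Whether the real speedup comes from exactly this shape of loop is an assumption of the sketch.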

TODO: PostTag.find is also slow and should be replaced with a better data structure in the future.
See #2579 (comment)
See also #3624
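
As one possible direction for the TODO above, a lookup keyed by tag id avoids scanning every post-tag record on each query. The sketch below is an assumption about what a "better data structure" could look like, not a proposal taken from this PR: `TagIndex` and its methods are hypothetical and are not part of Hexo or warehouse.

```javascript
// Hypothetical sketch: TagIndex is NOT an existing Hexo or warehouse API.
// It maps each tag id to the set of post ids that carry that tag.
class TagIndex {
  constructor() {
    this.postsByTag = new Map(); // tag_id -> Set of post_id
  }

  // Record that a post carries a tag.
  add(postId, tagId) {
    let posts = this.postsByTag.get(tagId);
    if (!posts) {
      posts = new Set();
      this.postsByTag.set(tagId, posts);
    }
    posts.add(postId);
  }

  // Remove a post-tag link; drop the tag entry once it is empty.
  remove(postId, tagId) {
    const posts = this.postsByTag.get(tagId);
    if (!posts) return;
    posts.delete(postId);
    if (posts.size === 0) this.postsByTag.delete(tagId);
  }

  // Post ids for a tag, without scanning unrelated records.
  postsFor(tagId) {
    return [...(this.postsByTag.get(tagId) || [])];
  }

  // Number of posts for a tag, read directly from the set size.
  count(tagId) {
    const posts = this.postsByTag.get(tagId);
    return posts ? posts.size : 0;
  }
}

const index = new TagIndex();
index.add('post-1', 'tag-a');
index.add('post-2', 'tag-a');
index.add('post-2', 'tag-b');
console.log(index.count('tag-a'), index.postsFor('tag-b'));
```

The trade-off is that the index has to be kept in sync whenever a post or tag changes, which is the cost paid for constant-time lookups.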

Screenshots

Pull request tasks

  • Add test cases for the changes.
  • Passed the CI test.

@coveralls
Coverage Status

Coverage increased (+0.002%) to 98.673% when pulling 5f42a03 on perf into ca51e15 on master.

stevenjoezhang (Member, Author) commented

Although this patch can mitigate performance issues, Hexo still spends a lot of time dealing with post tags when the site has a lot of posts.

As shown in the figure below, nearly half of the processing time is related to the tags.

[Screenshot 2022-11-28 23:11:19]

It's difficult to optimize this further with the current warehouse implementation. Perhaps we could consider supporting other database backends such as SQLite.

SukkaW (Member) commented Nov 29, 2022

> Although this patch can mitigate performance issues, Hexo still spends a lot of time dealing with post tags when the site has a lot of posts.
>
> As shown in the figure below, nearly half of the processing time is related to the tags.
>
> [Screenshot 2022-11-28 23:11:19]
>
> It's difficult to make further optimizations for this issue with the current warehouse implementation. Perhaps we could consider supporting other database backends such as sqlite.

IMHO warehouse is still our best possible solution.

warehouse keeps all the data in memory, which makes it faster for smaller websites (which most Hexo-powered sites are), although it might not scale well to larger sites.
