Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve bat startup speed #951

Closed
sharkdp opened this issue Apr 26, 2020 · 3 comments · Fixed by #2181
Closed

Improve bat startup speed #951

sharkdp opened this issue Apr 26, 2020 · 3 comments · Fixed by #2181
Labels
help wanted Extra attention is needed performance
Milestone

Comments

@sharkdp
Copy link
Owner

sharkdp commented Apr 26, 2020

The startup speed of bat is currently (v0.15) around 50 ms, which is on (or even past) the edge of being noticeable by humans.

It would be great if this situation could be improved.

The startup speed of bat can be measured with hyperfine:

hyperfine \
  --warmup 10 \
  --export-markdown bat_startup.md \
  'bat --no-config --color=always'

On my laptop, this results in:

Command Mean [ms] Min [ms] Max [ms] Relative
bat --no-config --color=always 45.3 ± 0.7 44.0 47.4 1.00

If we use perf to get a profile

perf record --call-graph dwarf bat --no-config --color=always < /dev/null

image

we can see that most of the time is spent deserializing the stored syntaxes and themes via bat::assets::assets_from_cache_or_binary.

There are several ideas that come to mind which could improve the situation:

  • Somehow parallelize the deserialization process (maybe by splitting up the binaries into smaller chunks. After all, they are made out of ~100 different syntaxes and themes)
  • Only load syntaxes and themes "on demand". If syntaxes and themes were serialized separately, we might be able to load them on demand. It's probably not as easy as it sounds, because syntaxes are interlinked (code blocks in markdown files could require the loading of all kinds of syntaxes, for example).
  • Try to look into ways to improve the deserialization speed.
  • The assets are currently stored in compressed form. I think I have tried storing them in uncompressed form once (and didn't see any performance improvements), but it might be worth re-evaluating that.

We should also validate that we are indeed CPU bound here. The bat binary is quite large, due to the fact that all syntaxes and themes are included within the binary. It might take some time to simply load that from disk (or cache).

@sharkdp sharkdp added help wanted Extra attention is needed performance labels Apr 26, 2020
@eth-p
Copy link
Collaborator

eth-p commented Apr 27, 2020

Adding some profiling data from MacOS Mojave (with a SSD):

Note: The numbers are going to be off (i.e. about 2x worse) due to the added overhead from running a profiler.

Build: RUSTFLAGS=-g cargo build --release
Command: target/release/bat --no-config --paging=never empty.cs

Total:
image

Syntax Loading:
image

Disk Latency:
image

Disk Usage:
image

Allocations:
image


It looks like around 31% of the time spent starting up was loading the executable image and initializing it (on MacOS), and around 63% of the total time (95% of main()) was parsing the syntax set, with 42% (60% of main()) being spent deserializing data using serde. I don't quite know where the extra 26% went though, unfortunately.

Meanwhile (based on the disk usage graph), only 2 milliseconds were spent paging in the executable.

I'm also inclined to say that the size of the executable/time spent paging isn't too much of an issue for us currently. 2 milliseconds loading the serialized asset data is significantly less than the 108 milliseconds spent deserializing it.

Additional things to note:
Although I passed --no-config as a command line flag, bat still opened the config file. Was that intended?

@Enselic
Copy link
Collaborator

Enselic commented Jul 18, 2021

I have made some progress on improving bat startup time through some prototyping work, and I would like to share that prototype here. I would love to get some feedback on it. Especially critisism!

The code for the prototype is here: https://github.com/Enselic/bat/compare/6ef2bb3283e1ba5f41316...Enselic:startup-speed-prototype-v1?expand=1

First, allow me to present some performance numbers from my (low-end) machine:

File under ./tests/syntax-tests/source bat master my prototype
example.zig 108.5 ms 25.8 ms
example.xml 102.1 ms 45.1 ms
example.md 123.1 ms 57.5 ms

One major limitation of the prototype is that I only made it work for
--language, and not for file extension or first-line. So for example, I use this command to benchmark .xml:

hyperfine 'bat --no-config --pager=never --color=always ./tests/syntax-tests/source/XML/example.xml --language xml'

It should be pretty straightforward to make it work for e.g. file extensions
too, but I don't want to spend time on it until I have gotten an external sanity
check on my overall approach. For now, I simply fallback to loading the full SyntaxSet in these cases.

We can see from the benchmark that small syntaxes such sa Zig, which only contains the Zig Syntax Definition, is pretty fast. Medium sized SyntaxSet such as XML (contains ["xml", "xsd", "xslt", "tld", "dtml", "rng", "rss", "opml", "svg"]) is slower, and even larger ones such as Markdown is even slower. But still a nice improvement I would say!

Some notable positive properties of the prototype:

  • All non-ignored cargo test tests passes.
  • Most (but not all) syntax regression tests passes.

Some notable negative properties:

  • The stripped release binary size grows from 4,7M to 7,7M. Can probably be optimized quite a bit though.
  • Only syntaxes embedded in the binary works, but should be straightforward to make it work for locally built cache too. I have prepared the code a bit for that.
  • I disregard any public API stability concerns or other backwards compatibility (e.g. asset metadata)

So, how does the prototype work? Roughly like this:

  • There are are two new assets. independent_syntax_sets.bin contains concatenated binary representations of independent syntax sets, and independent_syntax_sets_map.bin contains a small lookup datastructure so they can be found again.
  • The new assets are generated by analyzing dependencies between SyntaxDefinitions and creating independent SyntaxSets. By loading only the necessary linked syntax definitions, startup time is greatly improved.

(I realize I should write a lot more details of how it works, but I unfortunately don't have the time for that right now. The code is there for anyone to poke around with though :))

My current plan forward is to take the code from the prototype and turn it into several small and independent PRs that are easy to review and understand one by one, to the extent that is possible and makes sense.

Enselic added a commit to Enselic/bat that referenced this issue Jul 18, 2021
Since we only modify `pub(crate)` items, the stable bat-as-a-library API is not
affected.

This takes us one step closer to making SyntaxSet lazy-loaded, which in turn
takes us one step closer to solving sharkdp#951.
Enselic added a commit that referenced this issue Jul 19, 2021
Since we only modify `pub(crate)` items, the stable bat-as-a-library API is not
affected.

This takes us one step closer to making SyntaxSet lazy-loaded, which in turn
takes us one step closer to solving #951.
@Enselic
Copy link
Collaborator

Enselic commented Jul 26, 2021

Just wanted to elaborate a bit on the next steps I intend to take on the prototype. I will keep updating this comment.

I plan on doing the following sequential steps:

Enselic added a commit to Enselic/bat that referenced this issue Dec 1, 2021
This is for sharkdp#951. Syntax lazy-loading will come later and have a much bigger
impact, but lazy-loading of themes does not have negligible impact.
Enselic added a commit that referenced this issue Dec 6, 2021
This is for #951. Syntax lazy-loading will come later and have a much bigger
impact, but lazy-loading of themes does not have negligible impact.
@Enselic Enselic added this to the v0.21.0 milestone Mar 29, 2022
Enselic added a commit to Enselic/bat that referenced this issue May 4, 2022
Enselic added a commit that referenced this issue May 7, 2022
* Bump to syntect 5.0.0 to e.g. start lazy-loading themes

Closes #915
Closes #951
Closes #1846
Closes #1854

* Typo fix formated -> formatted

* Update CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants