-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize accessing PNG dimensions #600
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
I've noticed that a lot of the parsing is spent figuring out the dimensions of our icons, which very often are PNG images. Apparently to access the full header of a PNG file a lot of decoding needs to happen, which is apparently quite expensive. However we don't really need to access the full header and instead really only the width and height. And here the PNG specification comes in handy: https://www.w3.org/TR/2003/REC-PNG-20031110/ It says the following: > A valid PNG datastream shall begin with a PNG signature, immediately > followed by an IHDR chunk Each chunk is encoded as a length and type and then its chunk encoding. An IHDR chunk immediately starts with the width and height. This means we can model the beginning of a PNG file as follows: ```rust // 5.2 PNG signature png_signature: [u8; 8], // 5.3 Chunk layout chunk_len: [u8; 4], // 11.2.2 IHDR Image header chunk_type: [u8; 4], width: U32, height: U32, ``` And with a little bit of bytemuck magic we essentially can access the width and height with a single read. This improves parsing speed of entire splits files by up to 30%. Yes, the old code was that slow.
CryZe
added
enhancement
An improvement for livesplit-core.
performance
Affects the performance of the code.
parsing
This is about one of the parsers.
labels
Nov 10, 2022
CryZe
commented
Nov 10, 2022
@@ -433,7 +433,7 @@ fn parse_attempt_history(version: Version, reader: &mut Reader<'_>, run: &mut Ru | |||
pub fn parse(source: &str, path: Option<PathBuf>) -> Result<Run> { | |||
let reader = &mut Reader::new(source); | |||
|
|||
let mut image_buf = Vec::with_capacity(4096); | |||
let mut image_buf = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also snuck this in. Nowadays we know exactly how much memory we need for each image in here when we encounter it, so there's no need to preallocate anything if we are going to reallocate the memory anyway the moment we encounter an image.
CryZe
added a commit
that referenced
this pull request
Dec 29, 2022
- The `livesplit-hotkey` crate is now documented. (@CryZe) [#479](#479) - Not every key press emits a scan code on Windows. For those the virtual key code is now translated to a scan code. (@CryZe) [#480](#480) - Time parsing is now a lot more robust, handles more edge cases, and is also a lot more accurate. (@CryZe) [#483](#483) and [#578](#578) - When parsing a GDI based font name, platforms other than Windows now don't attempt to parse "normal" as part of the font name anymore as it is too ambigious. It could either refer to a font weight or stretch. (@kadiwa4) [#487](#487) - The text engine can now be customized. You can either provide your own text engine or use the one provided by `livesplit-core`. The one provided is now behind the `path-based-text-engine` and converts all glyphs to paths that can easily be drawn. (@CryZe) [#495](#495) - The path based text engine now caches the width of digits for tabular numbers, as well as the ellipsis glyph and its width, so that they can be layed out faster. (@kadiwa4) [#490](#490) and [#499](#499) - On Windows GDI is now used to resolve GDI based font names. (@CryZe) [#500](#500) - (Total) Possible Time Save now properly indicates that it's updating frequently. This results in faster rendering times. (@kadiwa4) [#501](#501) - Initial support for auto splitting has landed in `livesplit-core`. Auto splitters are provided as WebAssembly modules. Support can be activated via the `auto-splitting` feature. (@P1n3appl3) [#477](#477) - Auto splitting is also supported via the C API when activating its `auto-splitting` feature. (@DarkRTA) [#503](#503) - A watchdog for the Auto Splitting Runtime was added which unloads scripts that aren't responsive. (@CryZe) [#528](#528) - Splits and layouts can now be parsed and saved on `no_std` platforms. (@CryZe) [#532](#532) - The splits component column labels can now be queried via the C API. (@MichaelJBerk) [#526](#526) - The Software Renderer is now supported on `no_std` platforms. (@CryZe) [#536](#536) - The parsers are now faster because they don't allocate as much memory anymore. (@CryZe) [#546](#546) - The auto splitters have unstable support the `WebAssembly System Interface` via the `unstable-auto-splitting` feature. (@CryZe) [#547](#547) - The Timer component can now use the color of the delta for its background. (@Hurricane996) [#539](#539) - The splits component now takes the font into account when calculating the width of the columns. (@Hurricane996) [#550](#550) - The `Resource Allocator` now decodes the images, allowing the underlying renderer to do the encoding by itself. (@CryZe) [#562](#562) - Cargo's `--crate-type` parameter is now used to build the C API. (@CryZe) [#565](#565) - The columns of the splits component can now show the custom variables. (@CryZe) [#566](#566) - On the web, the `keydown` event may not always pass a `KeyboardEvent` despite the specification saying that this should be the case. This is now properly handled. (@CryZe) [#567](#567) - An integer overflow in the `FuzzyList` used for searching game and category names has been fixed. (@CryZe) [#569](#569) - The way the background is handled in the Detailed Timer component has been fixed. (@CryZe) [#572](#572) - The times are now formatted as strings without going through floating point numbers which increases both the correctness and the performance. (@CryZe) [#576](#576) - Instead of using `core::fmt` formatting machinery to format the times as strings, we now use a custom implementation that's much faster. (@CryZe) [#577](#577) and [#580](#580) - Holding down a hotkey on Windows now doesn't cause it to be triggered over and over again. Other platforms already behaved this way. (@CryZe) [#584](#584) - The `base64` crate is now replaced with `base64-simd` which uses SIMD to speed up the decoding of the images. (@CryZe) [#585](#585) - Splits from `SpeedRunIGT`, which is a Minecraft speedruning mod, can now be parsed. (@CryZe) [#591](#591) - It turns out using `evdev` for the hotkeys on Linux requires the user to be in the `input` group, which is not always the case. Therefore we now fall back to `X11` if `evdev` is not usable. (@CryZe) [#592](#592) - When an auto splitter wants to attach to a Process by name, the start time and process id are now used to prioritize duplicate processes. (@Eein) [#589](#589) - It is now possible to resolve the key codes to the particular name of the key based on the current keyboard layout on Linux and the web. This was already the case on Windows and macOS. (@CryZe) [#594](#594) and [#595](#595) - It is now possible to trust the user of the C API to always pass valid UTF-8 strings to the C API via the optional `assume-str-parameters-are-utf8` feature. This is also always the case when using WebAssembly on the web. This improves the performance because no validation of the strings is necessary. (@CryZe) [#597](#597) - There is now a new `max-opt` cargo profile that can be used to maximally optimize the resulting executable. The release profile is now using its default configuration again. (@CryZe) [#598](#598) - When encountering images `livesplit-core` checks their dimensions to potentially automatically shrink them if they are larger than necessary. It turns out that checking the dimensions of PNG images was a lot less efficient than it could have been. This even improves parsing speed of entire splits files by up to 30%. (@CryZe) [#600](#600) - The documentation now uses links to types mentioned. (@Eein) [#596](#596) - Auto splitters can now query size of the modules of a process. (@CryZe) [#602](#602) - The log messages emitted by auto splitters can now be consumed directly instead of always being emitted via the `log` crate. (@CryZe) [#603](#603) - The auto splitters can provide settings that can be configured. For now the auto splitters need to be reloaded when the settings change. (@CryZe) [#606](#606) - The file path used to be tracked in the `Run`, but no frontend even used this. So it has been removed. (@CryZe) [#616](#616) - The documentation states that the title component's lines store the unabbreviated line as their last element. This was not actually the case and has been fixed. (@DarkRTA) [#615](#615)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
enhancement
An improvement for livesplit-core.
parsing
This is about one of the parsers.
performance
Affects the performance of the code.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
I've noticed that a lot of the parsing is spent figuring out the dimensions of our icons, which very often are PNG images. Apparently to access the full header of a PNG file a lot of decoding needs to happen, which is apparently quite expensive. However we don't really need to access the full header and instead really only the width and height. And here the PNG specification comes in handy:
https://www.w3.org/TR/2003/REC-PNG-20031110/
It says the following:
Each chunk is encoded as a length and type and then its chunk encoding. An IHDR chunk immediately starts with the width and height. This means we can model the beginning of a PNG file as follows:
And with a little bit of bytemuck magic we essentially can access the width and height with a single read. This improves parsing speed of entire splits files by up to 30%. Yes, the old code was that slow.