Bug fix for previous erroneous pull request, plus further optimisation #2

slowriot · 2014-11-27T05:32:05Z

I found and resolved the issue that caused my previous untested pull request to fail (10bb7ca). I realise you said that your choice of post-increment and less-than comparison is a matter of style, but I've now benchmarked this fork against the original, and I consistently get runtimes taking 97% the time of the original to carry out the unit test (timed in batches of 100), both at optimisation levels -O0 and -O3, meaning these changes represent an approximately 3% speedup even with optimisation.

Not all of the changes will affect performance (mostly the loop comparisons will), so feel free to merge only part of this if you really feel strongly about keeping post-increments, although I personally think it's a matter of good style to prefer post-increment in all cases when working with C++ on principle.

…!= where safe and pre-increment wherever return value is not used

…ements with their pre-increment and pre-decrement equivalents wherever the return value isn't used

# By Lode (10) and others # Via Lode * 'master' of https://github.com/lvandeve/lodepng: fix for C fix examples better bitpointer checks small tweaks Update lodepng.h Update lodepng.cpp avoid too big pixel sizes predict idat size correctly for interlaced images unit test was a bit slow like that fix bug with encoding transparent single-pixel image protect against invalid chunk lengths in some tools fix pngdetail.cpp collapsing duplicate branch do not check unsigned windowsize for < 0 Conflicts: lodepng.cpp

…rting with 0

lvandeve · 2014-11-27T10:57:54Z

Thanks a lot for your work! I'd love to integrate it, it still doesn't compile with C90 though. It reallly has to. Thanks :)

lvandeve · 2014-11-27T18:50:29Z

I already integrated 3b22b26 and 4547bfa. If this pull request would contain only the changes about the < to != and postincrement to deincrement, but not the change that breaks it for C90, and not the things already done, I'd gladly accept it. Thanks a lot!

slowriot · 2014-11-27T22:10:14Z

Sure, no worries. I've got an idea, will push a C90 compatible update to this pull request soon :)

# By Lode # Via Lode * https://github.com/lvandeve/lodepng: various fixes Conflicts: lodepng.cpp

slowriot · 2014-11-28T01:24:44Z

The version in this pull request should now work with ANSI C, C99, C11 and C++. If there are any other tests it fails that I haven't tried, please let me know what command you're testing with and I'll resolve those as well.

lvandeve · 2014-11-28T12:50:11Z

Hi,

What is the reason for moving size_t into the for loop? Does it make it faster?

I don't like the ifdefs... because now the declarations of i are there twice! It doesn't look that clean. I would prefer that the change involving moving variable declarations to non-C89 compatible places would not be there at all, ifdef protected or not.

I would love to pull in your change about preincrement and != instead of <.

Is it possible to provide a pull request which has only the preincrement and != change, but nothing else? It would be nice for me to have a choice of what to pull in. This pull request comes with quite some history... I reallly appreciate your work and like parts of the change, I just wouldn't want to pull in things I don't like. Remember I also already got the first two parts :)

Thanks a lot!

slowriot · 2014-11-28T16:27:34Z

A variable declared in the loop will have its scope within that loop; generally keeping scopes as narrow as possible is preferred, as it allows the compiler to optimise its register uses as much as possible. The fewer registers available on a given system, and the more variables you're using, the more important this becomes - otherwise variables have to be swapped out of registers and into memory and back more frequently, which can be a big drag. Often the compiler will do a decent job of optimising this away for you, but it never hurts to write things in the right way to begin with.

If you don't like the define switches that's fair enough, although I do find it a little odd considering how many define switches this same file and its header already have... it also seems strange to me to refer to clearly marked #ifdefs as not looking clean in this context, when half of the header file already hinges on __cplusplus ifdefs, and especially when the procedure for compiling in C is for the user to manually rename the entire file. But finally it seems the most strange to me to prefer a potential slowdown to adding a couple of lines of code, in an implementation that is performance-critical.

But it's your code and you're welcome to pull in what you do and don't want, of course! To make it easier I'll create a branch that's purely C89 compatible without the switches. But I certainly won't be running that branch myself :)

slowriot added 10 commits November 7, 2014 02:44

do not check unsigned windowsize for < 0

3b22b26

collapsing duplicate branch

4547bfa

reducing scope of loop counters, and micro-optimising loops by using …

daac255

…!= where safe and pre-increment wherever return value is not used

optimisation: replacing all unnecessary post-increments and post-decr…

a85479d

…ements with their pre-increment and pre-decrement equivalents wherever the return value isn't used

oops, typo fix

8a77a04

another typo fix

0696a29

using size_t iterator when comparing to size_t

e141374

fixing preincrement bug introduced previously

10bb7ca

more != comparisons instead of < against unsigned types for loops sta…

2c93fa5

…rting with 0

slowriot added 2 commits November 28, 2014 01:16

ensuring ANSI C compatibility by using ifdefs where appropriate

2662c53

Merge https://github.com/lvandeve/lodepng

f1de0a2

# By Lode # Via Lode * https://github.com/lvandeve/lodepng: various fixes Conflicts: lodepng.cpp

slowriot closed this Nov 28, 2014

slowriot mentioned this pull request Nov 28, 2014

ANSI C only version of previous pull request #3

Merged

seungwoos mentioned this pull request Jan 3, 2020

pngdetail hangs at lodepng_strlen() #123

Closed

bhaller mentioned this pull request Oct 13, 2020

RGB converts to grayscale by taking the R channel, in contradiction to what the doc states #139

Open

Cvjark mentioned this pull request May 24, 2022

vulnerability Discover #165

Closed

This was referenced Oct 21, 2022

Memory leaks in function benchmark. #176

Open

SEGV on unknown address in function pngdetail. #177

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug fix for previous erroneous pull request, plus further optimisation #2

Bug fix for previous erroneous pull request, plus further optimisation #2

slowriot commented Nov 27, 2014

lvandeve commented Nov 27, 2014

lvandeve commented Nov 27, 2014

slowriot commented Nov 27, 2014

slowriot commented Nov 28, 2014

lvandeve commented Nov 28, 2014

slowriot commented Nov 28, 2014

Bug fix for previous erroneous pull request, plus further optimisation #2

Bug fix for previous erroneous pull request, plus further optimisation #2

Conversation

slowriot commented Nov 27, 2014

lvandeve commented Nov 27, 2014

lvandeve commented Nov 27, 2014

slowriot commented Nov 27, 2014

slowriot commented Nov 28, 2014

lvandeve commented Nov 28, 2014

slowriot commented Nov 28, 2014