Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug fix for previous erroneous pull request, plus further optimisation #2

Closed
wants to merge 12 commits into from

Conversation

slowriot
Copy link
Contributor

I found and resolved the issue that caused my previous untested pull request to fail (10bb7ca). I realise you said that your choice of post-increment and less-than comparison is a matter of style, but I've now benchmarked this fork against the original, and I consistently get runtimes taking 97% the time of the original to carry out the unit test (timed in batches of 100), both at optimisation levels -O0 and -O3, meaning these changes represent an approximately 3% speedup even with optimisation.

Not all of the changes will affect performance (mostly the loop comparisons will), so feel free to merge only part of this if you really feel strongly about keeping post-increments, although I personally think it's a matter of good style to prefer post-increment in all cases when working with C++ on principle.

…!= where safe and pre-increment wherever return value is not used
…ements with their pre-increment and pre-decrement equivalents wherever the return value isn't used
# By Lode (10) and others
# Via Lode
* 'master' of https://github.com/lvandeve/lodepng:
  fix for C
  fix examples
  better bitpointer checks
  small tweaks
  Update lodepng.h
  Update lodepng.cpp
  avoid too big pixel sizes
  predict idat size correctly for interlaced images
  unit test was a bit slow like that
  fix bug with encoding transparent single-pixel image
  protect against invalid chunk lengths in some tools
  fix pngdetail.cpp
  collapsing duplicate branch
  do not check unsigned windowsize for < 0

Conflicts:
	lodepng.cpp
@lvandeve
Copy link
Owner

Thanks a lot for your work! I'd love to integrate it, it still doesn't compile with C90 though. It reallly has to. Thanks :)

@lvandeve
Copy link
Owner

I already integrated 3b22b26 and 4547bfa. If this pull request would contain only the changes about the < to != and postincrement to deincrement, but not the change that breaks it for C90, and not the things already done, I'd gladly accept it. Thanks a lot!

@slowriot
Copy link
Contributor Author

Sure, no worries. I've got an idea, will push a C90 compatible update to this pull request soon :)

@slowriot
Copy link
Contributor Author

The version in this pull request should now work with ANSI C, C99, C11 and C++. If there are any other tests it fails that I haven't tried, please let me know what command you're testing with and I'll resolve those as well.

@lvandeve
Copy link
Owner

Hi,

What is the reason for moving size_t into the for loop? Does it make it faster?

I don't like the ifdefs... because now the declarations of i are there twice! It doesn't look that clean. I would prefer that the change involving moving variable declarations to non-C89 compatible places would not be there at all, ifdef protected or not.

I would love to pull in your change about preincrement and != instead of <.

Is it possible to provide a pull request which has only the preincrement and != change, but nothing else? It would be nice for me to have a choice of what to pull in. This pull request comes with quite some history... I reallly appreciate your work and like parts of the change, I just wouldn't want to pull in things I don't like. Remember I also already got the first two parts :)

Thanks a lot!

@slowriot
Copy link
Contributor Author

A variable declared in the loop will have its scope within that loop; generally keeping scopes as narrow as possible is preferred, as it allows the compiler to optimise its register uses as much as possible. The fewer registers available on a given system, and the more variables you're using, the more important this becomes - otherwise variables have to be swapped out of registers and into memory and back more frequently, which can be a big drag. Often the compiler will do a decent job of optimising this away for you, but it never hurts to write things in the right way to begin with.

If you don't like the define switches that's fair enough, although I do find it a little odd considering how many define switches this same file and its header already have... it also seems strange to me to refer to clearly marked #ifdefs as not looking clean in this context, when half of the header file already hinges on __cplusplus ifdefs, and especially when the procedure for compiling in C is for the user to manually rename the entire file. But finally it seems the most strange to me to prefer a potential slowdown to adding a couple of lines of code, in an implementation that is performance-critical.

But it's your code and you're welcome to pull in what you do and don't want, of course! To make it easier I'll create a branch that's purely C89 compatible without the switches. But I certainly won't be running that branch myself :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants