Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for crash on windows in src/core/ev.c: initialze state->fromlen #1172

Merged
merged 1 commit into from
May 30, 2023

Conversation

zevv
Copy link
Contributor

@zevv zevv commented May 30, 2023

before doing WSARecvFrom() to prevent crash (likely caused by the memcpy() of state->from at line
2301 with the memcpy length set to -1)

before doing WSARecvFrom() to prevent crash (likely caused by the
memcpy() of `state->from` at line
2301 with the memcpy length set to -1)
@bakpakin bakpakin merged commit 961c6ea into janet-lang:master May 30, 2023
@primo-ppcg
Copy link
Contributor

I wonder if this might be an incorrect commit (or maybe not precisely correct?). This slows down both for and forv loops quite considerably on Windows (10-15% slower). I noticed that the bytecode_opt branch became slower on Windows after being merged into master, and was able to isolate it to this commit.

@zevv
Copy link
Contributor Author

zevv commented Jun 1, 2023

I'm not saying "impossible", but I would be highly surprised if this commit causes the behavior you see here, as this should really change only affect code calling net/recv-from;

@primo-ppcg
Copy link
Contributor

Current master:

(for i 0 500_000_000)
elapsed     00:00:04.4632351

With #1172 reverted:

(for i 0 500_000_000)
elapsed     00:00:03.9904488

I'm not saying "impossible", but I would be highly surprised if this commit causes the behavior you see here

I can't explain it either.

@zevv
Copy link
Contributor Author

zevv commented Jun 1, 2023

Wow, that is pretty weird and amazing. So if you run on master and just comment out line 2323 in src/core/ev.c (state->fromlen = sizeof(state->from);), you see this happen?

I can't seem to reproduce on neither Linux or cross-compiled Mingw though, so I'm not sure if I can help figure out what's happening here.

@primo-ppcg
Copy link
Contributor

primo-ppcg commented Jun 1, 2023

I can confirm.

Commenting out line 2323 on /src/core/ev.c, in current master, results in a 15% performance increase on the following code:

(var c (os/clock))
(for i 0 500_000_000)
(pp (- (os/clock) c))

@zevv
Copy link
Contributor Author

zevv commented Jun 1, 2023

Utter flabbergastery, especially since I am pretty much sure none of the code related to that fromlen is even running at all during your tests!

@sogaiu
Copy link
Contributor

sogaiu commented Jun 1, 2023

I might have done something wrong, but I don't get such consistent results.

I have compiled 2 different janet.exe binaries on Windows (using two different folders) and have 2 command prompt windows open, each one has a REPL session in it.

I put the code above in each window (though on a single line to get the timing to work out).

My results vary quite a bit for each one - sometimes one is faster than the other and the results can differ by 6 seconds or so occasionally.

I'll try again.

@zevv
Copy link
Contributor Author

zevv commented Jun 1, 2023

One possible reason can be that unrelated changes in the code can cause other code to be moved and linked at different addresses, resulting in some critical loop ending up "wrongly" aligned for your particular machine/architecture, causing things to fall on the wrong side of the cache lines. This can sometimes dramatically affect performance and is amazingly hard to debug, prove, reproduce or fix.

@sogaiu
Copy link
Contributor

sogaiu commented Jun 1, 2023

I wiped the build directories in each folder along with the janet.exe, janet.exp, and janet.lib files.

Then I used build_win.bat to compile.

I then started a REPL for each being careful to use .\janet.exe (and not some version of janet.exe that was already installed).

The code pasted in each REPL was:

(var c (os/clock)) (for i 0 500_000_000) (pp (- (os/clock) c))

For master (5de8894), I got:

  • 7.11291
  • 6.79481
  • 6.6602
  • 6.62547
  • 7.43754

With the line commented out I get:

  • 6.46684
  • 7.26258
  • 6.57898
  • 6.39238
  • 6.55173

Subsequent invocations on one side just didn't end, so I stopped collecting info.

Was reminded of this: https://www.youtube.com/watch?v=r-TLSBdHe1A

@zevv
Copy link
Contributor Author

zevv commented Jun 1, 2023

Profiling can be very tricky; one thing to look out for is to give your test enough time to "wake up" your machine - it might be in some power saving mode with reduced clock frequency and take some time to get up to full speed; One trick is to this is to repeat your tests for a number of times in a tight loop and ignore the first results.

@sogaiu
Copy link
Contributor

sogaiu commented Jun 1, 2023

Sounds awfully familiar :)

My machine for Windows is a pretty weak one.

I might try again tomorrow and try to collect info in a not-so-manual fashion.

@primo-ppcg
Copy link
Contributor

I can confirm.

Feel free to disregard. On a different windows machine, I'm unable reproduce. Possibly only affects windows 10, but likely just something about that machine.

@sogaiu
Copy link
Contributor

sogaiu commented Jun 1, 2023

Thanks for the update. Quite strange in any case!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants