Low recursion limit on Windows port with MSVC #2927
Comments
Cannot confirm. |
@robert-hh I assume you used the same code I used to test. That would leave only the compiler / build environment to account for the difference. What compiler did you use to build micropython? I used an MSVC 2015 build. |
gcc (i686-win32-dwarf-rev1, Built by MinGW-W64 project) 6.2.0 |
@robert-hh I do not have MinGW installed and I do not think you would be happy to test building with MSVC... Would you mind uploading your binary so I can test it on my machine? E.g. using reep.io. |
Recursion depth is controlled by stack limit - whatever fits into it, that you get. |
For x86 the stack limit set in main() is 40000, so every iteration of foo() in the sample code would cost roughly 3000 bytes (if I understand the units correctly). That sounds like a lot. @ARF1, are you using a debug build maybe? Or maybe the stack check code makes assumptions about the stack which depend on gcc, but I doubt that: nothing would work at all for the msvc build then, I guess. Maybe msvc builds are just more stack hungry due to whatever optimizations are applied; that's certainly possible, no? |
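The estimate above can be written out as a small calculation. This is only a sketch: it assumes the limit of 40000 is in bytes and that essentially the whole limit is consumed by the recursion itself. The helper name `bytes_per_call` is made up for illustration, and 12 is the depth reported in the original issue.

```python
def bytes_per_call(stack_limit, max_depth):
    """Rough C-stack cost of one Python-level call.

    Assumes stack_limit is in bytes and that the recursion uses
    (nearly) all of it before the stack check fires.
    """
    return stack_limit / max_depth


STACK_LIMIT = 40000   # value set in main() for x86, per the comment above
OBSERVED_DEPTH = 12   # recursion depth reported in the original issue

print(round(bytes_per_call(STACK_LIMIT, OBSERVED_DEPTH)))  # 3333
```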
@ARF1: The binary is at https://github.com/robert-hh/Shared-Stuff |
@robert-hh Thank you very much. It apparently really is the MSVC build environment that causes the low recursion depth:
Even after having looked into the MSVC build environment for quite a bit, I am absolutely stumped as to the cause of the issue. |
Btw a release msvc build gives 29 iterations, so you most likely are using a debug build. |
@stinos You are right. With the MSVC release build I now also get 30 iterations. That solves my immediate problem, though I am struggling to understand where this discrepancy between the gcc build and the MSVC build originates. I observed that after starting micropython, Does this have anything to do with it? |
Look how much stack one function call takes, and you'll see how bad (good at consuming stack space?) msvc is. |
@pfalcon Using your code from this post, I found: gcc: 288 bytes per call. This very nicely accounts for the difference in maximum recursion depth. Thank you very much! Now the only question that remains: What is MSVC doing??? What is it storing on the stack that gcc isn't? |
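As a rough cross-check of the gcc figure, the per-call cost can be turned back into an expected recursion depth. This is a sketch under assumptions: the limit is taken to be the 40000 bytes mentioned earlier in the thread, and `predicted_depth` is a hypothetical helper, not part of MicroPython.

```python
def predicted_depth(stack_limit, bytes_per_call):
    """Number of nested calls that fit under the stack limit (integer division)."""
    return stack_limit // bytes_per_call


# 288 bytes per call is the gcc measurement quoted above;
# 40000 is assumed to be the limit in effect for that build.
print(predicted_depth(40000, 288))  # 138
```

The result is in the same range as the iteration counts reported for gcc/MinGW builds elsewhere in this thread, which is consistent with per-call stack use being the dominant factor.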
Well, it would certainly be interesting to figure out how the stack layout differs from gcc's. I did some quick searches but didn't find any relevant info. Anyway, I don't think this really is an issue. I had this problem before for larger scale applications and I now just set MICROPY_STACK_CHECK to 0, as I don't have any code relying on an exception being raised when some arbitrary limit is reached. If things really get out of hand you'll just get a stack overflow exception (in this case, after 700 iterations). If that is still not sufficient you could increase the stack size (add msvc/user.props with options as found in https://msdn.microsoft.com/en-us/library/tdkhxaks.aspx). If that's overkill, you could alternatively increase the limit used in main() or call mp_stack_set_limit() somewhere. |
Yeah, I'd definitely be interested to hear if someone told me. Though you may consider why you're so interested in MSVC, and whether just using MinGW wouldn't be a better option. Maybe even tell other folks, so they know what you're trying to do. For example, @stinos works with MSVC and MicroPython at his day job, and that's why he maintains the MSVC port. As a bit of extended trivia, uPy also supports a "stackless" mode (similar to http://www.stackless.com) which trades stack allocation for heap allocation. |
I'm a bit intrigued by this so I thought if I could make a minimal C sample reproducing this (i.e. big difference in msvc vs gcc stack layout) it would be a good start to investigate this further. However as it turns out I don't know enough about this to do so. This:
when compiled with the same options as used when building uPy, shows 112 bytes per call for msvc but 176 bytes per call for gcc, so that's the opposite of what we get for python calls. Also interesting: when building uPy with VS2013 I get 25 iterations out of the uPy sample code from ARF1. Now just changing the optimization to minimize size (/O1) instead of favouring speed (/O2 or higher) this is almost doubled to 43 iterations. Another thing: uPy built with mingw gives me 136 iterations. However adding a Conclusion: still no idea what causes the actual differences, but multiple factors are at play here. |
The way I usually analyse the change in stack usage for any given patch is to look at the disassembly of the function in question, look at the prelude to see how much stack is being allocated. So if you really want to spend some time on this then you can do just that: for all the C functions that participate in a recursive Python function call you'd disassemble them (on both compilers you're comparing) and work out the stack use for each function. |
@ARF1 I went over the compiler switches again and just disabling /GL (whole program optimization, which allows scenarios like inlining of functions defined in other translation units etc.) gives 52 iterations (adding /O1 on top of that only adds one extra iteration). So that is definitely the switch which has the most influence on stack usage. The trade-off is of course speed: it does produce code which is about 20% slower (only measured for call-heavy code like the sample code). |
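To compare the speed of two builds on call-heavy code, a small timing script along these lines could be run on each binary. This is a sketch only, not the benchmark used for the 20% figure above. The names `call_heavy` and `time_calls` are made up for illustration, and the script picks whichever timer the running interpreter offers (`time.ticks_us` on MicroPython, `time.perf_counter` on CPython).

```python
import time


def call_heavy(n):
    """Make n trivial Python-level function calls and return the final count."""
    def f(x):
        return x + 1

    total = 0
    for _ in range(n):
        total = f(total)
    return total


def time_calls(n=200000):
    """Return (result, elapsed_seconds) for n function calls."""
    if hasattr(time, "ticks_us"):
        # MicroPython: wrap-safe microsecond ticks
        start = time.ticks_us()
        result = call_heavy(n)
        elapsed = time.ticks_diff(time.ticks_us(), start) / 1000000
    else:
        # CPython fallback
        start = time.perf_counter()
        result = call_heavy(n)
        elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = time_calls()
print(result, elapsed)
```

Running the same script several times on each build and comparing the elapsed times gives a relative slowdown; a single run is too noisy to rely on.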
Can anything be done here, or can this be closed? |
Well, I laid out the possible workarounds (loosen the check, disable the check or play with compiler flags) including some numbers, advantages and disadvantages. As far as I'm concerned, this should be handled in an 'on demand' way (meaning anyone experiencing problems can refer to this issue and pick a suitable workaround if needed) and not as some general solution applied to uPy (see last comment: trading off 20% of performance can be quite the deal). So +1 for closing this. |
@pfalcon I probably should have been clearer in my last posts on this issue: From my side, this can be closed. Stinos has even worked out beautifully where this oddity originates. Thank you for this! It might be worth considering whether his findings deserve a short heads-up in the docs to avoid losing the knowledge. |
@ARF1: Thanks, closing.
There's https://github.com/micropython/micropython/blob/master/windows/README.md , feel free to author, together with @stinos, a concise description of the issue, perhaps linking to this ticket. |
Add information as discussed in micropython#2927 to the readme to make it easier to discover.
I ran into a "Maximum recursion depth exceeded" error while developing on micropython on Windows. I was somewhat surprised as I did not think that my code was excessively complex. I used the code from this forum post to test the recursion depth:
It seems the recursion depth is 12 on Windows? Can that be right? How is it controlled? The same forum post mentions that on the Unix build, the recursion depth is 166. Even on the esp8266 it seems to be 19.
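The test code from the forum post is not reproduced in this page. A minimal sketch of such a recursion depth test, assuming only that the interpreter raises `RuntimeError` (or a subclass of it, as CPython does) when the limit is hit, could look like this. The name `measure_depth` is hypothetical; `foo` follows the name used in the comments.

```python
def measure_depth():
    """Recurse until the interpreter refuses, then report how deep we got."""
    depth = 0

    def foo():
        nonlocal depth
        depth += 1
        foo()

    try:
        foo()
    except RuntimeError:
        # MicroPython raises RuntimeError when the stack check trips;
        # CPython raises RecursionError, which is a subclass of it.
        pass
    return depth


print(measure_depth())
```

The exact number depends on the port, the compiler, the build configuration, and how much stack is already in use when the function is called, so figures from different setups are only roughly comparable.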