[BUG] Printer + Screen freeze mid print, heaters on #18315
Comments
Try the latest bugfix version. |
It might be the same issue as #18117, however the suggested steps to reproduce their bug don't work for me. |
Taragor quick question. What display are you using? |
I have been using the stock ender 3 display, which is called cr-10 style in Marlin, I think. I just yesterday changed it to BTTs TFT35 E3. Do you think it is related to the display? |
I have been testing for #18117 in the last week. The issue does not appear to be related to the drivers, since they do not assert the DIAG and INDEX pins when the freeze happens. To make it appear more often you need to push your printer by setting considerably high acceleration at the beginning of your gcode, if your slicer does not do that already.
With those settings I can consistently crash it in one out of three prints. Now onto the issue: I started looking at planner.cpp and something it does called screen throttling. In short, this file is the mind behind a print job. If for some reason it sends an update to the screen and the screen does not respond, it will freeze. I have yet to verify how and why that may be the case, but I have compiled Marlin without a display and have been printing from SD using M21 & M24 commands without any issue for two days so far. It is still a bug that we need to figure out, since 99% of users do use a display with their printer. |
Ok, after testing prints with those high acceleration settings I can confirm that they seem to cause freezes more often. Still, all my fails happened on curves, which kind of makes sense, since the time between stepper commands should be much shorter on curves. |
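To put rough numbers on the curve theory, here is a small sketch. It assumes a curve is sliced into 0.5 mm segments (a made-up figure; the real length depends on the slicer and the model) and ignores acceleration, so actual segments can only take longer than this. The function names are hypothetical and this is not Marlin code.

```cpp
#include <cassert>

// Lower bound on how long one linear segment lasts, in milliseconds.
// segment_mm: length of the move; feedrate_mm_s: commanded print speed.
// Acceleration limits are ignored, so a real segment takes at least this long.
double segment_duration_ms(double segment_mm, double feedrate_mm_s) {
  assert(feedrate_mm_s > 0.0);
  return 1000.0 * segment_mm / feedrate_mm_s;
}

// Upper bound on block transitions per second for back-to-back segments
// of equal length at the given speed.
double blocks_per_second(double segment_mm, double feedrate_mm_s) {
  return 1000.0 / segment_duration_ms(segment_mm, feedrate_mm_s);
}
```

With the assumed 0.5 mm segments, 70 mm/s gives roughly 7 ms per segment (about 140 block transitions per second) and 50 mm/s gives 10 ms, while a 100 mm straight line at 70 mm/s is a single block lasting about 1.4 s. So an interrupt has far more chances to land on a block transition while printing curves.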
@taragor still an issue? |
Yes, after some further testing it seems @minosg is totally right and this is a UART race condition, since I couldn't print for longer than 2 hours without freezes using the suggested settings (i.e. high acceleration, high speeds etc.). However, after disabling my display and printing headless off the onboard SD I printed without any freezes for 10+ hours using said settings. |
Hmmm, just guessing, but could it be noise and too long display cables? |
@boelle cables shouldn't matter. In my traces I can see the ISR firing again before it has completed, never returning to thread mode. The NVIC register is also in a constant firing state. If a cable were responsible, it would also happen when jiggling the cable, even at slower speeds. This correlation between planner overload and freezes indicates a race condition. Also, it usually happens in two places: the idle handler, where it tries to update the progress, and the host keep-alive transmit. Both are cases of a USART interrupt landing when the logic is already in exception mode. |
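For readers less familiar with this class of bug, the usual guard is to mask the interrupt while shared state is being updated and let it run afterwards. Below is a host-side simulation of that idea, only a sketch: `SimulatedIrq` and `CriticalSection` are hypothetical names, nothing here is Marlin or libmaple code, and on the real MCU the masking would be done through the interrupt controller rather than a counter.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Host-side stand-in for one interrupt line. All names are hypothetical.
class SimulatedIrq {
 public:
  explicit SimulatedIrq(std::function<void()> handler)
      : handler_(std::move(handler)) {}

  // Called at the point where the hardware would raise the interrupt.
  void raise() {
    if (masked_ > 0) {
      pending_ = true;   // A critical section is active: defer the handler.
    } else {
      handler_();        // Nothing to protect: run the handler immediately.
    }
  }

  void mask() { ++masked_; }

  void unmask() {
    assert(masked_ > 0);
    if (--masked_ == 0 && pending_) {
      pending_ = false;
      handler_();        // Deliver the deferred interrupt exactly once.
    }
  }

 private:
  std::function<void()> handler_;
  int masked_ = 0;       // Nesting depth of active critical sections.
  bool pending_ = false; // An interrupt arrived while masked.
};

// RAII guard: masks the interrupt for the lifetime of the object.
class CriticalSection {
 public:
  explicit CriticalSection(SimulatedIrq& irq) : irq_(irq) { irq_.mask(); }
  ~CriticalSection() { irq_.unmask(); }
  CriticalSection(const CriticalSection&) = delete;
  CriticalSection& operator=(const CriticalSection&) = delete;

 private:
  SimulatedIrq& irq_;
};
```

Any interrupt raised while a `CriticalSection` is alive is held back and delivered once when the guard goes out of scope, so the handler never runs in the middle of the protected update. Whether the firmware's time-critical sections can afford that extra latency is a separate question this sketch does not answer.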
I also thought about that; obviously the motors are creating electric/magnetic fields that could cause glitches in the cables. Yet I'm using the cables that came with my Ender 3 Pro and had no issues while using the stock Melzi. EDIT: The thermal runaway might be a one-off. I've got a feeling my last firmware.bin might have been corrupted, since I'm seeing all kinds of strange behavior (diagonal movements being executed first in X, then in Y; random "Unknown command G1", etc.). I'll recompile and reflash and report back. Edit 2: On second thought I might just check out current bugfix again, since I'm now getting strange compiler errors about the NeoPixel library not being found. |
OK so I've just compiled Marlin from scratch, using new config files, and now the printer is working again. I haven't had a freeze yet, but I'm still printing with only one serial connection to my BTT TFT35 in BTT mode, which acts like Pronterface or any other live print host in this mode, so I don't really expect to see too many freezes that way. |
The NeoPixel library is disabled in platformio.ini for the STM32F103RC_btt boards.
I don't know why, probably because it is too much of a load on the processor, but it does compile if you enable it. |
It probably caused problems at some point and was disabled. If someone actually sets up NeoPixels and verifies it works, they would be welcome to post a Pull Request to change it. |
Neopixels are tricky on this board. The board is timer limited, and the NeoPixel pin's timer is needed. A recent change moved the Servo8 timer to the same one used by the NeoPixel, which can cause the probe to crash. In the past this timer was used for tone generation in tone.cpp. It can be made to work, but only in a non-standard way, and users should be aware of the risks, so I suspect this is why it was disabled at some stage. |
I got it to work using this fork of the NeoPixel lib: https://github.com/ccccmagicboy/Adafruit_NeoPixel. It's the one used in the STM32F103RC_meeb build config. It requires some customization of the lib though, since it uses Marlin's delay.h for timing. |
I am running into the same exact issue with the same hardware. What version of the code did you use... bug fix, latest release, development, etc.? Thank you |
@rmangino I'm sorry, I think I was somewhat unclear there. I've got the printer to run again as before, so the initial issue persists: I still have crashes with the display enabled in Marlin. What I meant was that the constant crashes when moving even a single motor went away. I suspect my firmware.bin was corrupted during copying. It has been running stably for me so far, with no freezes in 30+ hours of printing. However, as soon as I enable display support in Marlin I start seeing freezes again. Also, using more than just the 2 serial connections (TMC2209 + TFT35) gives me occasional freezes. I believe this issue is caused by the serial race condition discussed in #18358. |
Are you seeing freezes using the TFT35 in the EXP2 port, using the thick header cable? Does this screen work without display support in Marlin enabled? |
@minosg TFT35 is a dual mode display, you can switch between modes by keeping the dial pressed. It has its own MCU (funnily enough it is also an STM32F1) and firmware. The modes are:
When the screen is connected using the EXP-3 connector (Marlin mode, emulating the stock screen), so |
@taragor what you are describing fits my hypothesis that the cause of the issue is timing related. When you compile out CR10 display support, the ui.update() logic, which is called by idle() as the planner moves between blocks, is faster. Similarly, when you are printing with OctoPrint there is a timer interrupt firing, which is the host keep-alive message. When using the serial header on the board, I suspect you are printing by injecting gcode commands as with SD, so the host keep-alive should not be triggering (USB CDC), and ui.update() is compiled out, so the time between critical section transitions in the block buffer is fixed. Just to verify we are on the right path, you could try one more test: disable host keep-alive in the config, disable the CR10 display (meaning you compile without a Marlin display) and print from OctoPrint. It shouldn't freeze. |
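For the test build suggested above, the relevant part of Configuration.h might look roughly like this. `CR10_STOCKDISPLAY` and `HOST_KEEPALIVE_FEATURE` are the stock Marlin option names as far as I know; the two `constexpr` helpers are hypothetical and only there so a scratch build can confirm that both features are really compiled out.

```cpp
// Configuration.h excerpt for the test build: both options commented out.
//#define CR10_STOCKDISPLAY        // no Marlin display support compiled in
//#define HOST_KEEPALIVE_FEATURE   // no periodic "busy" messages to the host

// Hypothetical helpers (not part of Marlin) that report at compile time
// whether each feature is enabled in this build.
constexpr bool display_compiled_in() {
#ifdef CR10_STOCKDISPLAY
  return true;
#else
  return false;
#endif
}

constexpr bool host_keepalive_compiled_in() {
#ifdef HOST_KEEPALIVE_FEATURE
  return true;
#else
  return false;
#endif
}

// Fails the build if either feature was left enabled by accident.
static_assert(!display_compiled_in() && !host_keepalive_compiled_in(),
              "Test build expects display and host keepalive disabled");
```

Because both options are plain preprocessor defines, a line that is commented out removes the corresponding code from the firmware entirely, which is why the connected cable should not matter for this test.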
@minosg should I disconnect the screen completely for that or should I leave the TFT cable connected? |
If it's compiled out it shouldn't matter. This is clearly a software bug. |
With the TFT cable connected it will run in BTT mode even with CR-10 support disabled, since it's running like any other serial host (i.e. pronterface) |
@taragor Thank you so much for your response/clarifications. This is the first time I've built Marlin so this is all very new to me. What I know for certain is that my stock Ender 3 Pro never hung in 3 months of very heavy use. With the SKR Mini E3 v2.0 (running the bugfix code) I can't print for more than a few hours without the printer locking up (I'm using Octoprint). I'm still using the stock LCD that came with the printer. The only thing I've added (and enabled in Marlin) is a v3 BLTouch. |
That was exactly my experience too. From what I managed to gather, this issue has come up before but was closed due to a lack of activity. Minosg is the first one who came up with an explanation of what happens, and thanks to his work I think this bug can be fixed soon. |
I would not hold my breath about this bug being fixed soon. It is an extremely nasty bug which is hard to reproduce. It took us two weeks to figure out exactly how to make it happen reliably. The core reason it went unnoticed is that UART-based drivers were not that common, and even last year's boards used to run the TMC2208 in legacy mode. Now, with the industry moving on and new boards using UART for drivers, you end up having 4 to 6 more interrupts in the system and absolutely no safeguards or guarantees that the time-critical parts of the code are being respected, so you keep seeing a new bug ticket on something related to it every week. The planner and stepper code are complicated: thousands of lines of code with a high level of physics and math involved. My knowledge of Marlin is limited to a couple of weeks, and making any changes could affect a lot of existing users. Our best bet at the moment is to isolate the specific conditions that trigger this deadlock and hope to fix it by using something already in the code which is non-destructive, like a planner.synchronise() |
Sorry that I am so new to Marlin but I'd like to be sure I'm building the firmware correctly based upon the above suggestions. Btw, I am printing from Octoprint. In Configuration.h I have:
and
Are those changes sufficient? Thank you in advance. |
Yes that should disable keep alive and display. Try to see if that stops the freezes with octoprint |
Ok I've tested live printing of octoprint via USB with |
@targor The planner contains very timing-sensitive logic at the end of a move. A move is stored as a chain of micro moves in a ring buffer. At the end of each move there is buffer-reset logic which is a bit sensitive, so if an interrupt fires at this point all hell breaks loose. That is why you see it more often when printing curves (a curve is sliced as a series of very small linear moves). You got it working with the screen enabled and host keep-alive disabled? Was that using the default 115200 baud rate? It could just be a case of not having triggered it yet, since you need the interrupts to fire in a very narrow window. Can you try the same setup with Malyan_LCD, which is fixed at 50.000 Baud? |
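As a rough picture of the ring buffer described above, here is a minimal single-producer/single-consumer sketch. It is not Marlin's planner code: `Block`, `BlockRing` and the field names are hypothetical, and on real firmware the head and tail indices would be shared with an ISR and need volatile or atomic handling that this host-side version leaves out.

```cpp
#include <cassert>
#include <cstddef>

// One queued linear move (hypothetical fields, for illustration only).
struct Block {
  float length_mm;
  float feedrate_mm_s;
};

// Fixed-size ring buffer; N must be a power of two so that wrapping the
// indices is a cheap bit mask. One slot stays empty to tell full from empty.
template <std::size_t N>
class BlockRing {
  static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

 public:
  bool empty() const { return head_ == tail_; }
  bool full() const { return next(head_) == tail_; }
  std::size_t size() const { return (head_ - tail_) & (N - 1); }

  // Producer side (the main loop in real firmware).
  bool push(const Block& b) {
    if (full()) return false;
    blocks_[head_] = b;    // Fill the slot first...
    head_ = next(head_);   // ...then publish it by advancing head.
    return true;
  }

  // Oldest queued block; the buffer must not be empty.
  const Block& peek() const {
    assert(!empty());
    return blocks_[tail_];
  }

  // Consumer side (the stepper interrupt in real firmware).
  bool pop(Block& out) {
    if (empty()) return false;
    out = blocks_[tail_];
    tail_ = next(tail_);   // Release the slot back to the producer.
    return true;
  }

 private:
  static std::size_t next(std::size_t i) { return (i + 1) & (N - 1); }

  Block blocks_[N] = {};
  std::size_t head_ = 0;   // Next slot to write.
  std::size_t tail_ = 0;   // Next slot to read.
};
```

The sensitive moments are the two index updates: if something else touches `head_` or `tail_` between the check and the update, producer and consumer disagree about which slots are valid. With many tiny segments on a curve those updates happen far more often, which fits the pattern of freezes on curved perimeters.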
I'd also like to add that I was able to complete two, 12-hour prints successfully. These are my first successful prints since upgrading to the SKR Mini e3 v2.0 + bugfix branch. I'm using a build with display and keep_alives disabled (printing via Octoprint USB). |
Can anyone confirm if they can get away with printing from LCD with disabled host_keep_alive? And if so which LCD driver are you using? |
@minosg I'm not using any display support from marlin ( |
My original firmware config had host_keep_alive_disabled and CR10_STOCKDISPLAY defined. That was the build that locked up every few hours. |
To anyone following this thread: I have posted a possible workaround for this issue on #18358. It involves applying a minor patch to your libmaple library. Please test it if you would like to help confirm the findings of that solution. |
@minosg You are suggesting that we disable |
Nope. Look a bit higher. You need to patch the USART handler in libmaple. |
Thank you. For anyone else - here is a direct link to his suggestion. |
This issue is stale because it has been open 30 days with no activity. Remove stale label / comment or this will be closed in 5 days. |
Duplicate of #18358 |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Bug Description
I've had my printer freeze randomly during prints since upgrading my mainboard to the SKR mini E3 V2.0 (TMC2209, STM32F1, 24V).
The screen freezes, the printer won't respond to USB commands, and the heaters remain on.
I can't say if the PID is still working, or if the heaters are just going full blast/keeping their temp.
I've had 5 failed prints by now, and all of them failed during curved perimeters, so I think it might be related to that.
However, the issue occurs only about once every ~15-20h of printing. It always happened between 1-5h into the print, and retrying the same gcode works.
I can't trigger it deliberately so trying different configurations is pretty slow.
What didn't fix it for me is:
-disabling Linear Advance
-changing jerk values
-printing from SD/USB
-recompiling/reflashing
-It is not temperature related, I've had one failed print ~1h in, first print that day, others failed after 10+ hours.
The only constants through all my failures were:
-Always (5/5 failed prints) fails during curved perimeters
-Only happens with print speed(cura setting) >=70mm/s (I print most of the time at 70mm/s, however I printed ~20h on 50mm/s and had no fail)
My Configurations
Config.zip
Steps to Reproduce
I can't really reproduce it. Neither printing especially curvy things nor high retraction files (as suggested in #18117) triggers it. However I believe it happens more frequently at higher print speeds.
Expected behavior:
Printer prints file (or online via USB) normally
Actual behavior:
Printer freezes mid print (both from SD and via USB):
-All movement just stops
-Fans keep spinning
-Screen freezes
-USB becomes unresponsive
-Steppers stay powered
-Heaters remain heated