[BUG] PID Autotune in combination with hybrid USB causes lockup #19187

Closed
swilkens opened this issue Aug 29, 2020 · 21 comments

@swilkens
Contributor

swilkens commented Aug 29, 2020

Update

See #19187 (comment) for new insights

For now, it seems the lockups with the PID autotune are related to the use of the Hybrid USB mode.


Bug Description

Using Marlin 2.0.6.1, initiating PID autotune over a serial connection locks up the control board until that serial connection is interrupted (USB cable removed). If it is not interrupted, the board ends up in thermal protection (constant beeping).

Considering the only commit after 2.0.6.1 at the moment is a date change, this can be considered current bugfix.

My Configurations

2.0.6.1.sw.zip

STM32F103RC_btt_512K_USB

Steps to Reproduce

  1. M303 C3 S200 over Serial
  2. Watch temperature on LCD increase
  3. Notice LCD interface locks up
  4. Remove USB cable from computer
  5. Notice LCD interface resumes as expected

If the USB cable is left attached (serial connection left active), the printer will eventually go into thermal protection.
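
For readers unfamiliar with the command in step 1: M303 takes E (heater index, with -1 selecting the bed), S (target temperature in °C) and C (number of cycles). Below is a minimal Python sketch that decodes the M303 commands quoted in this thread. The function name `parse_m303` and the dictionary keys are made up for illustration, and parameters that are not given are left as None rather than guessing the firmware's defaults.

```python
import re

# Parameter letters used by the M303 commands in this thread.
# The dictionary keys are illustrative names, not Marlin identifiers.
PARAM_NAMES = {"E": "heater_index", "S": "target_celsius", "C": "cycles"}

def parse_m303(command):
    """Decode an M303 line; parameters that are absent stay None."""
    words = command.split()
    if not words or words[0].upper() != "M303":
        raise ValueError(f"not an M303 command: {command!r}")
    params = dict.fromkeys(PARAM_NAMES.values())
    for word in words[1:]:
        match = re.fullmatch(r"([A-Za-z])(-?\d+(?:\.\d+)?)", word)
        if match is None:
            continue  # ignore anything that is not letter+number
        letter, value = match.group(1).upper(), float(match.group(2))
        if letter in PARAM_NAMES:
            # E and C are whole numbers; S is a temperature
            params[PARAM_NAMES[letter]] = value if letter == "S" else int(value)
    return params

for cmd in ("M303 C3 S200", "M303 E0 S200 C3", "M303 E-1 S70 C8"):
    print(cmd, "->", parse_m303(cmd))
```

Note that the command in step 1 carries no E parameter, so the sketch reports the heater index as None for it.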

Additional Information

The LCD interface will eventually lock up as well, regardless of serial connectivity. It stops responding to button input and stops updating temperature values.

Nothing is fed back through the serial connection:

M303 E0 S200 C3
SENDING:M303 E0 S200 C3
PID Autotune start

And nothing follows.

Confirmed to also happen on 2.0.6.0 by user Skorpi on Discord.

Possibly related

#19148
#19186

@TheNitek
Contributor

TheNitek commented Aug 29, 2020

Might even be related to #19103 and #19148?

@swilkens
Contributor Author

swilkens commented Aug 29, 2020

I went through old binaries I had compiled (on the dates listed below). I always do an M502 followed by an M500 before testing.

Looks like this has been going on for a while, but it's difficult for me to accept this issue has been ongoing for so long without anybody noticing...

It's also strange to me that this only happens with the PID autotune, other functions seem to work perfectly fine.

Build                             Result
Marlin-bugfix-2.0.x-2020-07-28    Same problem
Marlin-bugfix-2.0.x-2020-07-25    Same problem
Marlin-bugfix-2.0.x-2020-06-15    Same problem
Marlin-bugfix-2.0.x-2020-05-24    Same problem
Marlin 2.0.5.3 2020-03-31         Same problem

Looking through merged commits relating to PID autotune now, there can't be that many.

@swilkens
Contributor Author

Changed default_envs from STM32F103RC_btt_512K_USB to STM32F103RC_btt and the problem goes away. Maybe this is caused by the hybrid interface.

@swilkens
Contributor Author

swilkens commented Aug 29, 2020

So it does look like the hybrid interface has some interaction here, though it could be that the driver on the PC itself is involved?

For now, with 2.0.6.1, this is the state:

Environment                 Result
STM32F103RC_btt             No problem
STM32F103RC_btt_USB         Printer hangs
STM32F103RC_btt_512K        No problem
STM32F103RC_btt_512K_USB    Printer hangs

Changing issue title.

@swilkens changed the title from "[BUG] PID Autotune locks up device unless serial is interrupted" to "[BUG] PID Autotune in combination with hybrid USB causes lockup" on Aug 29, 2020
@thisiskeithb
Member

Have you tried STM32F103RC_btt_USB?

@swilkens
Contributor Author

swilkens commented Aug 29, 2020

> Have you tried STM32F103RC_btt_USB?

I have now, and that fails as well.

That also means that this fails with *_USB all the way back to Marlin 2.0.5.3 2020-03-31 which leads me to believe this may still be a driver issue on the PC. I'll try to re-init.

Update: Driver update didn't change the situation.

@swilkens
Contributor Author

swilkens commented Sep 1, 2020

Is anybody with similar hardware able to confirm this one? SKR Mini E3 V1.0 / 1.1 / 1.2 should do. Perhaps even 2.0.

@thisiskeithb
Member

I tried running a PID autotune from OctoPrint with an SKR Mini E3 V2.0 running the latest bugfix-2.0.x (2979da7) and results were a bit different:

  • STM32F103RC_btt_512K_USB: OctoPrint disconnects part way through the PID tune with a "SerialException: 'device reports readiness to read but returned no data (device disconnected or multiple access on port?)'" error and I'm unable to connect again until the PID tune is complete.
  • STM32F103RC_btt_512K: No disconnects; PID tune completes as expected.

@swilkens
Contributor Author

swilkens commented Sep 2, 2020

I see the same, except that my board (V1.2) ends up emitting short beeps followed by a long beep, which I believe is thermal protection?

I also notice the LCD becomes static and unresponsive to button inputs. Can you confirm? (If you still have a screen attached.)

@github-actions

github-actions bot commented Oct 3, 2020

This issue has had no activity in the last 30 days. Please add a reply if you want to keep this issue active, otherwise it will be automatically closed within 7 days.

@userosos

userosos commented Oct 3, 2020

I also get this error on an SKR Mini V1.1. I use Repetier-Server. When I send M303 E-1 S70 C8 to the motherboard from Repetier's console, I see this in the console:

Recv:16:34:59.765: PID Autotune start
Mesg:16:35:29.346: Connection closed by os.
Mesg:16:35:30.349: Dtr: true Rts: true
Mesg:16:35:30.350: Connection started
Mesg:16:35:30.350: Dtr: false Rts: false
Mesg:16:35:41.489: Dtr: true Rts: true (2)
Mesg:16:35:41.489: Connection started
Mesg:16:35:41.489: Dtr: false Rts: false
Mesg:16:35:52.724: Dtr: true Rts: true (2)
Mesg:16:35:52.724: Connection started
Mesg:16:35:52.725: Dtr: false Rts: false
Mesg:16:35:52.745: Dtr: true Rts: true
Mesg:16:36:00.064: Connection closed by os.
Mesg:16:36:01.366: Dtr: true Rts: true
Mesg:16:36:01.367: Connection started
Mesg:16:36:01.367: Dtr: false Rts: false
Mesg:16:36:12.801: Dtr: true Rts: true (2)
Mesg:16:36:12.801: Connection started
Mesg:16:36:12.802: Dtr: false Rts: false
Mesg:16:36:24.335: Dtr: true Rts: true (2)
Mesg:16:36:24.336: Connection started
Mesg:16:36:24.336: Dtr: false Rts: false
Mesg:16:36:24.356: Dtr: true Rts: true
Mesg:16:36:30.785: Connection closed by os.
Mesg:16:36:32.387: Dtr: true Rts: true
Mesg:16:36:32.388: Connection started
Mesg:16:36:32.389: Dtr: false Rts: false
Mesg:16:36:44.131: Dtr: true Rts: true (2)
Mesg:16:36:44.131: Connection started
Mesg:16:36:44.131: Dtr: false Rts: false
Mesg:16:36:55.970: Dtr: true Rts: true (2)
Mesg:16:36:55.970: Connection started
Mesg:16:36:55.971: Dtr: false Rts: false
Mesg:16:36:55.991: Dtr: true Rts: true
Mesg:16:37:01.502: Connection closed by os.
Mesg:16:37:03.404: Dtr: true Rts: true
Mesg:16:37:03.405: Connection started
Mesg:16:37:03.405: Dtr: false Rts: false
Mesg:16:37:15.445: Dtr: true Rts: true (2)
Mesg:16:37:15.445: Connection started
Mesg:16:37:15.449: Dtr: false Rts: false
Mesg:16:37:26.483: Dtr: true Rts: true (2)
Mesg:16:37:26.484: Connection started
Mesg:16:37:26.484: Dtr: false Rts: false
Mesg:16:38:07.862: Dtr: true Rts: true (2)
Mesg:16:38:07.863: Connection started
Mesg:16:38:07.863: Dtr: false Rts: false
Mesg:16:38:49.560: Dtr: true Rts: true (2)
Mesg:16:38:49.561: Connection started
Mesg:16:38:49.561: Dtr: false Rts: false
Mesg:16:39:31.261: Dtr: true Rts: true (2)
Mesg:16:39:31.261: Connection started
Mesg:16:39:31.261: Dtr: false Rts: false
Mesg:16:40:12.961: Dtr: true Rts: true (2)
Mesg:16:40:12.962: Connection started
Mesg:16:40:12.963: Dtr: false Rts: false
Mesg:16:40:12.983: Dtr: true Rts: true
Recv:16:40:16.518: bias: 39 d: 39 min: 68.67 max: 70.00 Ku: 74.64 Tu: 28.38
Recv:16:40:16.520: No overshoot
Recv:16:40:16.526: Response while unconnected: Kp: 14.93 Ki: 1.05 Kd: 141.21
Recv:16:40:47.465: bias: 33 d: 33 min: 69.39 max: 70.00 Ku: 136.66 Tu: 30.95
Recv:16:40:47.467: No overshoot
Mesg:16:41:18.490: Warning: Communication timeout - resetting communication buffer.
Mesg:16:41:18.490: Connection status: Buffered:101, Manual Commands: 5, Job Commands: 0
Mesg:16:41:18.490: Buffer used:101 Enforced free byte:10 lines stored:8
Recv:16:41:20.174: bias: 37 d: 37 min: 68.71 max: 70.07 Ku: 69.24 Tu: 32.71
Recv:16:41:20.177: No overshoot
Recv:16:41:20.201: PID Autotune finished! Put the last Kp, Ki and Kd constants from below into Configuration.h
Recv:16:41:20.206: #define DEFAULT_bedKp 13.85
Recv:16:41:20.212: #define DEFAULT_bedKi 0.85
Recv:16:41:20.217: #define DEFAULT_bedKd 150.99
Recv:16:41:20.230: Error:Line Number is not Last Line Number+1, Last Line: 11
Recv:16:41:20.234: Resend: 12
Recv:16:41:20.252: echo:Unknown command: "T"
Recv:16:41:20.264: Error:Line Number is not Last Line Number+1, Last Line: 11
Recv:16:41:20.268: Resend: 12
Mesg:16:41:20.302: Connection closed by os.
Mesg:16:41:21.305: Dtr: true Rts: true
Mesg:16:41:21.305: Connection started
Mesg:16:41:21.310: Dtr: false Rts: false
Mesg:16:41:21.331: Dtr: true Rts: true
Recv:16:41:21.337: echo:Unknown command: "T"
Recv:16:41:21.337: Response while unconnected:ok
Recv:16:41:21.408: FIRMWARE_NAME:Marlin 2.0.6.1 (Sep 26 2020 18:28:31) SOURCE_CODE_URL:https://github.com/MarlinFirmware/Marlin PROTOCOL_VERSION:1.0 MACHINE_TYPE:Sprinter 233 EXTRUDER_COUNT:1 UUID:cede2a2f-41a2-4748-9b12-c55c62f367ff
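
The disconnect pattern in the log above is easier to see as offsets from the autotune start. Here is a small Python sketch, assuming the `Kind:HH:MM:SS.mmm: text` layout seen in this Repetier-Server console output. The helper names are hypothetical, and only a few lines of the log are reproduced.

```python
from datetime import datetime

# Excerpt of the console log above; only the lines of interest are kept.
LOG = """\
Recv:16:34:59.765: PID Autotune start
Mesg:16:35:29.346: Connection closed by os.
Mesg:16:36:00.064: Connection closed by os.
Mesg:16:36:30.785: Connection closed by os.
Mesg:16:37:01.502: Connection closed by os.
Recv:16:41:20.201: PID Autotune finished! Put the last Kp, Ki and Kd constants from below into Configuration.h
Mesg:16:41:20.302: Connection closed by os.
"""

def parse_line(line):
    """Split 'Kind:HH:MM:SS.mmm: text' into (kind, timestamp, text)."""
    kind, rest = line.split(":", 1)
    stamp, text = rest.split(": ", 1)
    return kind, datetime.strptime(stamp, "%H:%M:%S.%f"), text

def disconnect_offsets(log):
    """Seconds from 'PID Autotune start' to each OS-level disconnect."""
    start = None
    offsets = []
    for line in log.splitlines():
        if not line.strip():
            continue
        _kind, when, text = parse_line(line)
        if text.startswith("PID Autotune start"):
            start = when
        elif start is not None and text.startswith("Connection closed by os"):
            offsets.append(round((when - start).total_seconds(), 3))
    return offsets

print(disconnect_offsets(LOG))
```

For this excerpt the first four disconnects land roughly 30 s apart (about 29.6 s, 60.3 s, 91.0 s and 121.7 s after the start), and the fifth comes right after the tune reports that it has finished.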

@swilkens
Contributor Author

swilkens commented Oct 3, 2020

@userosos @thisiskeithb I'm told recent bugfix has some serial fixes, I'm not in a position to try this at the moment - but you might.

@thisiskeithb
Member

For people reporting on this issue, please make sure you're running at least 2.0.7 or the latest bugfix-2.0.x.

@sjasonsmith
Contributor

The recent fixes for serial hangs impact any STM32F1 board which uses a hardware serial port for anything. For many boards this includes the built-in USB serial. It also includes serial-connected displays and TMC stepper drivers.

@userosos

userosos commented Oct 4, 2020

I updated the firmware to 2.0.7. With M303 E-1 S80 C10 I get no disconnect and it works OK.
But with M303 E0 S200 C8 the host is disconnected from the motherboard. After the hotend has heated up, Repetier-Server reconnects to the motherboard and works OK.

@swilkens
Contributor Author

swilkens commented Oct 6, 2020

I have verified with 2.0.7 on my STM32F103RC_btt_512K_USB environment and it still locks up the LCD with no reaction to the encoder button and no LCD updates.

Serial communication also locks up.

@thisiskeithb
Member

I'm not getting LCD lockups, but using the *_USB environments on an SKR Mini E3 V2 still causes the host to disconnect when running a PID autotune on either the hotend or bed using the latest bugfix-2.0.x.

@rhapsodyv
Member

@swilkens can you test this PR? #19671

Thanks!

@thisiskeithb
Member

> can you test this PR? #19671

This fixed the serial disconnects I was seeing on an SKR Mini E3 V2.0 compiled with the _USB environments, but I can't speak to the LCD lockup issue since I wasn't seeing that on my builds.

@swilkens
Contributor Author

> @swilkens can you test this PR? #19671
>
> Thanks!

Eureka!

I have compiled and tested 2.0.7.2 at 8e1ea6a and have found that PID autotune when called over serial now fully completes.

The serial connection remains active, and the LCD (which is serially connected) no longer locks up.

@userosos, can you confirm?

I am locking this issue as fixed, thanks @rhapsodyv

@github-actions

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions bot locked and limited conversation to collaborators on Dec 13, 2020