Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] GCC bug messing with Corner Leveling #21358

Closed
davidtgbe opened this issue Mar 15, 2021 · 18 comments
Closed

[BUG] GCC bug messing with Corner Leveling #21358

davidtgbe opened this issue Mar 15, 2021 · 18 comments

Comments

@davidtgbe
Copy link
Contributor

davidtgbe commented Mar 15, 2021

Did you test the latest bugfix-2.0.x code?

Yes, and the problem still exists.

Bug Description

LEVEL_BED_CORNERS + LEVEL_CORNERS_USE_PROBE not working properly on RELEASE versions.

Bug Timeline

No response

Expected behavior

A correct positioning of the probe in each corner, taking into account the probe_offset, both in RELEASE mode and in DEBUG mode. In my case, the first coordinate of the corner should be @ (10 - (- 36), 10 - (- 8) = (46,18).

Actual behavior

When using LEVEL_BED_CORNERS + LEVEL_CORNERS_USE_PROBE using a RELEASE compiled version, probe offset is not applying correctly. 1st corner coordinate is @(10,10).

When using the DEBUG compiled version, it works as expected.

It would be due to some variable not decalred correctly / compiler optimizations.

Steps to Reproduce

Having this config:

BED LIMITS:

X_MAX_BED = 220
Y_MAX_BED = 220
X_MIN_BED = 0
Y_MIN_BED = 0

TRAVEL LIMITS:

X_MAX_POS = 270
Y_MAX_POS = 220
X_MIN_POS = -1
Y_MIN_POS = -11

PROBE ON THE FRONT-LEFT, RELATIVE TO NOZZLE:

NOZZLE_TO_PROBE_OFFSET { -36, -8, -1.94 }

PROBING_MARGIN = 10

LEVEL_BED_CORNERS
LEVEL_CORNERS_INSET_LFRB { 10, 10, 10, 10 }
LEVEL_CORNERS_USE_PROBE

Building the RELEASE version and leveling the corners of the bed with a probe (BLTOCUH in my case), the first corner (FL) positions the nozzle @10,10, so the probe falls out of the bed. It is clear that the probe compensation has not been applied correctly even though the logic of the code seems to be fine.

Levelling debug log related to 1st corner:

<<< do_blocking_move_to  X10.00 Y10.00 Z5.90
BLTouch DEPLOY requested
BLTouch Command :10
bltouch.deploy_proc() end
do_blocking_move_to  X10.00 Y10.00 Z5.90
>  X10.00 Y10.00 Z1.80
<<< do_blocking_move_to  X10.00 Y10.00 Z1.80
>>> do_blocking_move_to  X10.00 Y10.00 Z1.80
>  X10.00 Y10.00 Z1.90
<<< do_blocking_move_to  X10.00 Y10.00 Z1.90

Version of Marlin Firmware

Latest bugfix (15/4/2021)

Printer model

ET4

Electronics

STM32F407

Add-ons

No response

Your Slicer

No response

Host Software

No response

As a hint, I have debugged code on Marlin Simulator and it is working right... configuration is almost identical to my real machine but with some differences, like BLTOUCH, which I thinks is not currently emulated.

Captura

@davidtgbe davidtgbe changed the title [BUG] (bug summary) LEVEL_CORNERS_INSET_LFRB using absolute coordinates when using LEVEL_BED_CORNERS + LEVEL_CORNERS_USE_PROBE Mar 15, 2021
@davidtgbe davidtgbe changed the title LEVEL_CORNERS_INSET_LFRB using absolute coordinates when using LEVEL_BED_CORNERS + LEVEL_CORNERS_USE_PROBE LEVEL_BED_CORNERS + LEVEL_CORNERS_USE_PROBE not working properly on RELEASE versions. Mar 17, 2021
@davidtgbe
Copy link
Contributor Author

davidtgbe commented Mar 17, 2021

I just realized while debugging that it may be a compiler optimization problem. I attach some debug screenshots of the real machine.

image

image

@Patron54
Copy link

Have the same problem + mesh bed leveling (manual) also ignoring offsets. Using Marlin 2.0.7.2, did'n tried bugfix version.

@thinkyhead
Copy link
Member

Hmm, I see the values are set and do exist…. Interesting.

I notice that the level corner insets are too small for your probe to reach. Maybe there is a maths issue there. Since your X_MAX_POS is the same as your X_MAX_BED try setting the LEVEL_CORNERS_INSET_LFRB to something like { 10, 10, 36, 10 } and see if it gives any better result.

@davidtgbe
Copy link
Contributor Author

@thinkyhead X_MAX_POS = 270. Probe can reach X_MAX_BED = 220 minus INSET (10). The key here is the way code is compiled. With the same configuration and code: debug works fine but release not.

@thinkyhead
Copy link
Member

thinkyhead commented Mar 23, 2021

Yes, I guess that confirms compiler strangeness! I wonder if this would produce a different compiled output…

- current_position -= probe.offset_xy;                // Account for probe offsets
+ current_position.x -= probe.offset_xy.x;
+ current_position.y -= probe.offset_xy.y;

@thinkyhead
Copy link
Member

While I've got you on the line… I'm trying to get OpenBLT set up for a new (STM32F4RTG) board and I have no idea what I'm doing! I started from the Olimex example but the "hooks" seem tailored to that specific board.

Do you suppose your Anet ET4 version of OpenBLT would be a better starting-point for this board (which has a typical 3D-printer pin layout)? So far, after many debugger traces, all I know is that an "Olimex-specific" build of the bootloader running in this new board gets through a bunch of setup —probably wrong for the hardware— but it never hands control over to the firmware.

@davidtgbe
Copy link
Contributor Author

I took https://github.com/feaser/openblt/tree/master/Target/Source/ARMCM4_STM32F4 as start point IIRC. I had no mayor issues configuring OpenBLT (USART #, GPIOs for DFU, hooks, vector table offset etc. You could take a look to commits on my fork to get a view of what is needed to be confgured and where to do it.
Regarding the bug, I will try it when I have a while. Your solution is likely to work, but I think there is a mayor issue there with the compiler optimizations / inlined -= overloaded operand / structures or whatever.

@thinkyhead
Copy link
Member

thinkyhead commented Mar 24, 2021

Thanks for the OpenBLT guidance. I will keep working on it, if only to get more valuable experience with remote debugging….

Meanwhile, regarding this issue I'll take a very close look at the compiler output and also see whether the Godbolt Compiler Explorer versions of the ARM compiler give a similar result. Maybe there is some ambiguity or hole in my operator overrides, or maybe it really is a compiler bug. Only more messing around will tell…!

@davidtgbe
Copy link
Contributor Author

Yes, I guess that confirms compiler strangeness! I wonder if this would produce a different compiled output…

- current_position -= probe.offset_xy;                // Account for probe offsets
+ current_position.x -= probe.offset_xy.x;
+ current_position.y -= probe.offset_xy.y;

I have tested this patch. I really thought it would work, but I have achived the same result. Debug build works fine, realese does not.

@davidtgbe
Copy link
Contributor Author

While a better solution is found...

[PATCH] Disable compiler optimizations on _lcd_test_corners function


index 4cfb4e411b..f94bdf9ab7 100644
--- a/Marlin/src/lcd/menu/menu_bed_corners.cpp
+++ b/Marlin/src/lcd/menu/menu_bed_corners.cpp
@@ -260,6 +260,8 @@ static inline void _lcd_level_bed_corners_get_next_position() {
     return (probe_triggered);
   }
 
+  #pragma GCC push_options
+  #pragma GCC optimize ("-O0")
   void _lcd_test_corners() {
     bed_corner = TERN(LEVEL_CENTER_TOO, center_index, 0);
     last_z = LEVEL_CORNERS_HEIGHT;
@@ -305,6 +307,7 @@ static inline void _lcd_level_bed_corners_get_next_position() {
     ui.goto_screen(_lcd_draw_level_prompt); // prompt for bed leveling
     ui.set_selection(true);
   }
+#pragma GCC pop_options
 
 #else // !LEVEL_CORNERS_USE_PROBE

@thinkyhead
Copy link
Member

thinkyhead commented Apr 16, 2021

As long as we're experimenting, see if this change makes any difference…

- static inline void _lcd_level_bed_corners_get_next_position() {
+ static void _lcd_level_bed_corners_get_next_position() {

In fact, might as well remove all the inline function specifiers.

@thinkyhead thinkyhead changed the title LEVEL_BED_CORNERS + LEVEL_CORNERS_USE_PROBE not working properly on RELEASE versions. [BUG] GCC bug messing with Corner Leveling Apr 16, 2021
@rhapsodyv
Copy link
Member

Try build without this flag: -fmerge-constants

It's added in the common section for all.

@davidtgbe
Copy link
Contributor Author

@rhapsodyv you got it!
I've added a build_unflag to my environment and it is working.

build_unflags        = ${common_stm32.build_unflags} -fmerge-constants

@rhapsodyv
Copy link
Member

I already saw it months ago and we dropped -fmerge-constants-all... but looks like -fmerge-constants is also buggy.

I did a PR to fix for all: #21711

Better remove that buggy flag to avoid random and hard to track issues.

@thinkyhead
Copy link
Member

thinkyhead commented Apr 26, 2021

-fmerge-constants
Attempt to merge identical constants (string constants
and floating-point constants) across compilation units.

This option is the default for optimized compilation if
the assembler and linker support it. Use -fno-merge-constants
to inhibit this behavior.

Enabled at levels -O, -O2, -O3, -Os.

We do want all our strings to get merged across compilation units, if possible. But I just did a compile of the first mega2560 test, and the flag made no difference in build size. As mentioned, its behavior is still enabled with certain compile optimization levels, but maybe the explicit flag is having extra effects. My guess is that this bug exists only in certain versions and builds of GCC. Maybe it's a known issue — I'll look into it.

Anyway… based on my quick test the flag seems to be extraneous anyway, so it can go. But if too-big build size (due to redundant strings) comes up as an issue again in future we might revisit the option and see if it's still bugging out.

@rhapsodyv
Copy link
Member

I pointed the first issue we had with merge constants, it have a lot of information, including link to gcc issues: #20389

@thinkyhead
Copy link
Member

What is needed now is an additional test for CI: We do a native build of Marlin with Asserts and Tests turned on. Then we run it after the build is completed. The test build of Marlin runs a suite of self-tests, sending itself various G-codes and checking a whole range of variable states for known-good results, and produces as output a checklist with PASS/FAIL labels. We'll also be able to run these tests on real embedded hardware any time to get a report of its particular compiler behavior.

@github-actions
Copy link

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked and limited conversation to collaborators Jun 26, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants