ImVec2 [] operator aliasing #6272

BayesBug · 2023-03-24T23:59:17Z

Version/Branch of Dear ImGui:

Version: 1.60+
Branch: all

Back-end/Renderer/Compiler/OS

Back-ends: all
Compiler: all
Operating System: all

My Issue/Question:

ImVec2 got its operator[] in ad09396 accessing the float via pointer access to x. This was syntactically changed later on but basically still works the same way now.
Unfotunately this is wrong as:

you are not guaranteed that no padding will occour (probably not a problem in practice)
this vialoates aliasing rules and compilers may see it as an optimization opportunity (see https://community.intel.com/t5/Intel-C-Compiler/ImGui-Compilation-Problem/td-p/1461883/)

Screenshots/Video

N/A

Standalone, minimal, complete and verifiable example:

See the linked Intel community forum thread.

The text was updated successfully, but these errors were encountered:

ocornut · 2023-03-25T11:29:39Z

Could you confirm the result of that last reduced sample code with the Intel compiler?

I am not familiar with aliasing rules well enough to tell if this is actually UB, and what would be an adequate and efficient fix (aka not turning the class into some nightmarish union).

BayesBug · 2023-03-25T12:41:47Z

Yes the reduced example code (basically return (&x)[idx];) exhibits the behaviour. To my undestanding rightfully so as a pointer to a float is used to access beyond its boundaries. Also with large anough alignment there will be padding between x and y also breaking ImVec2.

A save replacement could be:
return idx ? y : x;
It does:

not require changing the structure of ImVec2
not have alignment issues
not have aliasing issues
not impact performance to much (likely just a cmov on the address?)

An efficent replacement could be:
return *reinterpret_cast<float*>(reinterpret_cast<unsigned char*>(&x) + idx * (offsetof(ImVec2, y) - offsetof(ImVec2, x)));
It does:

not require changing the structure of ImVec2
take alignment into account
use the char pointer exemption to some aliasing requirements
allow the compiler to efficently optimize (same machine code for IntelC++ as old version with strict aliasing rules disabled)
I'm by no means an expert on this so this solution might also have issues.

EDIT:
Tried to undestand it a little bit more. The char type in the second version has to be unsigned in any case. But I can still not reassuredly figure out wether this construction is valid. The save efficent method might indeed involve some convoluted memcpy structure as @zao commented.

zao · 2023-03-25T12:42:01Z

I last ran into a production problem with this kind of thing on macOS in the mid 00s with GCC, where in a very niche case my Z component became invalid in a glVertex3fv(&v.x) call.

In my case it was in a very particular type of leaf function where the compiler had statically determined that the only two components it needed to keep around on the stack were X and Y as none of the code had touched Z enough, so OpenGL read unrelated stack contents for the final component.

The standards reading I did at the time which should still hold today is that through a pointer to a member the only subobject reachable is the subobject itself. There is a related rule that you may cast between a pointer to an object and a pointer to the first subobject in a POD type (which I guess is called something else in modern C++) and there's some wording around common initial subsequences, but I do not believe they help here.

In my case when calling an external function with a pointer argument, it sufficed to cast the whole object's storage into a float const*. I was going to cite GLM as a reference on how to do it but apparently their glm::value_ptr is implemented as &v.x, which I guess may or may not be a problem depending on which one of the internal storage choices an user makes there.

Implementation-wise, safe approaches include:

branching with a switch, if or ternary choice to select the correct member;
memcpy (<cstring>) out of the underlying object storage to a temporary, maybe with offsetof (<cstddef>).

The approach of reinterpreting the underlying storage for the whole struct as a float pointer and indexing that is slightly more sketchy depending on how you read the aliasing rules but should be "better" than forming a pointer to a member.

ocornut · 2023-03-28T12:53:06Z

EDIT Sorry I forgot to thank you both for your detailed answers. Thank you!

My main concern is getting this fixed in whichever compiler is not happy about this, rather than respecting some theoretical ruling. We probably a lots more low-level stuff in the codebase which are practical and effectively functional everywhere.

I am concerned with Intel compiler having an issue here.
I am not concerned with possible padding: I believe this would be a theoretical or manufactured at this point.
Calling e.g. memcpy is out of question as this would lead to bloat in typical-debug-build.

EDIT.2 For avoidance of doubt, performances under typical "Debug" builds condition are as important to us as performances in more Optimized build conditions. To be realistic, in this very case, the stuff dragged by the assert macro is likely most costly but I'm interested in this as the operator is very convenient and I'm tempted to use it more.

I quickly compared those 3 versions:
(1)

float  operator[] (size_t idx) const    { IM_ASSERT(idx == 0 || idx == 1); return (&x)[idx]; }  // We very rarely use this [] operator, the assert overhead is fine.
float& operator[] (size_t idx)          { IM_ASSERT(idx == 0 || idx == 1); return (&x)[idx]; }  // We very rarely use this [] operator, the assert overhead is fine.

(2)

float  operator[] (size_t idx) const    { IM_ASSERT(idx == 0 || idx == 1); return idx ? y : x; }  // We very rarely use this [] operator, the assert overhead is fine.
float& operator[] (size_t idx)          { IM_ASSERT(idx == 0 || idx == 1); return idx ? y : x; }  // We very rarely use this [] operator, the assert overhead is fine.

(3)

float  operator[] (size_t idx) const    { IM_ASSERT(idx == 0 || idx == 1); return *(const float*)((const unsigned char*)(&x) + idx * sizeof(float)); }  // We very rarely use this [] operator, the assert overhead is fine.
float& operator[] (size_t idx)          { IM_ASSERT(idx == 0 || idx == 1); return *(float*)((unsigned char*)(&x) + idx * sizeof(float)); }              // We very rarely use this [] operator, the assert overhead is fine.

VS 2015 Debug longer and more branchycodegen for (2) in the operator [] function.
VS 2015 Release. On a simple stress test over all dear imgui I couldn't measure a difference.

Could you confirm that (3) written as-is fixes the issue for Intel Compiler?
What about changing the two unsigned char to char ?
Why did you mention "the char type in the second version has to be unsigned in any case" ? I thought char* would be included in the aliasing exception?

emoon · 2023-03-28T16:32:58Z

I haven't looked at the details here, but isn't an anonymous union what you want? Something like:

class ImVec2 {
  union {
    float a[2];
    float x,y;
  };
  ...
};

ocornut · 2023-03-28T17:34:56Z

That may be cargo-culting but I'm really worried about the side-effect of an union on compilers especially for such a commonly and massively used type (and in contrast the [] operator is quite rarely used, which is why it can afford the assert).

BayesBug · 2023-03-28T17:35:41Z

@ocornut
Yes version 3 does fix the issue for the Intel Compiler and gets fully optimized. (Though i'm still not 100% sure its legit.. but it works)

The unsigned vs. char thingy i must have hallucinated, i could have sworn i saw something about it and then reading about it in this Article had me convinced, but goining over it again its likely a noissue.

@emoon

I haven't looked at the details here, but isn't an anonymous union what you want? Something like:
class ImVec2 {
  union {
    float a[2];
    float x,y;
  };
  ...
};

And now x and y would share the same memory. not ideal ;)
Also there is a lot of UB surrounding unions in C++ (big difference C) especially in this context (see e.g. this Article)

ocornut · 2023-03-28T18:11:52Z

Thank you for confirming.
Out of curiosity, does this variation also works?

float& operator[] (size_t idx)          { IM_ASSERT(idx == 0 || idx == 1); return ((float*)(char*)&x)[idx]; }
float  operator[] (size_t idx) const    { IM_ASSERT(idx == 0 || idx == 1); return ((const float*)(const char*)&x)[idx]; }

BayesBug · 2023-03-28T19:07:23Z

Yes it also works in the limited set (imgui master and reduced code example) I tested BUT the compiler's optimization behavior is unfortunately not as derterministic as one would like. Completely unrelated changes (like in my reduced example on the Intel community forum just adding more output) can throw it off and you will not observe the same behavior. And my guess would still be that all the pointer cast variants here are technically UB.

ocornut · 2023-03-28T19:15:20Z

Completely unrelated changes [...] can throw it off

Do you mean WITH those solutions, or with the current code in master?

BayesBug · 2023-03-28T21:03:41Z

The solutions here worked in all tested configurations.
I was talking about the variant in the current master that does not always trigger "catastrophic optimization" given seemingly unrelated changes in the rest of the program.

emoon · 2023-03-29T06:01:38Z

@BayesBug

And now x and y would share the same memory. not ideal ;)

Doh, typo of course should have been a struct { float x,y } there

Also there is a lot of UB surrounding unions in C++ (big difference C)

Fair enough, I have mostly used them in C

BayesBug · 2023-03-29T08:51:15Z

@emoon
I think that anonymous structs are unfortunately purely a C thing though I guess likely most current C++ compilers would support it in that they support C11?

zao · 2023-03-29T09:32:27Z

float& operator[] (size_t idx)          { IM_ASSERT(idx == 0 || idx == 1); return ((float*)(char*)&x)[idx]; }
float  operator[] (size_t idx) const    { IM_ASSERT(idx == 0 || idx == 1); return ((const float*)(const char*)&x)[idx]; }

I'm feeling a bit uneasy about using a pointer to the first member rather than the object as a whole here as that's what got us into this mess in the first place, but I have nothing concrete here to back this up except my experience so consider this more of an observation.

…ntel C++ compiler. (#6272) Note that this is not BayesBug's exact intended solution, so issues would be my responsibility ;)

…ntel C++ compiler. (#6272) Note that this is not BayesBug's exact intended solution, so issues would be my responsibility ;) Amended.

ocornut · 2023-03-29T10:52:44Z

I'm feeling a bit uneasy about using a pointer to the first member rather than the object as a whole here

Right. I tested with this instead of &x and got same codegen with VS2022 in both Debug and Release in areas I looked at, so I went for this.

I agree this is all a bit uneasy but I currently have to prioritize:

guaranteed performances on all setups
readability (not make it unnecessarily scary C++ if it can be avoided).

So going for that "simpler" solution for now, and in the event we do ever find evidence that there are further real issues we'll react to them, but I'd rather do the strict minimum to solve an existing problem now. Pushed a38e3c2

Thanks a lot for all bringing your expertise with this!

Added by a38e3c2

floooh · 2023-05-20T14:38:24Z

@emoon

And now x and y would share the same memory. not ideal ;)

Doh, typo of course should have been a struct { float x,y } there

Also there is a lot of UB surrounding unions in C++ (big difference C)

Fair enough, I have mostly used them in C

This usage of unions is valid in C, but (technically) undefined behaviour in C++ (but still typically works as 'non-standard language extension' - personally I haven't seen it break so far in C++). From https://en.cppreference.com/w/cpp/language/union:

ocornut changed the title ~~ImVec2 is bugged~~ ImVec2 [] operator aliasing Mar 25, 2023

ocornut added bug building labels Mar 29, 2023

ocornut added a commit that referenced this issue Mar 29, 2023

Fixed ImVec2 operator[] violating aliasing rules causing issue with I…

f63b854

…ntel C++ compiler. (#6272) Note that this is not BayesBug's exact intended solution, so issues would be my responsibility ;)

ocornut added a commit that referenced this issue Mar 29, 2023

Fixed ImVec2 operator[] violating aliasing rules causing issue with I…

a38e3c2

…ntel C++ compiler. (#6272) Note that this is not BayesBug's exact intended solution, so issues would be my responsibility ;) Amended.

ocornut closed this as completed Mar 29, 2023

ocornut added a commit that referenced this issue Apr 4, 2023

Fixed ImVec2 operator[] warning.in Clang. (#6272)

9a1e09e

Added by a38e3c2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ImVec2 [] operator aliasing #6272

ImVec2 [] operator aliasing #6272

BayesBug commented Mar 24, 2023

ocornut commented Mar 25, 2023

BayesBug commented Mar 25, 2023 •

edited

Loading

zao commented Mar 25, 2023

ocornut commented Mar 28, 2023 •

edited

Loading

emoon commented Mar 28, 2023 •

edited

Loading

ocornut commented Mar 28, 2023

BayesBug commented Mar 28, 2023 •

edited

Loading

ocornut commented Mar 28, 2023

BayesBug commented Mar 28, 2023

ocornut commented Mar 28, 2023

BayesBug commented Mar 28, 2023

emoon commented Mar 29, 2023 •

edited

Loading

BayesBug commented Mar 29, 2023

zao commented Mar 29, 2023

ocornut commented Mar 29, 2023

floooh commented May 20, 2023

ImVec2 [] operator aliasing #6272

ImVec2 [] operator aliasing #6272

Comments

BayesBug commented Mar 24, 2023

ocornut commented Mar 25, 2023

BayesBug commented Mar 25, 2023 • edited Loading

zao commented Mar 25, 2023

ocornut commented Mar 28, 2023 • edited Loading

emoon commented Mar 28, 2023 • edited Loading

ocornut commented Mar 28, 2023

BayesBug commented Mar 28, 2023 • edited Loading

ocornut commented Mar 28, 2023

BayesBug commented Mar 28, 2023

ocornut commented Mar 28, 2023

BayesBug commented Mar 28, 2023

emoon commented Mar 29, 2023 • edited Loading

BayesBug commented Mar 29, 2023

zao commented Mar 29, 2023

ocornut commented Mar 29, 2023

floooh commented May 20, 2023

BayesBug commented Mar 25, 2023 •

edited

Loading

ocornut commented Mar 28, 2023 •

edited

Loading

emoon commented Mar 28, 2023 •

edited

Loading

BayesBug commented Mar 28, 2023 •

edited

Loading

emoon commented Mar 29, 2023 •

edited

Loading