A memcpy with a constant length is lowered to a (fast) sequence of load and store instructions. A memcpy with a non-constant length is lowered to a call to the memcpy function, which is slow for short copies.
For example, a memcpy whose length is a zext i1 (so the length is either 0 or 1) is equivalent to a conditional load and store of a single byte, but the generated IR (and assembly) still contains a call to memcpy.
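A minimal C sketch of the pattern described above, not the reporter's original test case. The function names are hypothetical, and whether a given compiler version actually emits a library call for the first function is an assumption to check against its output.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The length is a bool widened to size_t, so it is 0 or 1 but not a
   compile-time constant. A compiler may lower this to a real call to
   memcpy (assumption: depends on the compiler and version). */
void copy_flag_len(unsigned char *dst, const unsigned char *src, bool cond) {
    memcpy(dst, src, (size_t)cond);
}

/* Hand-written equivalent: a conditional load and store of one byte. */
void copy_flag_len_branch(unsigned char *dst, const unsigned char *src, bool cond) {
    if (cond) {
        dst[0] = src[0];
    }
}
```

Both functions leave dst untouched when cond is false and copy exactly one byte when it is true, so the second form is what the optimizer would ideally produce from the first.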
The general transformation here is turning "memcpy(a, b, cond ? c : d);" into "if (cond) memcpy(a, b, c); else memcpy(a, b, d);". The hard part is figuring out when it's profitable; this transform just bloats the code unless it simplifies somehow. Ways it can simplify:
If c or d is zero, one of the memcpys goes away.
If c or d is constant, alias analysis becomes more accurate (though this papers over a weakness in alias analysis more than it provides an actual benefit).
If c or d is constant, we can potentially hoist loads across one or both paths.
If c or d is a small constant, we can inline one or both memcpys.
If a or b is an alloca, and c and d are constant, we can potentially unblock SROA.
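The general transformation can be sketched at the source level in C. This is only an illustration under stated assumptions: LEN_C and LEN_D are hypothetical constant lengths standing in for c and d, the function names are made up, and a real implementation would operate on LLVM IR rather than C.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical constant lengths standing in for c and d. */
enum { LEN_C = 4, LEN_D = 16 };

/* Before: one memcpy whose length is a select between two values. */
void copy_select(void *a, const void *b, bool cond) {
    memcpy(a, b, cond ? LEN_C : LEN_D);
}

/* After: one memcpy per branch, each with a constant length that can be
   expanded inline into loads and stores. */
void copy_split(void *a, const void *b, bool cond) {
    if (cond) {
        memcpy(a, b, LEN_C);
    } else {
        memcpy(a, b, LEN_D);
    }
}
```

With both lengths small and constant, each branch of copy_split is a candidate for inline expansion. If LEN_C were 0, the first branch would disappear entirely, which is the first simplification in the list.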
Maybe we could perform this transform in memcpyopt? Or we could try to do something very late, in CodeGenPrepare, just to allow inlining the memcpy. It's hard to gauge what's appropriate because this sort of code is very rare in C and C++, as far as I know.
Extended Description
The call to memcpy emitted for a non-constant length causes slowness in Rust's Cursor::read, which we discovered in PR rust-lang/rust#37573.