
Description
Explicit Shift Operators
This proposal suggests making bit-shifting operations explicit and type-agnostic.
But why?
Currently, the behavior of `>>` and `@shrExact` depends on the type being worked on, so the user has to know that type to predict the result.
const warn = @import("std").debug.warn;

pub fn main() void {
    const signed = @bitCast(u8, @intCast(i8, 1) << 7 >> 7);
    warn("Signed: {0b:0>8}\n", .{signed});
    const unsigned = @bitCast(u8, @intCast(u8, 1) << 7 >> 7);
    warn("Unsigned: {0b:0>8}\n", .{unsigned});
    const signedExact = @bitCast(u8, @shrExact(@intCast(i8, 1) << 7, 7));
    warn("SignedExact: {0b:0>8}\n", .{signedExact});
    const unsignedExact = @bitCast(u8, @shrExact(@intCast(u8, 1) << 7, 7));
    warn("UnsignedExact: {0b:0>8}\n", .{unsignedExact});
}
which outputs:
Signed: 11111111
Unsigned: 00000001
SignedExact: 11111111
UnsignedExact: 00000001
Needing to know the type to know what an operator does is a common argument against operator overloading, so by this logic the current behavior of Zig's right shift is inconsistent with its own opinions.
Furthermore, this implicit operator overloading makes it difficult to port shift-heavy bitwise algorithms between signed and unsigned types, as the behavior is now completely different. For example, take this conversation between intellectual gentlemen:
<DrJensen> https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=cfb576205954af096255c96e8a9dcc24
<DrJensen> comparing my software division output to the hardware division
output
<DrJensen> "hard scalar" is correct
<DrRichmond> Oh, the easy solution is just promote all your i8s to i16s to do
the division
<DrJensen> "soft scalar" outputs 0 instead of -42 when dividing -128 by 3
<DrRichmond> https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4e0c5a75f93bbf42890647f78415cfeb
<DrJensen> but where is it overflowing?
<DrJensen> dividend.wrapping_abs() I guess
<DrJensen> oh probably `((dividend & (1 << i)) >> i)`
<DrJensen> which means if I turn it into a u8..
<DrJensen> https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=3cd9a6633c124d1612d0d7400ae22fe9 :D
* DrJensen CLAPS
<DrJensen> ajajajajajaja
<DrRichmond> wut the fug
<DrRichmond> why does the signedness matter? they both have 8 bits
<DrJensen> ofc it does, `<< i >> i` will do a signed rightshift if it's an i8
<DrJensen> which floods with 1's
<DrJensen> converting to u8 will flood 0's
<DrRichmond> what
<DrRichmond> fuck that noise
<DrJensen> mfw DrRichmond is too used to >>>= from all his java programming
<DrRichmond> I just think bit shift should actually operate at the bit level
and not have a different meaning depending on whether your number
is signed or not
<DrJensen> ye maybe
<DrJensen> I don't really care tbh
<DrRichmond> maybe you should because your code would have been correct the
first time if rust didn't break convention here)
<DrJensen> except...it doesn't
<DrJensen> "When shifting an unsigned value, the >> operator in C is a logical
shift. When shifting a signed value, the >> operator is an arithmetic
shift."
<DrJensen> https://stackoverflow.com/questions/7622/are-the-shift-operators-arithmetic-or-logical-in-c
<DrJensen> or technically, "implementation defined" ;)
<DrRichmond> :|
<DrJensen> would you like me to teach you C++, DrRichmond?
<DrJensen> It's okay, I can make time.
<DrMiller> you can make time but can you make game
<DrRichmond> fuckitdood the more you know
The amount of ignorance and confusion in this conversation spans multiple dimensions. We could be smug about knowing arcane rules, but there's clearly something wrong about this simple operator. They almost halved their throughput! There is unnecessary cognitive overhead when working with right-shift.
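The bug from the conversation can be reproduced in a few lines. This is a minimal Rust sketch, not the code behind the linked playgrounds: the helper names `bit_signed` and `bit_unsigned` are hypothetical, and only the expression `(dividend & (1 << i)) >> i` is taken from the log.

```rust
/// Isolates bit `i` of a signed byte with the expression quoted in the log.
/// Assumes `i < 8`; larger shift amounts panic in debug builds.
fn bit_signed(dividend: i8, i: u32) -> i8 {
    // On `i8`, `>>` is an arithmetic shift: vacated bits copy the sign bit.
    (dividend & (1i8 << i)) >> i
}

/// The same expression on an unsigned byte. Assumes `i < 8`.
fn bit_unsigned(dividend: u8, i: u32) -> u8 {
    // On `u8`, `>>` is a logical shift: vacated bits are filled with zeros.
    (dividend & (1u8 << i)) >> i
}
```

For the top bit, `bit_signed(-128, 7)` yields `-1` while `bit_unsigned(0x80, 7)` yields `1`, even though both inputs are the bit pattern `10000000`. For every lower bit the two functions agree, which is what makes the difference easy to miss.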
Ergo! We take (the only?) good idea from a certain other language and break shifting into separate operators, with some differences.
"We should toss in some more operators, it responded well to that."
In Zig, there are four kinds of bitshift, each available as an operator, an assignment operator, and two builtins:
Left Bitshift: `<<`, `<<=`, `shlExact`, `shlWithOverflow`
Right Bitshift: `>>`, `>>=`, `shrExact`, `shrWithOverflow`
Sticky Left Bitshift: `<<<`, `<<<=`, `shlStickyExact`, `shlStickyWithOverflow`
Sticky Right Bitshift: `>>>`, `>>>=`, `shrStickyExact`, `shrStickyWithOverflow`
The `<<` and `>>` operators are 'zero-flooding', i.e., they move the bit-buffer in a direction and leave 0's in its wake. However, it is sometimes convenient to use the 'sticky' bitshifts, which fill the wake with the value of the original leftmost bit (in `>>>`) or rightmost bit (in `<<<`).
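The four shifts can be emulated on a byte to make the definitions concrete. The sketch below is in Rust and is only an illustration of the semantics described above, not an implementation of the proposal; the function names are hypothetical, and every function assumes a shift amount `n < 8`.

```rust
/// Zero-flooding left shift (proposed `<<`): vacated low bits become 0.
fn shl_zero(x: u8, n: u32) -> u8 {
    x << n
}

/// Zero-flooding right shift (proposed `>>`): vacated high bits become 0.
fn shr_zero(x: u8, n: u32) -> u8 {
    x >> n
}

/// Sticky right shift (proposed `>>>`): vacated high bits copy the
/// original leftmost bit.
fn shr_sticky(x: u8, n: u32) -> u8 {
    // Rust's `>>` on a signed integer already copies the top bit.
    ((x as i8) >> n) as u8
}

/// Sticky left shift (proposed `<<<`): vacated low bits copy the
/// original rightmost bit.
fn shl_sticky(x: u8, n: u32) -> u8 {
    // Mirror the byte, sticky-shift it right, then mirror it back.
    shr_sticky(x.reverse_bits(), n).reverse_bits()
}
```

With these definitions, `shr_zero(0b1000_0000, 7)` gives `0b0000_0001` and `shr_sticky(0b1000_0000, 7)` gives `0b1111_1111`, while `shl_sticky(0b0000_0001, 3)` gives `0b0000_1111`.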
Bitshifts are sometimes used to simulate arithmetic operations using simpler hardware. For example, in some situations, left-shifting a number is equivalent to multiplying it by a power of 2. However, this is only consistent for unsigned integers and fixed-point values, as the left shift may spill into the sign bit (the MSB) of a two's complement number:
const a: u8 = 1 << 3; // 8 = 1 * 2^3
const b: i8 = 2 << 5; // 64 = 2 * 2^5
const c: i8 = 64 << 1; // -128 != 64 * 2^1!
Similarly, a right shift may, in some situations, be used to divide by a power of 2, but this is only consistent for unsigned integers and fixed-point values, as shifted signed numbers do not follow division's rounding rules:
const a: u8 = 128 >> 1; // 64 = 128 / 2^1
const b_incorrect: i8 = -128 >> 1; // 64 != -128 / 2^1!
const b_correct: i8 = -128 >>> 1; // -64 = -128 / 2^1
const b_what: i8 = -1 >> 1; // 127 != -1 / 2^1!
const b_ohno: i8 = -1 >>> 1; // -1 != -1 / 2^1 (which is 0)!
For this reason, manually representing multiplication or division via shifts is a choice one should consider very carefully. Keep in mind the compiler will automatically convert multiplies and divides to shifts whenever it has enough information to do so.
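The rounding mismatch can be checked against real division. The following Rust sketch uses hypothetical helper names and assumes `p < 8`; note that Rust's `>>` on `i8` is an arithmetic shift, so it corresponds to the sticky `>>>` described above rather than to the zero-flooding `>>`.

```rust
/// "Multiplies" by 2^p with a left shift; bits shifted past the MSB are lost.
fn mul_pow2_by_shift(x: i8, p: u32) -> i8 {
    x << p
}

/// "Divides" by 2^p with an arithmetic right shift, which rounds toward
/// negative infinity.
fn div_pow2_by_shift(x: i8, p: u32) -> i8 {
    x >> p
}

/// Divides by 2^p with integer division, which rounds toward zero.
/// The work is done in `i32` so that the divisor 2^7 is representable.
fn div_pow2(x: i8, p: u32) -> i8 {
    ((x as i32) / (1i32 << p)) as i8
}
```

The two division helpers agree for nonnegative inputs and for negative inputs that divide exactly, such as `-128` with `p = 1`. They disagree whenever a negative input has a remainder: `div_pow2_by_shift(-1, 1)` is `-1`, while `div_pow2(-1, 1)` is `0`. On the left-shift side, `mul_pow2_by_shift(64, 1)` wraps to `-128`.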
Wiggling, Pulsating Innards
Implementation is fairly straightforward, but some gotchas are likely to arise.
Drawbacks
- Different. As far as the author is aware, no other language has defined shifting like this, although it is arguably less strange than what Java and JavaScript define.
- Basically the opposite of Java's and JavaScript's definitions, so it's likely to knot up caffeinated greybeards' muscle memory.
Rationale and alternatives
C's shifting is defined as follows:
The result of `E1 << E2` is `E1` left-shifted `E2` bit positions; vacated bits are filled with zeros. If `E1` has an unsigned type, the value of the result is `E1 × 2^E2`, reduced modulo one more than the maximum value representable in the result type. If `E1` has a signed type and nonnegative value, and `E1 × 2^E2` is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
The result of `E1 >> E2` is `E1` right-shifted `E2` bit positions. If `E1` has an unsigned type or if `E1` has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of `E1 / 2^E2`. If `E1` has a signed type and a negative value, the resulting value is implementation-defined.
-- http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
This undefined and implementation-defined behavior should be taken in context with the era:
Arithmetic right shifts for negative numbers are equivalent to division using rounding towards 0 in one's complement representation of signed numbers as was used by some historic computers, but this is no longer in general use.
-- https://en.wikipedia.org/wiki/Arithmetic_shift#Non-equivalence_of_arithmetic_right_shift_and_division
So, for one's complement machines, having `n << p` and `n >> p` be synonymous with `n * 2^p` and `n / 2^p` is sensible for both signed and unsigned numbers. In two's complement, this is no longer sensible. C's indecisive solution is to essentially allow the hardware engineers to overload the `<<` and `>>` operators.
By definition, we cannot rely on this. Other languages have tried to imitate C's indecisiveness in this matter, allowing language behavior to be defined by historical curiosity and ambivalent waffling.
By defining separate operators for shifts, in terms of zero vs sticky, we sidestep the arithmetic context altogether and treat them exactly as they are: bitwise operators that move and flood. Any arithmetic context is a convenient side-effect.
Alternative Implementation: Chop off its hands
We could have only two bitshift operators, `<<` and `>>`, and remove sticky shifting altogether, requiring any equivalent functionality to be recreated using `~(mask << a)` and `~(mask >> a)`. However, this is less desirable, as the mask created by sticky shifting is conditional on the MSB or LSB of the bit-buffer, requiring a rewrite to involve an `if` / `else` branch, which is much less zen-inducing than `<<<` and `>>>`.
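To show what this alternative would ask of users, here is a minimal Rust sketch of sticky shifts rebuilt from zero-flooding shifts, an all-ones mask, and a branch. The function names are hypothetical, the mask is assumed to be `0xFF` for a byte, and the shift amount is assumed to satisfy `a < 8`.

```rust
/// Sticky right shift built from a zero-flooding shift, a mask, and a branch.
fn shr_sticky_masked(x: u8, a: u32) -> u8 {
    let shifted = x >> a;
    if x & 0x80 != 0 {
        // MSB was set: fill the `a` vacated high bits with 1s.
        shifted | !(0xFFu8 >> a)
    } else {
        shifted
    }
}

/// Sticky left shift built the same way, keyed on the LSB instead.
fn shl_sticky_masked(x: u8, a: u32) -> u8 {
    let shifted = x << a;
    if x & 0x01 != 0 {
        // LSB was set: fill the `a` vacated low bits with 1s.
        shifted | !(0xFFu8 << a)
    } else {
        shifted
    }
}
```

Each function needs a test of the edge bit, a mask computation, and a conditional OR, where the dedicated operators would need a single token.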
Prior art
- The C Standard
- Every other C inspired programming language
Unresolved questions
Does defining shifts in terms of "Flood-zero's" and "Sticky" make users expect a "Flood-one's" operator? Does the lack of one make programming less convenient?
Future possibilities
Shifts have always been in a strange situation, not quite a bitwise operation, not quite an arithmetic one, and weighed down by historic waffling in computing hardware. We are in a position to clarify the purpose of these fundamental tools, decreasing the number of dumb bugs people deal with every day, and increasing their Zen of Zig, one bit at a time.