Skip to content

Conversation

@jmdavis
Copy link
Member

@jmdavis jmdavis commented Jun 6, 2012

These functions make it easier to manipulate buffers of ubytes which are
meant to be parsed or written to as sequences of larger integral types.
They assume that the buffers hold their data in big endian.

I'm constantly needing functions like these when manipulating buffers of bytes, so I thought that they would be valuable additions to Phobos.

These functions make it easier to manipulate buffers of ubytes which are
meant to be parsed or written to as sequences of larger integral types.
They assume that the buffers hold their data in big endian.
@DmitryOlshansky
Copy link
Member

While it's cool that big endian happens to be byte order of TCP netstack, I would like to see little endian byte order for the same interface too (it should be trivial as it's the same as that of machine for x86).

@jmdavis
Copy link
Member Author

jmdavis commented Jun 6, 2012

I suppose that it could be added as another template argument, but I have never seen anything which specifically used little endian for anything which needed to operate on buffers of bytes like this. All such formats end up operating in big endian order, and I would have considered it bad practice for them to do anything else.

@DmitryOlshansky
Copy link
Member

Well going solely by I have never seen is rarely a good idea.
For one thing I never seen binray files that whose fileds are not little endian,
because back in the days of C I loaded whole file in RAM then casted buffer to plain struct pointer.
Recall this oldish C idiom:

struct MyFile
{
int magic, ver;
int length, etc;
SOMETHING data[0];
}
where SOMETHING is some POD struct of ints or floats.

@jmdavis
Copy link
Member Author

jmdavis commented Jun 7, 2012

Every binary format that I have ever dealt with has used big endian exclusively (or avoided entirely via other encoding schemes - e.g. H.264 uses exp-golomb encoding to do that), but I've also never dealt with a binary format where you wrote arbitrary types to it like you seem to be talking about. In every case, it's been sequences of bits or bytes which are read in as integers or chars of varying sizes.

In any case, I've now added the ability to indicate the endianness of the bytes in the range. It defaults to big endian.

@alexrp
Copy link
Contributor

alexrp commented Jun 7, 2012

For instance, the Common Intermediate Language format (which is what the Common Language Runtime uses to store program IL in) is almost exclusively little endian, so I think that support is important.

@jmdavis
Copy link
Member Author

jmdavis commented Jun 7, 2012

I tend to think that little endian should be treated like leprosy, but as D is a systems language, and the most prevalent CPU is unfortunately little endian, we don't have much choice in the general case. I just didn't see much point with these functions. But I've added it.

@alexrp
Copy link
Contributor

alexrp commented Jun 7, 2012

Reviewed, and the API seems good and workable-with to me.

@dnadlinger
Copy link
Contributor

@jmdavis: I can confirm that little-endian formats are not as uncommon as you seem to imply – the first examples which come to my mind are Thrift (e.g. the Compact protocol) and Protocol Buffers.

The API seems to be okay – the use of pointers is justifiable here, I suppose. You might want to add at least an assertion checking for the pointer arguments to be non-null, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to add a void initializer here for performance reasons – peek and get are likely to be called in performance-sensitive I/O code. In theory, the optimizer could do this anyway, but I'm not sure if it is actually likely to happen with the ref foreach below.

@jmdavis
Copy link
Member Author

jmdavis commented Jun 7, 2012

Pointers are needed as long as they're overloaded functions, since there's no way to indicate that you want the ref overload at the call point. And it seems silly to me to create a new function name just to avoid using pointers. It's perfectly @safe too, since pointers are just as @safe as references are. It's stuff like pointer arithmetic which isn't.

@dnadlinger
Copy link
Contributor

Oh, darn it, with the void initializer in place, you'd have to add @trusted, but you can't because you don't know whether the range is @safe

@jmdavis
Copy link
Member Author

jmdavis commented Jun 7, 2012

Hmmm. It would be a poor range indeed which couldn't be trusted to have front or popFront called on it. I could add a template constraint to check for it and even overload the function for that case, but it seems like overkill to me. So, the choices would be

  1. Mark it as @trusted and assume that front and popFront are safe to call (which would be fairly shocking if it weren't true).
  2. Remove the = void.

#2 would be safe but less efficient, whereas #1 could be less safe but is probably just fine. So, I don't know. Another thing to consider is the fact that it's only an issue when the range isn't slliceable, and there's a good chance that the code which really cares about efficiency would be using arrays anyway.

@alexrp
Copy link
Contributor

alexrp commented Jun 7, 2012

Really, just go with (2). I'd be surprised if this actually matters in practice. There'd probably be a measurable difference, but not one that's worth the code explosion.

@jmdavis
Copy link
Member Author

jmdavis commented Jun 7, 2012

Okay. The last commit has now been rebased so that it's just the added check on the indices.

jmdavis added a commit that referenced this pull request Jun 8, 2012
Added peek, read, write, and append to std.bitmanip.
@jmdavis jmdavis merged commit ae5e758 into dlang:master Jun 8, 2012
@jmdavis
Copy link
Member Author

jmdavis commented Jun 8, 2012

Since multiple people seem to be fine with this, merging...

@dnadlinger
Copy link
Contributor

@blackwhale: Seems like we really need PApply for templates in Phobos… ;)

@DmitryOlshansky
Copy link
Member

@klickverbot std.meta to the rescue! Oh wait... where is it? :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants