buffer: add indexOf() method #561

trevnorris · 2015-01-22T23:47:05Z

Add Buffer#indexOf(). Support strings, numbers and other Buffers. This
is more support than String#indexOf() gives, but the increased
versatility should be helpful.

Special thanks to Sam Rijs for first proposing this
change.

R=@bnoordhuis

This is a re-hash of #160. Done this way so future support can include regexps and arrays.

trevnorris · 2015-01-22T23:52:16Z

Ref to original PR that brought this about: #160

rvagg · 2015-01-23T00:15:32Z

lib/buffer.js

+Buffer.prototype.indexOf = function indexOf(val, byteOffset) {
+  if (byteOffset > 0xffffffff)
+    byteOffset = 0xffffffff;
+  if (byteOffset < 0)


maybe else if, does that have any optimisation benefit here?

Good point. Will do.
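
For reference, the suggested else if form can be sketched as a standalone helper. This is only an illustration: clampOffset is a hypothetical name, and the lower bound of 0 is an assumption based on the later remark in this thread that a negative byteOffset currently behaves like 0.

```javascript
// Hypothetical helper (not the PR's code): clamp a byte offset into
// the range [0, 0xffffffff]. The two conditions are mutually exclusive,
// so the second test is only evaluated when the first one fails.
function clampOffset(byteOffset) {
  if (byteOffset > 0xffffffff)
    byteOffset = 0xffffffff;
  else if (byteOffset < 0)
    byteOffset = 0; // assumption: negative offsets are treated as 0
  return byteOffset;
}

console.log(clampOffset(-5));          // 0
console.log(clampOffset(42));          // 42
console.log(clampOffset(0x1ffffffff)); // 4294967295
```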

chrisdickinson · 2015-01-23T00:32:27Z

src/node_buffer.cc

+    return args.GetReturnValue().Set(-1);
+
+  char data[2];
+  data[0] = val;


We silently truncate values >255?

Am I supposed to throw?

We already silently truncate values >255 when initializing a buffer:

new Buffer([256]) // <Buffer 00>

Mostly curious if we'd want to treat >0xff values as "search for byte sequence," though I suppose that could start to get into endianness issues.

(alternatively, yes, I would consider throwing for the sake of proper input validation.)
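
To make the truncation concrete, here is a small sketch of what storing an out-of-range number in a byte-sized slot does. It uses Uint8Array so it does not depend on any particular Buffer constructor, and toByte is a hypothetical name for the masking step, not something from the PR.

```javascript
// Storing a number in a byte-sized slot keeps only the low 8 bits
// (the value modulo 256), with no error raised.
const stored = Uint8Array.of(256, 257, 0x1ff, -1);
console.log(Array.from(stored)); // [ 0, 1, 255, 255 ]

// The same masking written out by hand (hypothetical helper).
function toByte(n) {
  return n & 0xff;
}

console.log(toByte(256)); // 0
console.log(toByte(-1));  // 255
```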

We're not going to throw. I was also considering the search for byte sequence. e.g. b.indexOf(0xbada55), and think it is a good solution. Though users will need to know that byte sequences over 0x20000000000000 lose precision. Also, I'm not worried about endianness. The input data is endianness-independent, and it's up to the user to know what the data looks like that they're searching for.

@bnoordhuis What would be the fastest way to convert a number to a byte sequence that can be searched for?

You mean like this?

uint32_t needle = args[1]->Uint32Value();
void* ptr = memmem(haystack, haystack_len, &needle, sizeof(needle));
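
One thing worth noting about the memmem() snippet: passing &needle with sizeof(needle) always compares four bytes, in whatever byte order the host uses. The sketch below illustrates the consequence in plain JavaScript. numberToBytesBE and indexOfBytes are hypothetical helpers written for this illustration, and the choice of a minimal big-endian encoding is an assumption, not something the thread settled on.

```javascript
// Hypothetical helper: encode a non-negative integer as its minimal
// big-endian byte sequence, e.g. 0xbada55 -> ba da 55.
function numberToBytesBE(n) {
  const out = [];
  do {
    out.unshift(n % 256);
    n = Math.floor(n / 256);
  } while (n > 0);
  return Uint8Array.from(out);
}

// Hypothetical naive byte-sequence search: first match index, or -1.
function indexOfBytes(haystack, needle, start = 0) {
  for (let i = start; i + needle.length <= haystack.length; i++) {
    let j = 0;
    while (j < needle.length && haystack[i + j] === needle[j]) j++;
    if (j === needle.length) return i;
  }
  return -1;
}

const haystack = Uint8Array.of(0x00, 0xba, 0xda, 0x55, 0x00);
const be = numberToBytesBE(0xbada55);               // ba da 55
const le32 = Uint8Array.of(0x55, 0xda, 0xba, 0x00); // 0xbada55 as a little-endian uint32

console.log(indexOfBytes(haystack, be));   // 1
console.log(indexOfBytes(haystack, le32)); // -1
```

The same number matches or misses depending on how it is laid out in memory, so an implementation along these lines would need to pick and document a byte order and a needle width.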

mscdex · 2015-01-23T00:44:07Z

src/node_buffer.cc

+void IndexOfNumber(const FunctionCallbackInfo<Value>& args) {
+  ASSERT(args[0]->IsObject());
+  ASSERT(args[1]->IsNumber());
+


Missing ASSERT(args[2]->IsNumber()); ?

Hm. Guess I could put it in. The JS simply does coercion for sanity, but the Uint32Value() will still generally do the correct thing. Guess I'll throw it in.

mscdex · 2015-01-23T00:50:21Z

How would regexp functionality be implemented/supported? The built-in C regex library, v8's regex library, pcre, or?

trevnorris · 2015-01-23T00:51:43Z

@mscdex I'm working on that now. Can hack around it by having a pre-allocated external array class and reassign it on the fly, then convert the regex into 1 byte characters (for utf8 safety). So V8 would still do the heavy lifting but all the resources would still live outside the heap.
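
As a rough user-land illustration of the "1 byte characters" idea only (this is not the native approach described above, which keeps the data outside the heap), a buffer can be decoded with Node's 'latin1' encoding so that every byte becomes exactly one character and a match index equals a byte offset. regexIndexOf is a hypothetical helper name.

```javascript
// Hypothetical helper: run a RegExp over a buffer decoded one byte per
// character, so the match index is a byte offset into the buffer.
// Note that this copies the data into a heap string.
function regexIndexOf(buf, re) {
  const text = buf.toString('latin1');
  const m = re.exec(text);
  return m === null ? -1 : m.index;
}

const input = Buffer.from('héllo world', 'utf8'); // 'é' occupies two bytes
console.log('héllo world'.search(/world/)); // 6 (character index)
console.log(regexIndexOf(input, /world/));  // 7 (byte offset)
console.log(regexIndexOf(input, /xyz/));    // -1
```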

rvagg · 2015-01-23T04:17:26Z

see also #161 for lastIndexOf() (listing here for reference, not necessarily recommendation)

bnoordhuis · 2015-01-23T12:36:35Z

src/node_buffer.cc

+
+  if (str.length() == 0) {
+    return args.GetReturnValue().Set(
+        MIN(static_cast<uint32_t>(obj_length), offset));


Suggestion: let's switch to std::min() and std::max() in a follow-up PR.

feross · 2015-01-26T16:52:38Z

@trevnorris Great work!

While considering how to add support in buffer, I realized I don't like how Buffer.prototype.indexOf behaves with a negative byteOffset value.

If byteOffset < 0, we currently search the whole buffer (same as passing 0). This is how String.prototype.indexOf behaves.

I think it makes more sense to treat a byteOffset < 0 as the offset from the end of the buffer. This is how Array.prototype.indexOf and TypedArray.prototype.indexOf do it. In general, I'd prefer to copy the way TypedArray works unless we have a good reason to do otherwise. Here's the spec.
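
The difference between the two conventions can be seen with the built-ins themselves. This sketch uses a plain string and a Uint8Array holding the same five bytes; the sample data is made up for illustration.

```javascript
const str = 'abcde';
const arr = Uint8Array.from([97, 98, 99, 100, 101]); // bytes of 'abcde'

// String.prototype.indexOf clamps a negative position to 0.
console.log(str.indexOf('a', -1)); // 0

// TypedArray.prototype.indexOf counts a negative fromIndex from the end.
console.log(arr.indexOf(97, -1)); // -1 (search starts at index 4)
console.log(arr.indexOf(97, -5)); // 0  (search starts at index 0)
console.log(arr.indexOf(97, -9)); // 0  (start is clamped to 0)
```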

feross · 2015-01-26T17:04:51Z

test/parallel/test-buffer.js

+
+assert.equal(b.indexOf('a'), 0);
+assert.equal(b.indexOf('a', 1), -1);
+assert.equal(b.indexOf('a', -1), 0);


If we changed to Array.prototype.indexOf/TypedArray.prototype.indexOf semantics, this would change to:

assert.equal(b.indexOf('a', -1), -1);

And I'd add a few more tests for good measure:

assert.equal(b.indexOf('a', -4), -1);
assert.equal(b.indexOf('a', -5), 0);
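
A minimal sketch of how the proposed offset handling could be resolved before searching. resolveOffset and indexOfByte are hypothetical helpers, and the 5-byte sample is an assumption chosen so the proposed assertions hold; the contents of b in the actual test file are not shown in this thread.

```javascript
// Hypothetical helper: turn a possibly negative byteOffset into a
// start index, counting negative values from the end of the data.
function resolveOffset(byteOffset, length) {
  if (byteOffset < 0)
    return Math.max(length + byteOffset, 0);
  return Math.min(byteOffset, length);
}

// Hypothetical single-byte search that starts at the resolved index.
function indexOfByte(bytes, value, byteOffset = 0) {
  for (let i = resolveOffset(byteOffset, bytes.length); i < bytes.length; i++) {
    if (bytes[i] === value) return i;
  }
  return -1;
}

const sample = Uint8Array.from([97, 98, 99, 100, 101]); // assumed 'abcde'
console.log(indexOfByte(sample, 97, -1)); // -1
console.log(indexOfByte(sample, 97, -4)); // -1
console.log(indexOfByte(sample, 97, -5)); // 0
```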

trevnorris added the buffer and enhancement labels Jan 22, 2015

trevnorris force-pushed the buf-indexof-enhancement branch from 52ebeaf to ce9e4b2 on January 22, 2015 23:51

rvagg reviewed Jan 23, 2015

trevnorris force-pushed the buf-indexof-enhancement branch from ce9e4b2 to e950613 on January 23, 2015 00:30

chrisdickinson reviewed Jan 23, 2015

trevnorris force-pushed the buf-indexof-enhancement branch from e950613 to f28b793 on January 23, 2015 00:44

mscdex reviewed Jan 23, 2015

trevnorris force-pushed the buf-indexof-enhancement branch from f28b793 to 0d50c86 on January 23, 2015 00:49

rvagg added the semver-minor label Jan 23, 2015

bnoordhuis reviewed Jan 23, 2015

feross reviewed Jan 26, 2015
