dgram: improve "address" parameter behavior in Socket.prototype.send #10473

Closed
wants to merge 2 commits into from

Conversation

boneskull
Contributor

@boneskull boneskull commented Dec 27, 2016

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

dgram

Description of change

When using Socket.prototype.send, I was confused about when address was required, since the docs are ambiguous at best, and conflicting at worst: they claim address will default to 127.0.0.1, but this is only in the case where callback is not supplied.

My confusion manifested in an exception from the dns module, which was unhelpful:

dns.js:112
    throw new TypeError('Invalid arguments: ' +
    ^

TypeError: Invalid arguments: hostname must be a string or falsey
    at Object.lookup (dns.js:112:11)
    at lookup (dgram.js:27:14)
    at UDP.lookup4 [as lookup] (dgram.js:32:10)
    at Socket.send (dgram.js:362:16)

dns.lookup expects a valid hostname argument, yet this parameter name ("hostname") is not used in Socket.prototype.send. I thought it better to fail with helpful error messaging before getting as far as dns.lookup, which prompted this PR.

The signature of Socket.prototype.send is a bit awkward, and perhaps the best fix here is to change it. I see this PR as being an improvement nonetheless. A description of the changes follows.

UPDATE Jan 8 17

Upon suggestion by @mscdex, I've modified the changes to make address optional in all cases.

Summary of changes:

  • In Socket.prototype.send, address is now always optional; it is no longer required to use callback. The signature has changed from this:
    socket.send(msg, [offset, length,] port, address [, callback])
    
    to this:
    socket.send(msg, [offset, length,] port [, address] [, callback])
    
  • Faster failure and more helpful error messaging in case when address argument is present but invalid; exceptions no longer fall through to dns module.
  • Updates docs, adds tests.
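
The address/callback shuffle that makes the new signature possible can be sketched roughly as follows. This is a minimal illustration, not the code in lib/dgram.js: `resolveAddressAndCallback` is a hypothetical helper, and the `'127.0.0.1'` default assumes a `udp4` socket.

```js
// Hypothetical helper: decide which trailing arguments are `address` and
// `callback` once `msg` and `port` have already been consumed.
function resolveAddressAndCallback(address, callback) {
  if (typeof address === 'function') {
    // send(msg, port, callback): the callback arrived in the address slot.
    callback = address;
    address = undefined;
  }
  // Assumption: a udp4 socket, so a missing or falsy address means
  // IPv4 localhost.
  return { address: address || '127.0.0.1', callback };
}

const done = () => {};
console.log(resolveAddressAndCallback(done));
console.log(resolveAddressAndCallback('example.org', done));
console.log(resolveAddressAndCallback());
```

With a helper like this, `send(msg, port, callback)` and `send(msg, port, '', callback)` end up with the same address and callback.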

@nodejs-github-bot nodejs-github-bot added the dgram Issues and PRs related to the dgram subsystem / UDP. label Dec 27, 2016
@mscdex
Contributor

mscdex commented Dec 27, 2016

I think it would probably be better to just support the 'callback without address' case instead of adding a note/warning in the documentation.

@boneskull
Contributor Author

@mscdex I agree. I don't see any technical reason why we couldn't. I didn't want to be presumptuous.

I'll update this PR accordingly.

@boneskull boneskull changed the title dgram: improve Socket docs and invalid arg behavior dgram: improve "address" parameter behavior in Socket.prototype.send Jan 9, 2017
@boneskull
Contributor Author

@mscdex I've updated this PR as suggested, rebased, etc. My local tests now pass 100%.

lib/dgram.js Outdated
if (typeof address === 'function') {
  callback = address;
  address = undefined;
} else if (typeof address !== 'undefined' && typeof address !== 'string') {
Contributor

We can use address !== undefined instead of typeof address !== 'undefined'.

Contributor

Shouldn't it be != undefined?

Contributor

I don't think it matters in this case since it would just fail the other part of the conditional if it was null.

Contributor

To be more clear, shouldn't it be if(address != undefined && typeof address != 'string') --- that is, any of the 3 values undefined, null, or String would be acceptable.

Contributor

@mscdex mscdex Jan 9, 2017

I'm not sure what you mean by 'acceptable,' but what I was saying is that

address !== undefined && typeof address !== 'string'

and

address != undefined && typeof address !== 'string'

will both evaluate to true when address === null. So the change in strictness of the equality of the first comparison does not matter.
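
The two conditions can be compared directly by evaluating them over a few inputs. The sketch below uses two hypothetical helper names. One caveat it surfaces: under loose equality `null != undefined` is `false`, so for `null` the loose form evaluates to `false` while the strict form evaluates to `true`. The two conditions agree for every other input shown.

```js
// Strict form: only `undefined` skips the type check.
const strictCheck = (address) =>
  address !== undefined && typeof address !== 'string';

// Loose form: both `undefined` and `null` skip the type check.
const looseCheck = (address) =>
  address != undefined && typeof address !== 'string';

// Print each candidate value next to the result of both conditions.
for (const value of [undefined, null, '', 'localhost', 0, false, 42]) {
  console.log(value, strictCheck(value), looseCheck(value));
}
```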

Contributor Author

I'm going to change this to:

} else if (address !== undefined && typeof address !== 'string') {

lib/dgram.js Outdated
  callback = address;
  address = undefined;
} else if (typeof address !== 'undefined' && typeof address !== 'string') {
  throw new TypeError('"Address" argument must be a string or undefined');
Contributor

@mscdex mscdex Jan 9, 2017

Even though it goes against the common error message formatting used in the rest of node core, I think changing this to 'Address must be a string or undefined' might be better since it is more consistent with the error message format when validating the port up above.

Contributor Author

I'll take a closer look at the wording and punctuation. IMO we should be shooting for consistency with core error messaging; not inconsistent error messaging which may be used elsewhere in this module.

Contributor

@sam-github sam-github left a comment

I don't see test cases for combinations of null, undefined, and empty string being passed as addresses, with and without offset+length, and with and without a callback. This is a lot of combinatorics, btw, I've no strong opinions, but perhaps it may be best to write the test as one test file for dgram.send() arg processing?

lib/dgram.js Outdated
if (typeof address === 'function') {
  callback = address;
  address = undefined;
} else if (typeof address !== 'undefined' && typeof address !== 'string') {
Contributor

Shouldn't it be != undefined?

doc/api/dgram.md Outdated
DNS will be used to resolve the address of the host. If the `address` is not
specified or is an empty string, `'127.0.0.1'` or `'::1'` will be used instead.
DNS will be used to resolve the address of the host. Otherwise, if `address` is
not specified or is an empty string, `'127.0.0.1'` (`udp4`) or `'::1'` (`udp6`)
Contributor

What does "not specified" mean? I think in js we "provide" function arguments, not "specify" them. That aside, I don't understand from the text the behaviour of:

  • send("msg", 7) // "not specified", I think
  • send("msg", 7, ()=>{}) // also not specified, I think
  • send("msg", 7, undefined) // specified as undefined, and handled by code below, I think
  • send("msg", 7, undefined, ()=>{}) // ditto
  • send("msg", 7, "") // same as if address is undefined
  • send("msg", 7, null) // not same as if address is undefined? This is weird and inconsistent, if true

I would state that if address is not provided, or is null, undefined, or an empty string, it will use the protocol-specific localhost address - and implement and test this condition if it is not already.

Contributor Author

@sam-github I think @mscdex addressed your comments on line 352. Not sure why I'm seeing it twice here.

Contributor Author

I can change the wording to "provide".

For an address that is not provided, we defer to the accepted types in dns.lookup(). Per the code there, address is allowed to be a falsy value, or a string.

So, I may have been a bit too stringent in my checks. The allowed values of address are then:

  1. a nonempty string (which is considered "provided"; the rest are not)
  2. false
  3. undefined
  4. null
  5. 0
  6. ''

If it's any other truthy value (except Function; see below), it's invalid, and send() should throw a TypeError.

Furthermore, if address is a Function, then it's the callback, and the address will be reset to a falsy value (I think it was undefined, but it doesn't really matter).

"Falsy" and "truthy" seem to be unique to JS and there's precedent in the Node.js docs for their usage. Explicitly enumerating all "falsy" values is cumbersome, anyway.

I propose the following changes, then:

  1. Avoid ambiguity by stating "If address is a non-empty string, use it to resolve the host. If address is falsy, 127.0.0.1 (etc) will be used instead."
  2. Update my type check(s) to allow all falsy values to pass through to dns.lookup().
  3. Revisit tests to ensure the "falsy" values don't throw exceptions, and truthy non-strings do throw exceptions.

Sound fair?
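
The rule proposed here can be sketched as a single predicate. `isAcceptableAddress` is a hypothetical name used only for illustration; the check in the PR itself may be written differently. A function in the address slot is left out of the lists because it is treated as the callback before any such check runs.

```js
// Hypothetical predicate: any falsy value or any string passes through.
// (An empty string is falsy, so the first test already covers it.)
function isAcceptableAddress(address) {
  return !address || typeof address === 'string';
}

// Values listed above as allowed, and a few truthy non-strings.
const accepted = ['localhost', false, undefined, null, 0, ''];
const rejected = [true, 42, {}, []];

console.log(accepted.every(isAcceptableAddress));
console.log(rejected.some(isAcceptableAddress));
```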

Contributor

@boneskull I don't see the usefulness in permitting values like false and 0 and treating them as undefined, if that is what you are suggesting?

Contributor

I think he is saying that is current behaviour, and not doing it would be semver-major. I would say if we are going to change the API, I would expect to allow only null and undefined to mean "no value". Btw, from above, it looks like 42 would also be considered a host name? I.e., numbers are allowed? Do they get converted to strings, like '42' and passed to the resolver?

Contributor Author

@boneskull I don't see the usefulness in permitting values like false and 0 and treating them as undefined, if that is what you are suggesting?

This is the current behavior. Values are passed through to dns.lookup(), which allows such things. The documentation for dgram is therefore inaccurate. And ambiguous, which prompted the change.

I think he is saying that is current behaviour, and not doing it would be semver-major. I would say if we are going to change the API, I would expect to allow only null and undefined to mean "no value"

This is correct. I'm not proposing making a semver-major change here. Whether or not allowing false is "useful", I don't know, but I'd argue that's not a big enough deal to warrant a semver-major change.

Btw, from above, it looks like 42 would also be considered a host name? I.e., numbers are allowed? Do they get converted to strings, like '42' and passed to the resolver?

No. The only valid number is 0, which is falsy. Then:

If it's any other truthy value (except Function; see below), it's invalid, and send() should throw a TypeError.

@sam-github sam-github added the semver-minor PRs that contain new features and should be released in the next minor version. label Jan 11, 2017
@sam-github
Contributor

Btw, labelled as minor because of

Upon suggestion by @mscdex, I've modified the changes to make address optional in all cases.

but perhaps it should be major because of

more helpful error messaging in case when address argument is present but invalid; exceptions no longer fall through to dns module.

And of course if it starts rejecting previously valid input, it must be major, but that's still under discussion, I think.

@boneskull
Contributor Author

boneskull commented Jan 11, 2017

more helpful error messaging in case when address argument is present but invalid; exceptions no longer fall through to dns module.
And of course if it starts rejecting previously valid input, it must be major, but that's still under discussion, I think.

It doesn't. Either would throw an error; the errors thrown in this PR are more helpful/obvious.

UPDATE By "fall through" I mean the dns module currently throws some errors here. My changes throw errors in the same situations (in a more helpful manner) before getting as far as the dns module.

@sam-github
Contributor

@boneskull Changes in error message text are considered semver-major, because they can break apps that match against err.message. It's an API breakage, but not controversial, better error messages are a good thing. So, sounds like this is semver-major?
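
As an illustration of the kind of caller this policy protects, here is a sketch of application code that matches on the message text. The function name is hypothetical; the two message strings are ones quoted in this thread.

```js
// Hypothetical application code that recognises the error by its text.
function isBadHostnameError(err) {
  return err instanceof TypeError &&
    /hostname must be a string or falsey/.test(err.message);
}

// Message thrown from the dns module before this PR.
const oldError = new TypeError(
  'Invalid arguments: hostname must be a string or falsey');
// Message wording discussed for dgram in this PR.
const newError = new TypeError(
  'Invalid arguments: address must be a nonempty string or falsy');

console.log(isBadHostnameError(oldError));
console.log(isBadHostnameError(newError));
```

The same bad input still produces a TypeError, but a check keyed to the old wording no longer recognises it.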

@sam-github
Contributor

@boneskull FYI, I would have done this as one commit introducing tests, because the test coverage wasn't there over the invalid addresses, asserting the current behaviour, then a follow-on commit that improves the error messages. That would as a side-effect also make whether it is an API change or not crystal clear - because changes to test behaviour are API changes. It's probably too much work to rebase it like that, but a 2-phase approach for code that has no test coverage ATM is easier to review.

@sam-github sam-github added semver-major PRs that contain breaking changes and should be released in the next major version. and removed semver-minor PRs that contain new features and should be released in the next minor version. labels Jan 11, 2017
@boneskull
Contributor Author

@boneskull Changes in error message text are considered semver-major, because they can break apps that match against err.message. It's an API breakage, but not controversial, better error messages are a good thing. So, sounds like this is semver-major?

That's an unfortunate--but likely necessary--policy.

I want to make it clear that I'm not interested in expanding this PR to further restrict allowed types or values of the address parameter (e.g., disallowing false). If we go down that road, the same logic suggests we should also change dns.lookup(). Basically, this would be much more effort than I had intended to give this PR.

@boneskull
Contributor Author

It's probably too much work to rebase it like that, but a 2-phase approach for code that has no test coverage ATM is easier to review.

This is actually pretty straightforward to do.

@sam-github
Contributor

I understand. This PR doesn't change the behaviour, other than making the error messages much more helpful.

The tests and docs do reveal that the current behaviour is not ideal, and now that you have done this work, it's easier to discuss how we wish the API would work, and change it in the future.

@boneskull
Contributor Author

boneskull commented Jan 11, 2017

I understand. This PR doesn't change the behaviour, other than making the error messages much more helpful.

That's not entirely true either. This PR also allows callback without address. It changes the API of Socket.prototype.send() in a non-breaking manner--i.e., it adds w/o taking away.

// previous to this PR, the following would throw
socket.send(buf, port, () => {
  console.log('done');
});

@boneskull
Contributor Author

The tests and docs do reveal that the current behaviour is not ideal, and now that you have done this work, it's easier to discuss how we wish the API would work, and change it in the future.

Yes. This is silly, but would work:

socket.send(buf, port, '', () => {
  console.log('address was 127.0.0.1');
});

What's sillier is that currently you'd have to write something like the above code if you wanted a callback and didn't care to specify the address. 😝

@boneskull
Contributor Author

boneskull commented Jan 11, 2017

To recap, my plan is:

  • Change some verbiage in docs around "specify"
  • Ensure new error messaging is consistent with core as a whole
  • Split PR into two commits:
    • tests against currently uncovered & undocumented behavior
    • implementation of and more tests around new behavior

@boneskull
Contributor Author

@mscdex @sam-github Please take another gander.

@boneskull
Contributor Author

boneskull commented Jan 13, 2017

I also realized this squashes a particularly gross bug--if send() was called with an invalid address type, but the socket was not yet fully bound, the send would be enqueued. Once binding was complete, the send would run, and the exception from dns.lookup() would be thrown asynchronously.

You can see this problem illustrated in the tests of my first commit. I was not able to assert the exceptions were thrown unless I did this (or caught them at the process level). I removed the up-front binding from this test file in the second commit, since my changes fix the bug.
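
The difference between the two behaviours can be sketched without a real socket. `QueueingSender` below is a hypothetical stand-in for a socket that queues sends until it is bound; it is not how lib/dgram.js is structured, and `flush()` is called by hand here where the real socket would run the queue on its own once binding completes.

```js
function validateAddress(address) {
  if (address && typeof address !== 'string') {
    throw new TypeError(
      'Invalid arguments: address must be a nonempty string or falsy');
  }
}

class QueueingSender {
  constructor() {
    this.queue = [];
  }

  // Old shape: validation happens only when the queued send finally runs.
  sendValidatingLate(address) {
    this.queue.push(() => validateAddress(address));
  }

  // New shape: validate first, so a bad address throws at the call site.
  sendValidatingEarly(address) {
    validateAddress(address);
    this.queue.push(() => {});
  }

  // Stands in for "binding finished": run everything that was queued.
  flush() {
    while (this.queue.length > 0) this.queue.shift()();
  }
}

const sender = new QueueingSender();
sender.sendValidatingLate(true); // returns normally; the error is deferred
try {
  sender.flush();
} catch (err) {
  console.log('thrown later, during flush:', err.message);
}
try {
  sender.sendValidatingEarly(true);
} catch (err) {
  console.log('thrown at the call site:', err.message);
}
```

In the late-validating shape, a `try`/`catch` around the original call never sees the error, which is why the first version of the tests had to bind up front.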

Contributor

@sam-github sam-github left a comment

Two small nits, but otherwise LGTM

doc/api/dgram.md Outdated
specified or is an empty string, `'127.0.0.1'` or `'::1'` will be used instead.
DNS will be used to resolve the address of the host. If `address` is not
provided or otherwise falsy, `'127.0.0.1'` (`udp4`) or `'::1'` (`udp6`) will
be used by default.
Contributor

.... depending on whether socket is 'udp4' or 'udp6'.

^--- I think it's worth stating explicitly how the "or" is determined.

Contributor Author

@sam-github this is done

const dgram = require('dgram');
const client = dgram.createSocket('udp4');

const buf = Buffer.allocUnsafe(256);
Contributor

The test doesn't rely on sending unspecified data, does it? Can you just use const buf = Buffer.alloc(256, 'x'); as in the other test?

Contributor Author

Sure; was just copypasta from yet another test...

Contributor Author

@sam-github this is done

@boneskull
Contributor Author

conflict resolved

@Trott
Member

Trott commented Feb 11, 2017

I'm trying to run CI against this PR and it is not going well. I'm pretty sure the problem is with CI and not this PR, though. Maybe someone else seeing this in another hour or two can try again?

@boneskull
Contributor Author

@Trott Anything I can do to help?

@Trott
Member

Trott commented Feb 14, 2017

@boneskull Yes! Can you remove the merge commit? Rumor has it that is the source of the issue.

@boneskull
Contributor Author

@Trott Hm, that's what I get for using GitHub's new conflict resolver tool, I guess. 😄

- Add coverage around valid, undocumented types for `address` parameter.
- Add coverage around known invalid, but uncovered, types for `address`
  parameter.
@boneskull
Contributor Author

@Trott rebased

@Trott
Member

Trott commented Feb 14, 2017

@gibfahn
Member

gibfahn commented Feb 14, 2017

This last round of changes seem like nitpicks (for lack of a better word), in my opinion. You may feel differently, but the relative impact on the PR as a whole is minimal.

@boneskull This is fair, but not being strict about things like #10473 (comment) has caused breakage in the past.

I think you're right though, what we need is to have all this stuff covered by linter rules, so that everything can all be fixed at once.


Speaking of which, regarding this (by @cjihrig):

Instead of just checking for the TypeError constructor, we've been moving toward a check like this:

/^TypeError: Invalid arguments: address must be a nonempty string or falsy$/

You can put it in a variable and reuse it across these assertions.

This seems like something we should lint for in test/, i.e. we should require the second parameter in assert.throws() to be a regex starting with ^ and ending with $. (@not-an-aardvark @Trott @silverwind @targos)

@Trott
Member

Trott commented Feb 14, 2017

i.e. we should require the second parameter in assert.throws() to be a regex starting with ^ and ending with $.

We should definitely allow validation functions as an alternative too. It's just constructors that can be problematic.

There's more to talk about regarding this (specifically if we might be moving towards a situation where error message changes are not breaking changes anymore thanks to the error identifiers that @jasnell has introduced, and if so then whether that means fully-matching regexps will become anti-patterns as the second argument). But this is not the place for that discussion. (Personally, I think we're probably a long way off from being able to treat message changes as non-breaking changes, but I'm often wrong.)

@not-an-aardvark
Contributor

not-an-aardvark commented Feb 14, 2017

One potential concern about linting for error constructors is that the rule might report an error for this code:

var ERROR_PATTERN = /^Error: Something bad happened$/;

// later...

// as far as the rule is concerned, ERROR_PATTERN could be a constructor
assert.throws(foo, ERROR_PATTERN);
assert.throws(bar, ERROR_PATTERN);

It would be relatively simple to check for constructors that are known beforehand, e.g. TypeError.

@Trott
Member

Trott commented Feb 14, 2017

/cc @nodejs/ctc

This semver-major has sufficient approvals to land, but it's been a sufficiently long time coming that it might be off the radar for some folks. PTAL if you get a chance.


const client = dgram.createSocket('udp4');

const messageSent = common.mustCall(function messageSent(err, bytes) {
Contributor

assert.ifError(err)

client.send([buf1, buf2], port, messageSent);
});

client.on('message', common.mustCall(function onMessage(buf, info) {
Contributor

Unused variable info.

const offset = 20;
const len = buf.length - offset;

const onMessage = common.mustCall(function messageSent(err, bytes) {
Contributor

err has to be asserted first.

Contributor Author

@boneskull boneskull Feb 14, 2017

FWIW, these two files are copypasta from test-dgram-send-callback-buffer-length (etc) (some of) which have the same issues.

@Trott
Member

Trott commented Feb 14, 2017

OMG the test/arm results display is fixed! :-D

- Do not require presence of `address` parameter to use `callback`
  parameter; `address` is *always* optional
- Improve exception messaging if `address` is invalid type
- If `address` is an invalid type, guarantee a synchronously thrown
  exception
- Update documentation to reflect signature changes
@boneskull
Contributor Author

@thefourtheye Changes applied

@Trott
Member

Trott commented Feb 14, 2017

@boneskull
Contributor Author

ping anyone

client.send(buf, port, true);
}, expectedError);

client.unref();
Contributor

I just noticed this. Is there any chance that the socket would be unref'ed and the process would exit before all six messages are received? I am able to make the test fail artificially by adding a short timeout around the last successful send. It doesn't seem to be a problem on the CI, but I wouldn't want it to be a source of future flakiness either.
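
One way to rule out that race, sketched here under the assumption that the test knows how many messages to expect, is to close the socket explicitly once the last message has arrived rather than relying on `unref()`. `closeAfter` is a hypothetical helper, and `fakeSocket` stands in for a real dgram socket so the sketch runs without network access.

```js
// Hypothetical helper: returns a 'message' handler that calls socket.close()
// once `expected` messages have been seen.
function closeAfter(socket, expected) {
  let seen = 0;
  return function onMessage() {
    seen += 1;
    if (seen === expected) socket.close();
  };
}

// Stand-in for a dgram socket; only close() is modelled.
const fakeSocket = { closed: false, close() { this.closed = true; } };
const onMessage = closeAfter(fakeSocket, 6);
for (let i = 0; i < 6; i++) onMessage();
console.log(fakeSocket.closed);
```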

Trott pushed a commit to Trott/io.js that referenced this pull request Feb 17, 2017
- Do not require presence of `address` parameter to use `callback`
  parameter; `address` is *always* optional
- Improve exception messaging if `address` is invalid type
- If `address` is an invalid type, guarantee a synchronously thrown
  exception
- Update documentation to reflect signature changes
- Add coverage around valid, undocumented types for `address` parameter.
- Add coverage around known invalid, but uncovered, types for `address`
  parameter.

PR-URL: nodejs#10473
Reviewed-By: Sam Roberts <vieuxtech@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@Trott
Member

Trott commented Feb 17, 2017

Landed in 32679c7.
Thanks for the contribution! 🎉

Labels
dgram Issues and PRs related to the dgram subsystem / UDP. semver-major PRs that contain breaking changes and should be released in the next major version.

10 participants