Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util: compact inspect() for sparse arrays #5070

Closed
wants to merge 1 commit into from

Conversation

dcposch
Copy link
Contributor

@dcposch dcposch commented Feb 4, 2016

This prevents console.log(arr) from crashing node when given a small sparse array with large length

Fixes #4905

@Trott
Copy link
Member

Trott commented Feb 4, 2016

Thanks for the fix!

  • Would you be able to add a test for this?
  • Also could you revise the commit message to conform with the CONTRIBUTING.md guidelines? (First line should be 50 chars or less and be of the format util: fix console output for sparse array.)

@mscdex mscdex added the util Issues and PRs related to the built-in util module. label Feb 4, 2016
String(i), true));
} else {
output.push('');
var isSparse = keys.length * 10 < value.length;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be explained in the comments. How this calculation determines if the array is sparse or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thefourtheye added a comment. If less than 1/10th of the slots in an array are populated, util.inspect / console.log now renders it compactly

var arr = []
arr[100] = 1
console.log(arr)

Before:
[ , , , , , (lots of commas) , 1 ]

After:
[ 100:1 ]

@dcposch dcposch force-pushed the print-sparse-array branch 2 times, most recently from 6c3bd3a to d7b8af2 Compare February 4, 2016 06:02
@dcposch
Copy link
Contributor Author

dcposch commented Feb 4, 2016

@Trott @thefourtheye sounds good, fixed

@dcposch dcposch force-pushed the print-sparse-array branch 2 times, most recently from 65033e3 to 6821944 Compare February 4, 2016 07:00
@dcposch
Copy link
Contributor Author

dcposch commented Feb 4, 2016

@Trott added test cases, LMK if these look dece

var arr = [];
arr[2] = 1; // expect util.inspect() to return [ , , 1 ]
arr[1000] = 1; // expect [ 2: 1, 1000: 1, <sparse array, length 1001> ]')
arr['foo'] = 'bar'; // expect [ 2: 1, 1000: 1, foo: 'bar', <sparse array, length 1001> ]
arr = new Array(1000000); // expect [ <empty array, length 1000000> ]

This prevents `console.log(arr)` from crashing node when given a sparse
array with large length. Fixes nodejs#4905
@dcposch dcposch force-pushed the print-sparse-array branch from 6821944 to d257683 Compare February 4, 2016 07:06
@thefourtheye
Copy link
Contributor

I think the sparse arrays are used only to prove the point. What happens when you use a dense array?

@dcposch
Copy link
Contributor Author

dcposch commented Feb 4, 2016

@thefourtheye no, I think it's specifically about sparse arrays, where a small object can potentially lead to a huge console.log output. From #4905

this can lead to a DoS attack: for example, when an user-specified JSON {"1000000000":"a"} is merged with some pre-existing array and then printed on console

...by contrast there are lots of ways to make a large JSON produce a large console.log output, even without arrays. For example, an object with millions of keys. To avoid that, we'd need to make util.inspect respect a maximum length, which is a bigger change.

@dcposch
Copy link
Contributor Author

dcposch commented Feb 4, 2016

Say you can crash a server by sending a 10GB POST, I think the fix is just to set request length limits. If you can crash a server by sending a 1KB POST which turns into a 10GB string internally, that's a bigger problem.

@thefourtheye
Copy link
Contributor

Fair enough. But I am not sure if it is acceptable to special case it in the order of 10 like this. Lets hear from @nodejs/collaborators.

@mscdex
Copy link
Contributor

mscdex commented Feb 4, 2016

Perhaps have something configurable like with require('buffer').INSPECT_MAX_BYTES.

@thefourtheye
Copy link
Contributor

+1 to @mscdex 's idea. If users want to see more data, they can explicitly adjust the value and invoke inspect

@mcollina
Copy link
Member

mcollina commented Feb 4, 2016

👍 for @mscdex idea, but with a default, something like 16 * 1024? Anyway that output is already big, and hard to be read by a human.

@vkurchatkin
Copy link
Contributor

-1. We shouldn't treat sparse arrays differently

@mik01aj
Copy link

mik01aj commented Feb 4, 2016

@vkurchatkin how would you fix the crash issue then?

@rvagg
Copy link
Member

rvagg commented Feb 4, 2016

Not sure how I feel about this, surprising behaviour if you accidentally ran into it and I'm not sure it's intuitive although I guess the message helps.

@vkurchatkin given that sparse arrays can kill Node, and this is an even more surprising and destructive behaviour, I'm not sure there's anything else we can do than treat large arrays differently.

In terms of this actually fixing said issue, it only deals with sparse arrays, not large arrays. Consider:

> a=[]; for (i=0;i<100000000;i++)a[i]=1;
1
> a

... stacktrace spew ...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Abort trap: 6

That won't be solved here

@mik01aj
Copy link

mik01aj commented Feb 4, 2016

Maybe we should treat large (and not sparse) arrays differently. E.g.: if the array has more than 1000 elements, print a ... at the end and finish.

@vkurchatkin
Copy link
Contributor

Maybe we should treat large (and not sparse) arrays differently

exactly. printing very large array to console is not useful anyway

@dcposch
Copy link
Contributor Author

dcposch commented Feb 4, 2016

I also think adding an INSPECT_MAX_BYTES with a default of few kilobytes is a good idea.

What I was saying above is, that will not just apply to arrays -- it'll have to limit output of strings and objects as well. Doable but a bit involved, esp since lib/util.js is kind of hairy and has few comments.

To give an example of what that would entail, I think we'd have to add a maxBytes argument to a bunch of functions including this one

function formatProperty(ctx, value, recurseTimes, visibleKeys, key, array) {
  var name, str, desc;
  desc = Object.getOwnPropertyDescriptor(value, key) || { value: value[key] };
  if (desc.get) {
    if (desc.set) {
      str = ctx.stylize('[Getter/Setter]', 'special');
    } else {
      str = ctx.stylize('[Getter]', 'special');
    }
  } else {
    if (desc.set) {
      str = ctx.stylize('[Setter]', 'special');
    }
  }
  if (!hasOwnProperty(visibleKeys, key)) {
    if (typeof key === 'symbol') {
      name = '[' + ctx.stylize(key.toString(), 'symbol') + ']';
    } else {
      name = '[' + key + ']';
    }
  }
  if (!str) {
    if (ctx.seen.indexOf(desc.value) < 0) {
      if (recurseTimes === null) {
        str = formatValue(ctx, desc.value, null);
      } else {
        str = formatValue(ctx, desc.value, recurseTimes - 1);
      }
      if (str.indexOf('\n') > -1) {
        if (array) {
          str = str.replace(/\n/g, '\n  ');
        } else {
          str = str.replace(/(^|\n)/g, '\n   ');
        }
      }
    } else {
      str = ctx.stylize('[Circular]', 'special');
    }
  }
  if (name === undefined) {
    if (array && key.match(/^\d+$/)) {
      return str;
    }
    name = JSON.stringify('' + key);
    if (name.match(/^"([a-zA-Z_][a-zA-Z_0-9]*)"$/)) {
      name = name.substr(1, name.length - 2);
      name = ctx.stylize(name, 'name');
    } else {
      name = name.replace(/'/g, "\\'")
                 .replace(/\\"/g, '"')
                 .replace(/(^"|"$)/g, "'")
                 .replace(/\\\\/g, '\\');
      name = ctx.stylize(name, 'string');
    }
  }

  return name + ': ' + str;
}

@dcposch
Copy link
Contributor Author

dcposch commented Feb 4, 2016

This change is smaller and at least covers the small-HTTP-POST-crashing-node case.

@dcposch dcposch changed the title Compact output when printing sparse arrays util: compact inspect() for sparse arrays Feb 4, 2016
@cjihrig
Copy link
Contributor

cjihrig commented Feb 4, 2016

To give an example of what that would entail, I think we'd have to add a maxBytes argument to a bunch of functions including this one

That might not be a bad thing. The current algorithm for identifying sparse arrays seems pretty arbitrary. And, as pointed out, it doesn't do anything about large arrays.

@Trott
Copy link
Member

Trott commented Feb 4, 2016

This change is similar to the behavior of Chrome's console with sparse arrays. It may be useful to look and see what Blink/Chrome does to decide whether to do this special rendering or the usual expected rendering.

EDIT: Specifically, how does Chrome/Blink decide that an array requires special rendering?

@jasnell
Copy link
Member

jasnell commented Feb 4, 2016

curious if this would be a feature change or a fix? minor or patch? lts or no?

@cjihrig
Copy link
Contributor

cjihrig commented Feb 4, 2016

IMO it's a breaking change.

@dcposch
Copy link
Contributor Author

dcposch commented Feb 8, 2016

@Trott just did some experimenting. In Chrome:

var a = []
console.log(a) // prints []
a[4] = 'a'
console.log(a) // prints [4: "a"]
a[1] = 'b'
console.log(a) // prints [1: "b", 4: "a"]
a[0] = 'c'
console.log(a) // prints ["c", "b", 4: "a"]

So it looks like it prints a normal dense array for all indices starting from 0, then as soon as it gets to an index that is not set (which is not the same as an index whose corresponding value is undefined), then it switches to the sparse representation.

If you want I can change this PR slightly to match that behavior.

@Trott
Copy link
Member

Trott commented Feb 8, 2016

@dcposch I don't have an opinion on that one way or the other, but someone else might.

@domenic
Copy link
Contributor

domenic commented Feb 8, 2016

Personally I can't understand how this is minor---what new feature does it introduce? It's either major (breaking change to existing feature) or patch (bugfix to existing feature).

@dcposch
Copy link
Contributor Author

dcposch commented Feb 19, 2016

@Trott @rvagg is there anything you'd like me to change?

@rvagg
Copy link
Member

rvagg commented Feb 19, 2016

I'm finding it hard to conjure up much of an opinion on this one. @nodejs/collaborators anyone else want to champion this one through? I'd like to fix the crashes but am also concerned about how people might already be relying on util.inspect() already (perhaps mistakenly but that's how it goes).

@rvagg rvagg added the semver-major PRs that contain breaking changes and should be released in the next major version. label Feb 19, 2016
@mscdex
Copy link
Contributor

mscdex commented Feb 19, 2016

As I previously mentioned, I wouldn't mind having a INSPECT_MAX_BYTES-type of thing so that arrays(/array-like objects) only show the first n elements (for consistency with Buffer inspection). Maybe have like util.INSPECT_MAX_ELEMENTS or something. I don't really care so much about the actual name, as long as the similar functionality is there.

@Trott
Copy link
Member

Trott commented Feb 19, 2016

@dcposch wrote:

@Trott @rvagg is there anything you'd like me to change?

Similar to @rvagg, I'll opt to tread carefully with this and am happy to defer to other collaborators with stronger and more-informed opinions.

@tunniclm
Copy link
Contributor

EDIT: Oops, posted the comment to the wrong place. Moved to: #4905 (comment)

@rvagg
Copy link
Member

rvagg commented Mar 22, 2016

Building on @tunniclm's suggestion that he's since moved, perhaps the easiest path forward on this one is to add an option to util.inspect() that switches on this behaviour, or multiple options as per @tunniclm's proposal that controls it but have the default behaviour be as it is now. Then we can ship this as a semver-minor and consider the question of making it the default behaviour as separate matter. The fact that this changes output and becomes a semver-major makes it difficult to fully embrace it, but if it's just a feature addition without changing default then we can consider the change without being forced to think about ecosystem impact.

@jasnell
Copy link
Member

jasnell commented Apr 21, 2016

Looking at this again, I would find it difficult to justify leaving the current crash behavior as the default. inspect() already has an options object, let's:

  • Make the default such that large arrays over some default length get the ... treatment just like we currently do with Buffer output
  • Add an option like maxArrayLength to inspect() that can override the default if necessary.
  • Mark it semver-major and get it into v6

@jasnell
Copy link
Member

jasnell commented Apr 21, 2016

I've opened an alternative PR that takes a slightly different approach to this. See #6334

@estliberitas estliberitas force-pushed the master branch 2 times, most recently from 7da4fd4 to c7066fb Compare April 26, 2016 05:23
@benjamingr
Copy link
Member

I think consensus is going towards #6334 - is this still being pursued?

@jasnell
Copy link
Member

jasnell commented Apr 30, 2016

I'd recommend closing this in favor of #6334 (but I'm a bit biased lol)

@benjamingr
Copy link
Member

I think we can close this now and it can always be reopened. It seems stalled and there seems to be consensus is with the other proposal anyway. Feel free to reopen.

@benjamingr benjamingr closed this Apr 30, 2016
jasnell added a commit to jasnell/node that referenced this pull request May 3, 2016
As an alternative to nodejs#5070,
set the max length of Arrays/TypedArrays in util.inspect() to
`100` and provide a `maxArrayLength` option to override.
jasnell added a commit that referenced this pull request May 3, 2016
As an alternative to #5070,
set the max length of Arrays/TypedArrays in util.inspect() to
`100` and provide a `maxArrayLength` option to override.

PR-URL: #6334
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Roman Reiss <me@silverwind.io>
Fishrock123 pushed a commit that referenced this pull request May 4, 2016
As an alternative to #5070,
set the max length of Arrays/TypedArrays in util.inspect() to
`100` and provide a `maxArrayLength` option to override.

PR-URL: #6334
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Roman Reiss <me@silverwind.io>
joelostrowski pushed a commit to joelostrowski/node that referenced this pull request May 4, 2016
As an alternative to nodejs#5070,
set the max length of Arrays/TypedArrays in util.inspect() to
`100` and provide a `maxArrayLength` option to override.

PR-URL: nodejs#6334
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Reviewed-By: Roman Reiss <me@silverwind.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-major PRs that contain breaking changes and should be released in the next major version. util Issues and PRs related to the built-in util module.
Projects
None yet
Development

Successfully merging this pull request may close these issues.