Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repl: printing big strings results in OOM #25478

Closed
joyeecheung opened this issue Jan 13, 2019 · 10 comments
Closed

repl: printing big strings results in OOM #25478

joyeecheung opened this issue Jan 13, 2019 · 10 comments
Labels
help wanted Issues that need assistance from volunteers or PRs that need help to proceed. memory Issues and PRs related to the memory management or memory footprint. repl Issues and PRs related to the REPL subsystem. stream Issues and PRs related to the stream subsystem.

Comments

@joyeecheung
Copy link
Member

joyeecheung commented Jan 13, 2019

From https://bugs.chromium.org/p/v8/issues/detail?id=8679

How to reproduce:

$ node
> 'a'.repeat(1e9)
  • Version: master
  • Platform: all
  • Subsystem: repl

Reposting my comment in the v8 issue:

The snippet does not crash if run as a normal Node.js script, if the length is even longer (e.g. 9*10000*148*64*64) it throws RangeError: Invalid string length.

In Node it only crashes in the REPL when being written to the console after being formatted with color codes and surrounding ' - that invokes v8::String::Utf8Length to precompute the size to allocate for another copy of the string, which caused a OOM when flattening the string.

It does not look like a problem with String.p.repeat. AFAICT its's more like a problem with how Node creates too many copies of the string in its console implementation (it doesn't even crash if you write 'A'.repeat(9*10000*148*64) to stdout and redirect to a file).

Stack trace:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x00007fff60b82b66 libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff60d4d080 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x00007fff60ade1ae libsystem_c.dylib`abort + 127
    frame #3: 0x000000010015025f node_g`node::Abort() at node_errors.cc:161
    frame #4: 0x0000000100150c8f node_g`node::OnFatalError(location="CALL_AND_RETRY_LAST", message="Allocation failed - JavaScript heap out of memory") at node_errors.cc:316
    frame #5: 0x0000000100509fa2 node_g`v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) at api.cc:462 [opt]
    frame #6: 0x0000000100509f99 node_g`v8::internal::V8::FatalProcessOutOfMemory(isolate=0x0000000105000000, location=<unavailable>, is_heap_oom=true) at api.cc:428 [opt]
    frame #7: 0x0000000100b1ff72 node_g`v8::internal::Heap::FatalProcessOutOfMemory(this=<unavailable>, location=<unavailable>) at heap.cc:4864 [opt]
    frame #8: 0x0000000100b2bb10 node_g`v8::internal::Heap::AllocateRawWithRetryOrFail(this=<unavailable>, size=<unavailable>, space=<unavailable>, alignment=<unavailable>) at heap.cc:4292 [opt]
    frame #9: 0x0000000100ae7749 node_g`v8::internal::Factory::NewRawOneByteString(int, v8::internal::PretenureFlag) [inlined] v8::internal::Factory::AllocateRawWithImmortalMap(this=<unavailable>, size=<unavailable>, pretenure=<unavailable>, map=<unavailable>, alignment=kWordAligned) at factory.cc:120 [opt]
    frame #10: 0x0000000100ae7732 node_g`v8::internal::Factory::NewRawOneByteString(this=0x0000000105000000, length=1000000013, pretenure=NOT_TENURED) at factory.cc:954 [opt]
    frame #11: 0x0000000100d062c4 node_g`v8::internal::String::SlowFlatten(isolate=0x0000000105000000, cons=<unavailable>, pretenure=NOT_TENURED) at objects.cc:2602 [opt]
    frame #12: 0x0000000100543626 node_g`v8::String::Utf8Length(this=<unavailable>, isolate=<unavailable>) const at api.cc:5313 [opt]
    frame #13: 0x0000000100312827 node_g`node::StringBytes::Size(isolate=0x0000000105000000, val=(val_ = 0x00007ffeefbf4240), encoding=UTF8) at string_bytes.cc:474
    frame #14: 0x0000000100308db0 node_g`int node::StreamBase::WriteString<(node::encoding)1>(this=0x0000000103d0e9e0, args=0x00007ffeefbf3fa0) at stream_base.cc:200
    frame #15: 0x00000001003105d2 node_g`void node::StreamBase::JSMethod<node::LibuvStreamWrap, &(int node::StreamBase::WriteString<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&))>(args=0x00007ffeefbf3fa0) at stream_base-inl.h:409
    frame #16: 0x000000010061059a node_g`v8::internal::FunctionCallbackArguments::Call(this=0x00007ffeefbf4008, handler=<unavailable>) at api-arguments-inl.h:140 [opt]
    frame #17: 0x000000010060e9cf node_g`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(isolate=<unavailable>, function=<unavailable>, new_target=<unavailable>, fun_data=<unavailable>, receiver=<unavailable>, args=BuiltinArguments @ 0x00007ffeefbf40c0) at builtins-api.cc:109 [opt]
    frame #18: 0x000000010060cdc6 node_g`v8::internal::Builtin_Impl_HandleApiCall(args=BuiltinArguments @ 0x00007ffeefbf40f8, isolate=0x0000000105000000) at builtins-api.cc:139 [opt]
    frame #19: 0x000000010060c91c node_g`v8::internal::Builtin_HandleApiCall(args_length=7, args_object=0x00007ffeefbf4250, isolate=0x0000000105000000) at builtins-api.cc:127 [opt]
    frame #20: 0x0000000101ad1d55 node_g`v8_Default_embedded_blob_ + 2915605
    frame #21: 0x0000000101ad1d55 node_g`v8_Default_embedded_blob_ + 2915605
    frame #22: 0x000039cd7458cb8e
    frame #23: 0x000039cd7458cb8e
    frame #24: 0x000039cd7458cb8e
    frame #25: 0x000039cd7458cb8e
    frame #26: 0x000039cd7458cb8e
    frame #27: 0x000039cd7458cb8e
    frame #28: 0x000039cd7458cb8e
    frame #29: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #30: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #31: 0x000039cd7458cb8e
    frame #32: 0x000039cd7458cb8e
    frame #33: 0x000039cd7458cb8e
    frame #34: 0x000039cd7458cb8e
    frame #35: 0x000039cd7458cb8e
    frame #36: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #37: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #38: 0x000039cd7458cb8e
    frame #39: 0x000039cd7458cb8e
    frame #40: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #41: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #42: 0x000039cd7458cb8e
    frame #43: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #44: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #45: 0x000039cd7458cb8e
    frame #46: 0x000039cd7458cb8e
    frame #47: 0x000039cd7458cb8e
    frame #48: 0x000039cd7458cb8e
    frame #49: 0x000039cd7458cb8e
    frame #50: 0x000039cd7458cb8e
    frame #51: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #52: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #53: 0x000039cd7458cb8e
    frame #54: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #55: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #56: 0x000039cd7458cb8e
    frame #57: 0x00000001018e472a node_g`v8_Default_embedded_blob_ + 894698
    frame #58: 0x00000001018e472a node_g`v8_Default_embedded_blob_ + 894698
    frame #59: 0x000039cd7458cb8e
    frame #60: 0x000039cd7458cb8e
    frame #61: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #62: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #63: 0x000039cd7458cb8e
    frame #64: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #65: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #66: 0x000039cd7458cb8e
    frame #67: 0x000039cd7458cb8e
    frame #68: 0x000039cd7458cb8e
    frame #69: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #70: 0x000000010180d626 node_g`v8_Default_embedded_blob_ + 13798
    frame #71: 0x000039cd7458cb8e
    frame #72: 0x0000000101816f63 node_g`v8_Default_embedded_blob_ + 53027
    frame #73: 0x0000000101816f63 node_g`v8_Default_embedded_blob_ + 53027
    frame #74: 0x000039cd74585f7e
    frame #75: 0x0000000100aa1ed5 node_g`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::Object>, v8::internal::Execution::MessageHandling, v8::internal::Execution::Target) [inlined] v8::internal::GeneratedCode<v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, int, v8::internal::Object***>::Call(args=<unavailable>, args=<unavailable>, args=<unavailable>, args=1, args=0x00007ffeefbf5ee0) at simulator.h:113 [opt]
    frame #76: 0x0000000100aa1ec5 node_g`v8::internal::(anonymous namespace)::Invoke(isolate=0x0000000000000007, is_construct=true, target=<unavailable>, receiver=<unavailable>, argc=1, args=0x00007ffeefbf5ee0, new_target=<unavailable>, message_handling=kReport, execution_target=kCallable) at execution.cc:155 [opt]
    frame #77: 0x0000000100aa1424 node_g`v8::internal::(anonymous namespace)::CallInternal(isolate=0x0000000105000000, callable=Handle<v8::internal::Object> @ r13, receiver=<unavailable>, argc=32766, argv=0x0000000105000000, message_handling=<unavailable>, target=kCallable) at execution.cc:191 [opt]
    frame #78: 0x0000000100aa1227 node_g`v8::internal::Execution::Call(isolate=<unavailable>, callable=<unavailable>, receiver=<unavailable>, argc=<unavailable>, argv=<unavailable>) at execution.cc:202 [opt]
    frame #79: 0x0000000100540c9a node_g`v8::Function::Call(this=0x00000001060672e0, context=<unavailable>, recv=<unavailable>, argc=<unavailable>, argv=<unavailable>) at api.cc:5019 [opt]
    frame #80: 0x00000001000266ca node_g`node::InternalMakeCallback(env=0x00007ffeefbfe6f8, recv=(val_ = 0x0000000106072300), callback=(val_ = 0x00000001060672e0), argc=1, argv=0x00007ffeefbf5ee0, asyncContext=(async_id = 4, trigger_async_id = 1)) at callback_scope.cc:150
    frame #81: 0x0000000100019bf9 node_g`node::AsyncWrap::MakeCallback(this=0x0000000103d0f090, cb=(val_ = 0x00000001060672e0), argc=1, argv=0x00007ffeefbf5ee0) at async_wrap.cc:680
    frame #82: 0x000000010003157c node_g`node::AsyncWrap::MakeCallback(this=0x0000000103d0f090, symbol=(val_ = 0x00000001060697c8), argc=1, argv=0x00007ffeefbf5ee0) at async_wrap-inl.h:92
    frame #83: 0x0000000100031313 node_g`node::AsyncWrap::MakeCallback(this=0x0000000103d0f090, symbol=(val_ = 0x00000001060697c8), argc=1, argv=0x00007ffeefbf5ee0) at async_wrap-inl.h:68
    frame #84: 0x0000000100307b48 node_g`node::StreamBase::CallJSOnreadMethod(this=0x0000000103d0f090, nread=1, ab=(val_ = 0x00000001060672d8), offset=0) at stream_base.cc:314
    frame #85: 0x0000000100307e60 node_g`node::EmitToJSStreamListener::OnStreamRead(this=0x0000000103d0f108, nread=1, buf=0x00007ffeefbf61c8) at stream_base.cc:376
    frame #86: 0x00000001000b8b38 node_g`node::StreamResource::EmitRead(this=0x0000000103d0f090, nread=1, buf=0x00007ffeefbf61c8) at stream_base-inl.h:128
    frame #87: 0x000000010030e90c node_g`node::LibuvStreamWrap::OnUvRead(this=0x0000000103d0f090, nread=1, buf=0x00007ffeefbf61c8) at stream_wrap.cc:259
    frame #88: 0x000000010030f83f node_g`node::LibuvStreamWrap::ReadStart(this=0x000000010030f7e0, stream=0x0000000103d0f128, nread=1, buf=0x00007ffeefbf61c8)::$_2::operator()(uv_stream_s*, long, uv_buf_t const*) const at stream_wrap.cc:181
    frame #89: 0x000000010030f808 node_g`node::LibuvStreamWrap::ReadStart(stream=0x0000000103d0f128, nread=1, buf=0x00007ffeefbf61c8)::$_2::__invoke(uv_stream_s*, long, uv_buf_t const*) at stream_wrap.cc:180
    frame #90: 0x00000001011ed0fa node_g`uv__read(stream=0x0000000103d0f128) at stream.c:1257
    frame #91: 0x00000001011e8f4c node_g`uv__stream_io(loop=0x00000001027f06a0, w=0x0000000103d0f1b0, events=1) at stream.c:1324
    frame #92: 0x00000001011f734c node_g`uv__io_poll(loop=0x00000001027f06a0, timeout=-1) at kqueue.c:343
    frame #93: 0x00000001011d9d6f node_g`uv_run(loop=0x00000001027f06a0, mode=UV_RUN_DEFAULT) at core.c:370
    frame #94: 0x00000001000fcc92 node_g`node::Start(isolate=0x0000000105000000, isolate_data=0x0000000106068c00, args=size=1, exec_args=size=0) at node.cc:1573
    frame #95: 0x00000001000f3640 node_g`node::Start(event_loop=0x00000001027f06a0, args=size=1, exec_args=size=0) at node.cc:1680
    frame #96: 0x00000001000f2d5f node_g`node::Start(argc=1, argv=0x0000000103f11b50) at node.cc:1739
    frame #97: 0x000000010145412e node_g`main(argc=1, argv=0x00007ffeefbff6f0) at node_main.cc:126
    frame #98: 0x0000000100001034 node_g`start + 52

Top frames of _v8_internal_Print_StackTrace()

[22]: _ttyWrite [0x2a040e98bd71] [readline.js:934]
[23]: _ttyWrite [0x2a040e9d3629] [repl.js:714]
[24]: onkeypress [0x2a04cec2e111] [readline.js:184]
@joyeecheung joyeecheung added repl Issues and PRs related to the REPL subsystem. memory Issues and PRs related to the memory management or memory footprint. stream Issues and PRs related to the stream subsystem. labels Jan 13, 2019
@addaleax
Copy link
Member

Is (part) of the issue that we write out a flattened string to the stream?

@joyeecheung
Copy link
Member Author

joyeecheung commented Jan 13, 2019

@addaleax Sort of, we use v8::String::Utf8Length() to precompute the storage size, which flattens the string, and v8::internal::String::SlowFlatten first allocate a chunk of memory with the size of the string according to the representation, then write the flattened content into it before replacing part of the resulting ConsString with the flattened string. So with a 1GB string we at least double the v8 heap memory usage (albeit temporarily), which causes the OOM for the default old space size.

I am not sure what's the best way to work around this, or if it even should be fixed. It doesn't seem unreasonable for V8 to create a temporary copy to flatten the string and calculate the length of the string encoded in UTF8. Even if we somehow manage to avoid the flattening of the string that triggers a temporary copy in StreamBase::WriteString, the readline/repl implementation may still do some operation on the string itself (e.g. indexing) that result in flattening.

Another fun observation - if I configure node with --max_old_space_size=4096 before doing a 'a'.repeat(1e9) in the REPL, I get this:

events.js:173
      throw er; // Unhandled 'error' event
      ^

Error [ERR_STREAM_DESTROYED]: Cannot call write after a stream was destroyed
    at doWrite (_stream_writable.js:411:19)
    at writeOrBuffer (_stream_writable.js:399:5)
    at WriteStream.Writable.write (_stream_writable.js:299:11)
    at Domain.debugDomainError (repl.js:449:22)
    at Domain.emit (events.js:188:13)
    at Domain.EventEmitter.emit (domain.js:432:20)
    at Domain._errorHandler (domain.js:216:23)
    at Object.setUncaughtExceptionCaptureCallback (domain.js:132:29)
    at process._fatalException (internal/process/execution.js:102:29)

@BridgeAR
Copy link
Member

BridgeAR commented Mar 4, 2019

To me this looks like a known limitation and I would not know how to overcome this problem.

@BridgeAR
Copy link
Member

@joyeecheung @addaleax is this something we should really look into at the moment? I would otherwise suggest to close this. If someone wants to improve this somehow, that's awesome but I think things work relatively well for now that it's not really required?

@joyeecheung
Copy link
Member Author

@BridgeAR I wouldn’t think this counts as working well (this should throw instead of crash), but yeah, there is no plan fixing this - not because this is a wontfix but more like we don’t know how to fix this. So I guess we can add helped wanted and see if there’s help.

@joyeecheung joyeecheung added the help wanted Issues that need assistance from volunteers or PRs that need help to proceed. label Dec 13, 2019
@addaleax
Copy link
Member

@joyeecheung @BridgeAR Would it make sense to try and add a V8 API to read a cons string into multiple Buffers? I’m not sure if it helps here, but it would make things like this easier?

@addaleax
Copy link
Member

Thinking about this, I think it’s actually pretty unreasonable for util.inspect() to generate output this large. We truncate Buffers at a relatively low length, so it should be okay to truncate strings, too? Even if that happens at a few thousand characters, it would help with issues like this.

@joyeecheung
Copy link
Member Author

joyeecheung commented Feb 20, 2020

@addaleax Yeah, I think once the length of the string exceeds a certain value it's not really useful to display the whole string any more. The issue is before deciding whether to truncate the string, we first need to know at least the length of it, which requires flattening (and thus copying) of cons strings. Maybe one way to go about it is to implement a V8 API that stops the flattening once a specified maximum length is reached, though it still only gives us a little bit of room for strings that are long but not that long for the machine running to code, and we still have to crash for strings longer than that. EDIT: I don't know why I came to that conclusion, I think this should work.

@bnoordhuis
Copy link
Member

It should be possible to teach v8::String::Write(), v8::String::WriteUtf8(), etc. to not flatten cons strings when their length exceeds a certain threshold, no new API necessary.

Right now they always call v8::internal::String::Flatten() for ease of use but there's no reason they have to.

There might already be (partial?) support for that in the v8::internal::String::WriteToFlat() helper.

rosaxxny added a commit to rosaxxny/node that referenced this issue Apr 3, 2020
addaleax pushed a commit that referenced this issue Apr 8, 2020
Refs: #25478

PR-URL: #32392
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this issue Apr 12, 2020
Refs: #25478

PR-URL: #32392
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
BethGriggs pushed a commit that referenced this issue Apr 14, 2020
Refs: #25478

PR-URL: #32392
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit to targos/node that referenced this issue Apr 25, 2020
Refs: nodejs#25478

PR-URL: nodejs#32392
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
targos pushed a commit that referenced this issue Apr 28, 2020
Refs: #25478

PR-URL: #32392
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Matheus Marchini <mat@mmarchini.me>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: James M Snell <jasnell@gmail.com>
@BridgeAR
Copy link
Member

BridgeAR commented May 7, 2020

This is fixed in newer Node.js versions by limiting the string inspection.

@BridgeAR BridgeAR closed this as completed May 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Issues that need assistance from volunteers or PRs that need help to proceed. memory Issues and PRs related to the memory management or memory footprint. repl Issues and PRs related to the REPL subsystem. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

No branches or pull requests

4 participants