Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panicked at 'already borrowed: BorrowMutError' #2562

Open
graemer957 opened this issue May 21, 2021 · 10 comments
Open

panicked at 'already borrowed: BorrowMutError' #2562

graemer957 opened this issue May 21, 2021 · 10 comments
Labels

Comments

@graemer957
Copy link

Hi 👋

Describe the Bug

I have found that under high load accessing Cloudflare KV and Cache API from Rust causes the following panic:

panicked at 'already borrowed: BorrowMutError',
/Users/graemer/.cargo/registry/src/github.com-1ecc6299db9ec823/wasm-bindgen-futures-0.4.23/src/lib.rs:133:39

I already raised a issue with them on their cloudflare/rustwasm-worker-template, but they suggested you might be in a better place to help.

Steps to Reproduce

A sample project to reproduce this can be found cfw-borrowmuterror. The sample project has two endpoints:

  • /js which fetches the foo key using just JavaScript
  • /rs which does the same, but using Rust
  1. Update project to bind KV namespace EXAMPLE_NS on your account
  2. Deploy this code to Cloudflare Workers
  3. In EXAMPLE_NS ensure the foo key exists, the value can be whatever you like
  4. Use curl to hit either the /js or /rs endpoint and ensure your value defined in (3) is returned.
  5. Use wrangler tail to capture the logs from your deployed worker
  6. To simulate load I have been using wrk, for example:
wrk -c 128 -t 4 -d 60s --latency https://<host>/rs

There are quite a few exceptions which do not generate useful logs, but if you are lucky:

{"exceptions":[{"name":"Error","message":"The script will never generate a response.","timestamp":1621326408887}],"logs":[{"message":["panicked at 'already borrowed: BorrowMutError', /Users/graemer/.cargo/registry/src/github.com-1ecc6299db9ec823/wasm-bindgen-futures-0.4.23/src/lib.rs:133:39\n\nStack:\n\nError\n    at imports.wbg.__wbg_new_59cb74e423758ede (worker.js:288:19)\n    at <anonymous>:wasm-function[28]:0x3a77\n    at <anonymous>:wasm-function[240]:0x946b\n    at <anonymous>:wasm-function[64]:0x686b\n    at <anonymous>:wasm-function[85]:0x75f4\n    at <anonymous>:wasm-function[160]:0x8cd5\n    at <anonymous>:wasm-function[139]:0x88d5\n    at <anonymous>:wasm-function[162]:0x8d38\n    at <anonymous>:wasm-function[91]:0x78c0\n    at <anonymous>:wasm-function[60]:0x6539\n\n"],"level":"error","timestamp":1621326408887}]}

Applying the same load to the /js endpoint always works. Only the /rs endpoint exhibits the issue.

I can see on L133:

let mut state = state.borrow_mut();

...and would love to be able to dig further, but at the current limits of my knowledge on how to proceed. Any help gratefully received.

Expected Behavior

No panic occurs.

Actual Behavior

Under higher loads the following panic is observed:

panicked at 'already borrowed: BorrowMutError',
/Users/graemer/.cargo/registry/src/github.com-1ecc6299db9ec823/wasm-bindgen-futures-0.4.23/src/lib.rs:133:39
@graemer957 graemer957 added the bug label May 21, 2021
@alexcrichton
Copy link
Contributor

How curious! Unfortunately I don't know how to set up a cloudflare environment or similar so the best I can offer is "debugging by reading code". Reading over this code I don't understand how this borrow can trip an error. Some questions I could ask are:

  • If possible, can you get a stack trace with function names? If not, can you build a wasm binary with debugging information and send it plus a stack trace to me?
  • Do you know if cloudflares fork of v8 futzes with when callbacks run or things like that? One way this could trip is if wasm is interrupted and a callback is run while wasm is run on the stack. I think that's not possible on the web, but cloudflare may be doing their own thing here.

Otherwise if you could get a reproduction without cloudflare (e.g. in a browser or on node) I can help debug more.

@graemer957
Copy link
Author

Hey @alexcrichton, apologies for the delay in getting back to you. Let me see if I can get that additional debug information you wanted and in the mean-time ask Cloudflare the question you asked.

@samdenty
Copy link

samdenty commented Jan 7, 2023

I can reliable reproduce this in a cloudflare worker as well @alexcrichton, how should I proceed to debug?

@samdenty
Copy link

samdenty commented Jan 8, 2023

Ok so I've managed to get a local build of wasm-bindgen-futures, compiling inside the cloudflare worker that's erroring.

I've narrowed it down to this function

inner.future.as_mut().poll(&mut cx)

Calling the poll method, in my case makes it recursively call the run function with the same future

pub(crate) fn run(&self) {

and because of this, borrow_mut fails. In my code I'm doing future.shared() so maybe this is causing this?

@alexcrichton how should I proceed on fixing this?

do you want me to give you the steps to reproduce inside a cloudflare worker?
or should I try to see why it's recursively calling it?
or maybe is there an alternative to borrow_mut such as mutexes that would fix this

@samdenty
Copy link

samdenty commented Jan 8, 2023

the error stack doesn't reveal much:

Error
    at __wbg_new_abda76e883ba8a5f (shim.js:415:19)
    at console_error_panic_hook::hook::h36ed105c675de4f7 (wasm://wasm/00f1df7a:wasm-function[1750]:0x181a6e)
    at core::ops::function::Fn::call::h9f9f084f27f78c00 (wasm://wasm/00f1df7a:wasm-function[6876]:0x2046af)
    at std::panicking::rust_panic_with_hook::had8acafc277b06ff (wasm://wasm/00f1df7a:wasm-function[2361]:0x1a7e35)
    at std::panicking::begin_panic_handler::{{closure}}::h1f07988bb2cf1601 (wasm://wasm/00f1df7a:wasm-function[2952]:0x1c3341)
    at std::sys_common::backtrace::__rust_end_short_backtrace::h9b10ea262b3bcf96 (wasm://wasm/00f1df7a:wasm-function[5362]:0x1fb990)
    at rust_begin_unwind (wasm://wasm/00f1df7a:wasm-function[4162]:0x1e79aa)
    at core::panicking::panic_fmt::h0498752af691fa0d (wasm://wasm/00f1df7a:wasm-function[4988]:0x1f6b8e)
    at core::result::unwrap_failed::hff7d8a43159e0958 (wasm://wasm/00f1df7a:wasm-function[3151]:0x1cacfc)
    at wasm_bindgen_futures::task::singlethread::Task::run::hdda06db3adb32b62 (wasm://wasm/00f1df7a:wasm-function[2200]:0x19f0be)

@Liamolucko
Copy link
Collaborator

I've narrowed it down to this function

inner.future.as_mut().poll(&mut cx)

Calling the poll method, in my case makes it recursively call the run function with the same future

pub(crate) fn run(&self) {

and because of this, borrow_mut fails.

That certainly seems like something odd is happening with Cloudflare's JS runtime, because as far as I can tell that should be impossible with a normal JS runtime.

run is only ever called from QueueState::run_all, which is only ever called from this closure:

Closure::new(move |_| state.run_all())

That closure is then only ever used as a callback for Promise::then:

let _ = self.promise.then(&self.closure);

But, callbacks from Promise::then are always meant to be run in a new microtask, which should make it impossible for it to get called recursively. So, it looks like Cloudflare is somehow breaking that rule and calling the callback synchronously, leading to the recursive call.

do you want me to give you the steps to reproduce inside a cloudflare worker?

That should be helpful, particularly if it can be done locally using workerd.

@samdenty
Copy link

samdenty commented Mar 5, 2023

the reproduction I can come up with that shows the error:

git clone https://github.com/samdenty/github-icons
cd api
wrangler dev

Once it prompts, turn on local mode by pressing L. Then in postman fetch the URL http://127.0.0.1:8787/facebook/yoga (if you navigate to it in chrome, it won't work as it'll try to redirect to homepage)

@samdenty
Copy link

samdenty commented Mar 5, 2023

Interestingly enough sometimes I get this error:

TypeError: Found invalid object in transferList
panicked at 'TypeError: Found invalid object in transferList', api/src/lib.rs:49:1

Stack:

Error:
    at he (/Users/samdenty/Projects/github-icons/api/build/worker/shim.mjs:17:1263)
    at console_error_panic_hook::hook::hfc8ebd4821eabe90 (wasm://wasm/00f1dd46:1:1579814)
    at core::ops::function::Fn::call::h3b927a197a89b48c (wasm://wasm/00f1dd46:1:2115137)
    at std::panicking::rust_panic_with_hook::had8acafc277b06ff (wasm://wasm/00f1dd46:1:1735855)
    at std::panicking::begin_panic_handler::{{closure}}::h1f07988bb2cf1601 (wasm://wasm/00f1dd46:1:1847921)
    at std::sys_common::backtrace::__rust_end_short_backtrace::h9b10ea262b3bcf96 (wasm://wasm/00f1dd46:1:2079010)
    at rust_begin_unwind (wasm://wasm/00f1dd46:1:1997116)
    at core::panicking::panic_fmt::h0498752af691fa0d (wasm://wasm/00f1dd46:1:2059040)
    at core::panicking::panic_display::h2b50038d5ebc0e22 (wasm://wasm/00f1dd46:1:2009235)
    at wasm_bindgen_futures::future_to_promise::{{closure}}::{{closure}}::h973c0b5f38fe9ea0 (wasm://wasm/00f1dd46:1:107462)

@littledivy
Copy link
Contributor

This is also happening with the Deno runtime (V8).

I'm also using future.shared() which might be causing it?

panicked at /home/divy/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wasm-bindgen-futures-0.4.37/src/task/singlethread.rs:85:37:
already borrowed: BorrowMutError

Stack:

Error
    at __wbg_new_abda76e883ba8a5f (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm.generated.js:407:19)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1733565)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1896441)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1991438)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1867142)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1991362)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1911615)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1582451)
    at <anonymous> (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm_bg.wasm:1:1929769)
    at __wbg_adapter_48 (file:///home/divy/gh/littledivy/deno_graph/js/deno_graph_wasm.generated.js:240:8)

@Liamolucko
Copy link
Collaborator

This is also happening with the Deno runtime (V8).

I can reproduce it, but there's also another panic unrelated to wasm-bindgen that comes before the BorrowMutError:

panicked at src/rt.rs:59:9:
task panic

Stack:

Error
    at __wbg_new_abda76e883ba8a5f (file:///Users/liam/src/deno_graph/js/deno_graph_wasm.generated.js:407:19)
    at console_error_panic_hook::hook::hc7726a2ffeb2e6df (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1622697)
    at std::panicking::rust_panic_with_hook::h62090d3fc0630473 (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1888334)
    at std::panicking::begin_panic_handler::{{closure}}::h823c80ef219a166d (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1925302)
    at std::sys_common::backtrace::__rust_end_short_backtrace::hcc7222b556723586 (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2169837)
    at rust_begin_unwind (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2089417)
    at core::panicking::panic_fmt::hd79411a297d06dc8 (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2107439)
    at <futures_util::future::future::shared::Shared<Fut> as core::future::future::Future>::poll::he17c05c9d6f0b86a (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1376784)
    at deno_graph::graph::Builder::build::{{closure}}::h074e108a436f383e (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:491401)
    at wasm_bindgen_futures::future_to_promise::{{closure}}::{{closure}}::h6b315cce3e501611 (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:621880)


panicked at /Users/liam/src/wasm-bindgen/crates/futures/src/task/singlethread.rs:86:37:
already borrowed: BorrowMutError

Stack:

Error
    at __wbg_new_abda76e883ba8a5f (file:///Users/liam/src/deno_graph/js/deno_graph_wasm.generated.js:407:19)
    at console_error_panic_hook::hook::hc7726a2ffeb2e6df (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1622697)
    at std::panicking::rust_panic_with_hook::h62090d3fc0630473 (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1888334)
    at std::panicking::begin_panic_handler::{{closure}}::h823c80ef219a166d (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1925353)
    at std::sys_common::backtrace::__rust_end_short_backtrace::hcc7222b556723586 (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2169837)
    at rust_begin_unwind (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2089417)
    at core::panicking::panic_fmt::hd79411a297d06dc8 (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2107439)
    at core::cell::panic_already_borrowed::h8c62ffbd80fe44ce (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2068007)
    at core::cell::RefCell<T>::borrow_mut::h040c458a50483dfc (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:2131248)
    at wasm_bindgen_futures::queue::Queue::new::{{closure}}::h31418d075626539f (file:///Users/liam/src/deno_graph/js/deno_graph_wasm_bg.wasm:1:1740749)

So I think the issue in that case is that the task is panicking while the RefCell is still borrowed; and since Rust currently doesn't support unwinding on Wasm, the guard doesn't get dropped and so when the future gets polled again the RefCell is still borrowed.

So I don't think that's really wasm-bindgen's issue, although a better error message might be warranted. Fixing #3687 would change it to a 'module used after aborting' error which would be a bit better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants