AWS lambda - process is killed before the span is being exported #1295

Closed · obecny opened this issue on Jul 9, 2020 · 17 comments · Fixed by #1296
Labels: bug (Something isn't working), Discussion (Issue or PR that needs/is extended discussion)

obecny (Member) commented on Jul 9, 2020

This is a simple scenario of a Lambda function:

exports.handler = async (event, ctx) => {
    const span = tracer.startSpan('test');
    span.end();
    return 'done';
};

The span is not exported correctly, as Lambda seems to kill the process before the export happens.
When some timeout is added before the handler returns, the span is exported:

exports.handler = async (event, ctx) => {
    const span = tracer.startSpan('test');
    span.end();
    await new Promise((resolve) => {
        setTimeout(() => {
            resolve();
        }, 500);
    });

    return 'done';
};

So the idea is to implement some async method on either the tracer or the exporter, or to change the exporter's shutdown method to return a promise that resolves when the last export has been sent successfully.
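
A minimal sketch of the intended usage, assuming a promise-returning forceFlush on the active span processor (a hypothetical API here; defining it is exactly what this issue is about):

exports.handler = async (event, ctx) => {
    const span = tracer.startSpan('test');
    span.end();
    // hypothetical: resolves once the span processor has handed everything to the exporter
    await provider.getActiveSpanProcessor().forceFlush();
    return 'done';
};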

WDYT?

obecny added the bug (Something isn't working) and Discussion (Issue or PR that needs/is extended discussion) labels on Jul 9, 2020
obecny (Member, Author) commented on Jul 9, 2020

@open-telemetry/javascript-maintainers && @open-telemetry/javascript-approvers

dyladan (Member) commented on Jul 9, 2020

I think this can be solved by adding a callback to forceFlush. This will help us in other shutdown scenarios like the one described in #1290.
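
Roughly, a callback-style forceFlush could be bridged into an async handler like this (a sketch only; the callback signature shown is an assumption, not the merged API):

// wrap the callback-style flush so an async handler can await it
function flushTraces(spanProcessor) {
  return new Promise((resolve) => {
    // assumed signature: forceFlush(cb) calls cb once the exporter has sent everything
    spanProcessor.forceFlush(() => resolve());
  });
}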

obecny self-assigned this on Jul 9, 2020
lykkin (Contributor) commented on Jul 9, 2020

FWIW, when I was working on something similar to this it was required to have a synchronous export pathway for the data to escape through. Lambda will freeze execution of the process after the lambda has been marked as complete (i.e. the response callback was called, the promise returned by the handler settles, or the process attempts to exit due to inactivity). In the case where the response callback is called, the async work that was pending will start back up the next time the process is unfrozen, which will result in unpredictable export timings. @michaelgoin or @astorm might also have more context to add to this.

dyladan (Member) commented on Jul 9, 2020

The behavior when the callback is called is dictated by the context.callbackWaitsForEmptyEventLoop property. If it is true, then any work on the event loop (like a flush) will keep the invocation running until it completes (or times out). If it is false, the execution environment is frozen immediately.

When using an async handler, the function is frozen immediately when the returned promise resolves/rejects. It may or may not be reused. If it is, async work continues to process.

In both cases, if the user calls force flush in their handler code and only finishes the handler after the force flush is complete, then this is fixed.
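
For example, a callback-style handler that waits for the flush before signalling completion (sketch; the callback-taking forceFlush is the assumed API from the sketch above):

module.exports.handler = (event, context, cb) => {
  const span = tracer.startSpan('handler');
  span.end();
  // finish the invocation only after the export work has completed
  provider.getActiveSpanProcessor().forceFlush(() => cb(null, 'done'));
};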

note: it may be tempting to call the handler callback early and have the force flush continue in the background, but the lambda runtime actually doesn't return the response to the user until the whole function invocation is complete. This is a very frustrating situation, but it is true. For instance, the following handler for an API gateway request will return the response to the user in ~500ms rather than ~0ms as you would expect:

module.exports.handler = (event, context, cb) => {
  cb(); // return immediately
  setTimeout(() => {}, 500); // keep tasks on the event loop for 500ms
};

$ time curl ...
curl ...  0.00s user 0.01s system 2% cpu 0.621 total

obecny (Member, Author) commented on Jul 9, 2020

@dyladan do you know if there is a way to get access to this callbackWaitsForEmptyEventLoop property, or to somehow patch the whole flow, so that we could have a plugin that does this automatically and waits until all the exporters have shut down correctly?

obecny (Member, Author) commented on Jul 9, 2020

Or maybe we can patch the whole exports.handler.

lykkin (Contributor) commented on Jul 9, 2020

@obecny callbackWaitsForEmptyEventLoop is just a property on the context parameter passed into the handler, so it would suffice to have a method that sets it for the user, if we are worried about users misconfiguring it. Depending on how much (or how little) code we expect the user to write, we could also patch the handler like you said. Something like:

function patchLambdaHandler(handler) {
  return async function wrappedHandler(event, context, cb) {
    context.callbackWaitsForEmptyEventLoop = true; // this could still be overwritten by the user
    try {
      return await handler(event, context, cb);
    } finally {
      await otel.forceFlush(); // flush before the invocation is allowed to finish
    }
  };
}

This does potentially modify the execution semantics of the Lambda out from under the user, but it is roughly what we would want to do.
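
Usage would then look something like this (names are illustrative):

// wrap the real handler once, at module load time
exports.handler = patchLambdaHandler(async (event, context) => {
  const span = tracer.startSpan('business-logic');
  span.end();
  return 'done';
});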

dyladan (Member) commented on Jul 9, 2020

In an autoinstrumentation I would suggest you either:

  1. patch the handler, and prevent the wrapped handler from finishing until the export is complete
  2. patch the underlying runtime itself. This is more risky, but it turns out the API of the RAPIDClient (their internal name for the runtime environment) is much simpler and more straightforward. This is going to be what I do for AWS Lambda (opentelemetry-js-contrib#132).

For this issue, I think it is sufficient to give the user a way to manually flush and not worry about wrapping handler or patching anything.

dyladan (Member) commented on Jul 9, 2020

> 2. patch the underlying runtime itself. This is more risky, but it turns out the API of the RAPIDClient (their internal name for the runtime environment) is much simpler and more straightforward. This is going to be what I do for AWS Lambda (opentelemetry-js-contrib#132).

Actually I have already been working on this and will have a PR shortly.

dyladan (Member) commented on Jul 9, 2020

@obecny if you have not started working on this, I actually already have an implementation of forceFlush callbacks that I can open a PR with.

dyladan (Member) commented on Jul 9, 2020

I'll just open the PR; let me know if I should close it.

obecny (Member, Author) commented on Jul 9, 2020

@dyladan great, no, I haven't started working on that. I think the final fix/solution will probably need to do several things, so I will wait until you open the PR to see what you have discovered and done :).

dyladan (Member) commented on Jul 9, 2020

Ok, here it is: #1296.

All this PR does is provide a mechanism to wait for the flush to complete. The lambda plugin PR will be coming later. Probably early next week.

obecny assigned dyladan and unassigned himself on Jul 9, 2020
longility (Contributor) commented:

> The lambda plugin PR will be coming later. Probably early next week.

@dyladan Do you have a reference to this lambda plugin, or guidance on the AWS Lambda implementation? I'm noticing that my Lambda is not exporting the spans.

dyladan (Member) commented on Aug 27, 2021

Actually someone else ended up writing the lambda plugin so I never had to :)

You can find it in the contrib repo.
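
For reference, wiring it up looks roughly like this, assuming the @opentelemetry/instrumentation-aws-lambda package from the contrib repo (check its README for the current API and the required span processor/exporter setup):

const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { registerInstrumentations } = require('@opentelemetry/instrumentation');
const { AwsLambdaInstrumentation } = require('@opentelemetry/instrumentation-aws-lambda');

const provider = new NodeTracerProvider();
provider.register();

registerInstrumentations({
  // the instrumentation wraps the Lambda handler and flushes spans before the invocation ends
  instrumentations: [new AwsLambdaInstrumentation()],
});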

longility (Contributor) commented:
Weird, I have that in there but something isn't working still. Let me see if I can dig deeper. Feel free to share any debug techniques or what to look for in the logs.

dyladan (Member) commented on Aug 27, 2021

I would open an issue on the contrib repo and ping @willarmiros or @anuraaga, since they maintain that instrumentation.
