
[MR-2875] Bulk Downloads 502 Error #1373

Merged
merged 9 commits into from
Dec 13, 2022

Conversation

@macrael (Contributor) commented Dec 8, 2022

Summary

This PR resolves our issue with bulk downloads not working.

The bug was ultimately due to the way we were invoking the zip archive, triggered by AWS SDK v3 returning a stream instead of a buffer (I'm pretty sure it was a buffer before, but it was definitely different). I'm still a little confused by this, because we appear to be handling the archive exactly as documented in their examples, but as written it was crashing the bulk download lambda every time.

The zip archive was configured inside of a promise, writing to a stream called streamPassThrough. The promise was set to resolve when that stream emitted the close or end event. After the promise resolved, we passed that stream to s3.PutObject to be uploaded. With the internal AWS v3 change to pass back a stream instead of a buffer, it appears that the close and end events are no longer emitted during the zip process, so I think the lambda was killed for being promise-locked. I'm still not 100% clear on this, because however the lambda was dying, it was not getting caught by any of our catch clauses; it just died mid-stream as far as I could tell. My main fix here was to take the zip out of the promise and just await the s3 upload at the end to sync all the stream processing.
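The shape of the fix can be sketched with only Node built-ins (the real code uses the archiver library and an S3 upload; the producer/consumer names here are illustrative, not our actual code): instead of resolving a promise on the PassThrough's close/end events, kick off the producer and await the consumer.

```typescript
import { PassThrough, Readable } from "stream";

// Stand-in for the zip step: pipe source data into a PassThrough.
// Note there is no promise resolved on 'close'/'end' of the stream --
// those events may never fire until something downstream consumes it.
function produceArchive(sources: Readable[]): PassThrough {
    const streamPassThrough = new PassThrough();
    (async () => {
        for (const src of sources) {
            for await (const chunk of src) {
                streamPassThrough.write(chunk);
            }
        }
        streamPassThrough.end();
    })();
    return streamPassThrough;
}

// Stand-in for the upload step: awaiting the *consumer* is what syncs
// all the stream processing, as the PR does with the S3 upload.
async function consumeToString(stream: PassThrough): Promise<string> {
    const chunks: Buffer[] = [];
    for await (const chunk of stream) {
        chunks.push(Buffer.from(chunk));
    }
    return Buffer.concat(chunks).toString("utf8");
}

async function main() {
    const archive = produceArchive([
        Readable.from(["hello, "]),
        Readable.from(["world"]),
    ]);
    console.log(await consumeToString(archive)); // "hello, world"
}
main();
```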

The s3.PutObject command was not built to work with streams; when passed an incomplete stream, it errored. So I imported the s3 Upload command, a higher-level command built to work with streams. It can also manage tags, which consolidated some of our s3 activity.
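For reference, the higher-level `Upload` class ships in `@aws-sdk/lib-storage`. Since that SDK isn't available here, its use is shown as a comment following its documented interface, and the only live code is a small hypothetical helper (`toTagSet` and the tag values are illustrative, not the PR's actual code) that builds the tag shape it accepts:

```typescript
// Hypothetical helper: build the { Key, Value } tag array that the
// SDK's tagging APIs accept from a plain record.
type S3Tag = { Key: string; Value: string };

function toTagSet(tags: Record<string, string>): S3Tag[] {
    return Object.entries(tags).map(([Key, Value]) => ({ Key, Value }));
}

// Sketch of the streaming upload, per @aws-sdk/lib-storage's documented
// interface (commented out because the SDK isn't installed here):
//
//   import { Upload } from "@aws-sdk/lib-storage";
//   const upload = new Upload({
//       client: s3Client,                      // an S3Client instance
//       params: { Bucket, Key, Body: streamPassThrough },
//       tags: toTagSet({ example: "TRUE" }),   // illustrative tag
//   });
//   await upload.done();  // resolves once the stream is fully uploaded

console.log(toTagSet({ env: "prod" }));
```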

Finally, the first thing I did when I was looking at this was to try to straighten out the type of what we were getting back from our initial s3.GetObject calls. This was frustrating and ultimately fruitless: the type was in fact a Readable, which is what we were coercing the return type to anyway. I assumed that since the docs said s3 was now returning a stream, it would in fact be returning the ReadableStream type to us, and trying to get that type to fit into a Readable, which is what the zip archiver wanted, was a real pain. Here's what I learned:

  • ReadableStream is a browser standard that a bunch of tooling uses
  • Readable is a node type
  • (There used to be a node ReadableStream type which some of the Readable functions can still interact with, unrelated to the browser type)
  • node implements a set of the browser types itself; Readable.fromWeb() takes the new node ReadableStream type.
  • That node ReadableStream is actually slightly incompatible with the ReadableStream type implemented by browsers: it has some extra functions that are part of the standard but are not implemented by any browsers. This prevents you from passing the ReadableStream that AWS returns into the Readable.fromWeb() function that would get it into node land. 😡
  • Since AWS is returning a browser standard type, it expects the “dom” lib to be loaded in typescript. Without that it just returns a stub of ReadableStream that has no members.
  • Since it was such a pain to figure this all out, and I did get the right types connected in the end, I left all those changes in, even though in my testing that code path is never called b/c AWS always returns a Readable instance.
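The bridge between the two stream worlds can be sketched with Node built-ins alone (the function name is illustrative, not the PR's actual code); the instanceof guard mirrors the finding that in a Node lambda the SDK always hands back a node Readable in practice:

```typescript
import { Readable } from "stream";
import { ReadableStream } from "stream/web";

function toNodeReadable(body: Readable | ReadableStream): Readable {
    if (body instanceof Readable) {
        return body; // the path AWS actually takes in Node
    }
    // Readable.fromWeb() wants *Node's* ReadableStream type; a value typed
    // with the browser/"dom" ReadableStream needs a cast to satisfy it.
    return Readable.fromWeb(body);
}

async function readAll(stream: Readable): Promise<string> {
    const chunks: Buffer[] = [];
    for await (const chunk of stream) {
        chunks.push(Buffer.from(chunk));
    }
    return Buffer.concat(chunks).toString("utf8");
}

// Usage: wrap a web-style stream and read it with node APIs.
const webStream = new ReadableStream<Uint8Array>({
    start(controller) {
        controller.enqueue(new TextEncoder().encode("zipped bytes"));
        controller.close();
    },
});
readAll(toNodeReadable(webStream)).then((s) => console.log(s)); // "zipped bytes"
```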

Related issues

https://qmacbis.atlassian.net/browse/MR-2875

@macrael macrael changed the title Bulk Downloads 502 Error [MR-2875] Bulk Downloads 502 Error Dec 8, 2022
}
})
)
} catch (e) {
@haworku (Contributor) commented Dec 8, 2022

😞 I wonder if this is a change we should make everywhere (catch errors, log for CloudWatch, and return a more specific error). Idk why I didn't notice we weren't try/catching our async actions in handlers; that should probably be our standard.

Also, eslint should warn on this too? Gonna check that out.
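The pattern suggested here can be sketched like so (handler and helper names are illustrative, not our actual code): wrap the async work, log for CloudWatch, and return a specific error response instead of letting the lambda die silently.

```typescript
type APIGatewayResult = { statusCode: number; body: string };

// Illustrative handler: takes the async work as a parameter so the
// try/catch shape is easy to see in isolation.
async function bulkDownloadHandler(
    doWork: () => Promise<string>
): Promise<APIGatewayResult> {
    try {
        const url = await doWork();
        return { statusCode: 200, body: JSON.stringify({ url }) };
    } catch (e) {
        // Surfaces in CloudWatch logs instead of an opaque 502.
        console.error("bulk download failed:", e);
        return {
            statusCode: 500,
            body: JSON.stringify({ error: "Bulk download failed" }),
        };
    }
}
```

On the eslint point: rules like `@typescript-eslint/no-floating-promises` can flag un-awaited promises, though a rejection inside an awaited call still needs a try/catch like the one above to be logged.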

@@ -2,7 +2,7 @@
"extends": "../../tsconfig.json",
"compilerOptions": {
"module": "commonjs",
- "lib": ["es2021"],
+ "lib": ["es2021", "DOM"],
Contributor

This feels... unexpected? Is this still needed, or is it left over from previous debugging?

Contributor Author (@macrael)

The s3 APIs return a ReadableStream, which is a browser type. Without this, the types from the AWS API were coming back without the right information. In the end we're not actually using that type, but I think we should keep this so the types stay valid.

@macrael macrael marked this pull request as ready for review December 12, 2022 17:05
@haworku (Contributor) left a comment

Looks good, I see things working.

Not directly related, but just sharing a finding from manual QA: the multi-rate behavior for the "download all rate documents" button just groups everything in the same folder. That surprised me; I was expecting folders or some kind of renaming of the files to group things a bit more. Not sure if we intentionally designed things that way or just never actively re-designed this feature with multi-rates.

@macrael (Contributor Author) commented Dec 12, 2022

I think we never re-designed it; that's a good question to bring up with the whole team.

@mojotalantikite (Contributor)

Cool, thanks for writing out that summary! Code looks good and things are working for me as expected in the review app.

@macrael macrael merged commit 28fd843 into main Dec 13, 2022
@macrael macrael deleted the wml-download-all-502 branch December 13, 2022 03:47