-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor file upload for content fragment #5953
Conversation
"image/jpeg", | ||
"image/png", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this, which gates vision before shipping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems non homogeneous. We should standardize on content type or file extension. Do browsers accept both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this was for test purpose, will be reworked in the final PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments.
I thiiiink this PR needs to be split if we want to make forward progress as it includes things that needs to be discussed separately:
- A new file upload endpoint whose details needs to be discussed in details: do we support streaming? do we support transformation? what API do we want for this?
- Move content fragment to the new file upload. This one is quite straightforward and will be much easier to review once the file upload endpoint is cut and solid
- Introduce image attachments which will require further separate discussions
Happy to have all of them here, but I'm not sure it's the most efficent :)
"image/jpeg", | ||
"image/png", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems non homogeneous. We should standardize on content type or file extension. Do browsers accept both?
@@ -122,26 +122,6 @@ export async function submitMessage({ | |||
); | |||
}) | |||
); | |||
|
|||
for (const [i, mcfRes] of contentFragmentsRes.entries()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why stop looking for errors?
@@ -110,7 +110,7 @@ export async function submitMessage({ | |||
body: JSON.stringify({ | |||
title: contentFragment.title, | |||
content: contentFragment.content, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is content still pushed here?
* Text files handling. | ||
*/ | ||
|
||
export function isSupportedTextMimeType(file: formidable.File): boolean { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid these type of functions which are just a wrapper on another function but also only used once makes understanding the code harder not simpler IMHO
} | ||
|
||
switch (req.method) { | ||
case "POST": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This endpoint needs to be aggressively rate-limited
@@ -141,13 +145,65 @@ export function useFileUploaderService() { | |||
const results = await Promise.all(previewPromises); | |||
|
|||
setFileBlobs([...fileBlobs, ...results]); | |||
|
|||
for (const fileBlob of results) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be a Promise.all here or at least have some form of parallelization as we def don't want to wait for a sequential upload?
}); | ||
|
||
if (!uploadResult.ok) { | ||
throw new FileBlobUploadError("failed_to_upload_file", fileBlob.file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No try catch of our own errors
}); | ||
} | ||
|
||
setIsProcessingFiles(false); | ||
} catch (err) { | ||
if (err instanceof FileBlobUploadError) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use a Result
await resizeAndUploadToFileStorage(owner, fileTokenPayload, file); | ||
} else if (isSupportedTextMimeType(file)) { | ||
// TODO:(2026-06-28 flav) Move logic to extract text from PDF here. | ||
await uploadToFileStorage(owner, fileTokenPayload, file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're building a new file API we should make sure to support something forward looking. This downloads the entire file in memory before uploading right?
If not: since we're cutting a new API we want to make sure it supports proper streaming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It creates interesting questions on how we want to handle that: we can't stream and transform at the same time. Which hints that we likely want a file status API and an asyncrhonous processing mechanism run server side to resize images or extract PDFs.
|
||
try { | ||
const form = new IncomingForm(); | ||
const [, files] = await form.parse(req); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this materialize the file in memory? We'd want to stream to GCS instead here no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here the file is not in memory but written on the disk. It's then exposed as a stream that we can handle. We could avoid saving anything on the disk by simply streaming straight away to GCS but this means that we can't do any pre-processing before saving.
Description
Summary
While working on vision, we identified a race condition in the process of uploading content fragment files. Currently, we create the conversation/post the message with the content fragment before uploading the associated file, which causes issues. As we plan to use URLs for vision and need to resize images (increasing upload time), the existing logic needs an refactoring.
Key Changes
This PR revamps the current logic, separating files from content fragments. While files will still be used within content fragments, they can now exist independently. The goal is to achieve this separation without storing every file in a database. The main changes include:
/files
: This endpoint must be called first, allowing the client to obtain a file ID. The endpoint returns the file ID and a signed (using JWT) upload URL valid for 30 seconds./files/:fileId
: This endpoint supports both file upload and retrieval. For uploads, the client must provide the token to verify that the file ID was generated by our system. While it currently doesn't process textual files, it does resize images. Once the file is uploaded, the endpoint returns a download URL, which can be used in theurl
field of a content fragment.The
useFileUploaderService
has been adjusted to accommodate these changes. Previously, no special handling was done until files were submitted. With the new approach, files are uploaded to our cloud storage as soon as they are uploaded to the system, unblocking the send button only after the upload and resizing are complete. This avoids impacting the chat experience.Impact on Production
Trade-offs and Future Considerations
This solution has some trade-offs, such as the inability to recompute file IDs and the loss of the current GCS structure of conversation/message/content. There is also the potential for "ghost" files to remain in the system if they are added to the input bar but never sent. However, these drawbacks are acceptable given the increased flexibility needed for future file handling enhancements.
Note on Public API
These changes do not yet apply to the public API but will be rolled out once vision becomes generally available.
Image Resizing
Based on experiments with vision, we plan to resize the largest side of images down to 768px.
Follow up work
Demos
Risk
Worst case, file upload is broken.
Deploy Plan