feat: secure file upload with Supabase storage#1977
Conversation
- Make audio-files bucket private (no public access) - Return fileId instead of public URL from upload - Pass fileId through pipeline instead of audioUrl - Create signed URLs in Restate service for Deepgram - Delete files from bucket after processing - Add owner-only SELECT policy for authenticated users - Update tests for new private bucket behavior Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
✅ Deploy Preview for hyprnote-storybook ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
✅ Deploy Preview for hyprnote ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
📝 WalkthroughWalkthroughSwitches the audio pipeline to accept a Supabase Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Web Client
participant UploadFn as Upload Handler
participant SupabaseStorage as Supabase Storage
participant RestateAPI as Restate API
participant Pipeline as Audio Pipeline
participant SupabaseSign as Supabase (Sign URL)
participant Deepgram as Deepgram API
participant LLM as LLM API
participant SupabaseDelete as Supabase (Delete)
Client->>UploadFn: uploadAudioFile()
UploadFn->>SupabaseStorage: store file -> returns fileId
UploadFn-->>Client: { success: true, fileId }
Client->>RestateAPI: startAudioPipeline({ fileId })
RestateAPI->>Pipeline: run(ctx, { fileId })
Pipeline->>SupabaseSign: createSignedUrl(env, fileId)
SupabaseSign-->>Pipeline: signed audioUrl
Pipeline->>Deepgram: transcribe(audioUrl)
Deepgram-->>Pipeline: transcript
Pipeline->>LLM: process(transcript)
LLM-->>Pipeline: result
Pipeline->>SupabaseDelete: deleteFile(env, fileId) (finally)
SupabaseDelete-->>Pipeline: deletion result
Pipeline-->>RestateAPI: complete/status update
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
…response Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
supabase/migrations/20251128131538_make_audio_files_private.sql (1)
1-12: Bucket privatization and owner-only SELECT policy look correctMaking
audio-filesnon-public and restrictingstorage.objectsSELECT toauthenticatedusers whose UID matches the first folder segment innamealigns with your upload convention (${userId}/...). Just be aware this assumes all existing objects inaudio-filesfollow that convention; any legacy objects with different paths will no longer be visible without service-role access. If you want extra safety, you could also ANDowner = auth.uid()in the policy, but it’s not strictly required given your current path scheme.apps/web/src/functions/upload.ts (1)
36-36: Return value switch tofileIdis consistent; consider basic upload validationReturning
{ success: true, fileId: uploadData.path }matches the new fileId-based pipeline and looks good. As a follow-up hardening step for “secure upload”, you might want to enforce simple constraints beforeBuffer.from/upload(e.g. max byte size and a small MIME-type whitelist for audio) to avoid very large or unexpected uploads being accepted server-side.apps/restate/src/supabase.ts (1)
1-81: Supabase admin helpers look solid; consider a small robustness tweakThe env validation + URL normalization in
getSupabaseConfig, and the use of/storage/v1/object/sign/audio-files/{fileId}and/storage/v1/object/audio-files/{fileId}with the service-role key, all look correct for generating signed URLs and deleting objects from a private bucket.To make
createSignedUrlmore future-proof across Supabase variants, you might consider accepting bothsignedURL(relative) andsignedUrl(absolute) from the response, e.g.:const data = (await response.json()) as { signedURL?: string; signedUrl?: string }; const raw = data.signedUrl ?? data.signedURL; if (!raw) throw new Error("Signed URL not returned from Supabase"); if (raw.startsWith("http")) return raw; return `${url}/storage/v1${raw}`;This keeps your worker logic resilient even if the storage API response shape changes slightly between hosted vs. self-hosted environments.
apps/web/src/routes/_view/app/file-transcription.tsx (1)
108-116: UI now correctly propagatesfileIdinto the pipelineThe guard on
"fileId" in uploadResultand passingfileId: uploadResult.fileIdintostartAudioPipelineare aligned with the new server contracts. Optionally, you could narrow on asuccessflag instead of using"fileId" in ...to get slightly stronger TypeScript discrimination, but the current pattern is functionally fine.apps/web/src/utils/restate.ts (1)
31-41: Restate client payload switch tofileIdmatches the workflow APIUpdating
startAudioPipelineto requirefileIdand sending{ userId, fileId }in the body is consistent with theStartAudioPipelineschema in the worker. Just keep in mind this helper (andRESTATE_INGRESS_URL) are shipped to the browser, so if you ever want the Restate ingress to be private, you’d route this call through a server function instead.apps/restate/src/audioPipeline.ts (1)
6-12: Signed-URL workflow and cleanup are well-structured; optional defense-in-depthSwitching
StartAudioPipelinetofileId, generating a Supabase signed URL inside the workflow, and then deleting the file in afinallyblock is a solid pattern: Deepgram sees only a time-limited URL, and storage is cleaned up even on most failures. The explicit env checks withrestate.TerminalErroralso make misconfiguration fail fast.Once you enforce the
fileId–user binding in the webstartAudioPipelinehandler, this worker code is in good shape. If you want extra defense-in-depth on the server side, you could additionally assert thatreq.fileIdfollows your expected${userId}/...convention here (and fail early) so that even direct calls to the workflow can’t be pointed at another user’s folder.Also applies to: 36-52, 129-152, 161-186, 191-193, 202-203, 207-207, 231-241
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
apps/restate/src/audioPipeline.ts(5 hunks)apps/restate/src/supabase.ts(1 hunks)apps/web/src/functions/transcription.ts(3 hunks)apps/web/src/functions/upload.ts(1 hunks)apps/web/src/routes/_view/app/file-transcription.tsx(1 hunks)apps/web/src/utils/restate.ts(1 hunks)supabase/migrations/20251128131538_make_audio_files_private.sql(1 hunks)supabase/tests/004-storage-audio-files-policies.sql(2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.ts
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.ts: Agent implementations should use TypeScript and follow the established architectural patterns defined in the agent framework
Agent communication should use defined message protocols and interfaces
Files:
apps/web/src/utils/restate.tsapps/web/src/functions/upload.tsapps/restate/src/audioPipeline.tsapps/web/src/functions/transcription.tsapps/restate/src/supabase.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Avoid creating a bunch of types/interfaces if they are not shared. Especially for function props, just inline them instead.
Never do manual state management for form/mutation. Use useForm (from tanstack-form) and useQuery/useMutation (from tanstack-query) instead for 99% of cases. Avoid patterns like setError.
If there are many classNames with conditional logic, usecn(import from@hypr/utils). It is similar toclsx. Always pass an array and split by logical grouping.
Usemotion/reactinstead offramer-motion.
Files:
apps/web/src/utils/restate.tsapps/web/src/routes/_view/app/file-transcription.tsxapps/web/src/functions/upload.tsapps/restate/src/audioPipeline.tsapps/web/src/functions/transcription.tsapps/restate/src/supabase.ts
🧬 Code graph analysis (3)
apps/web/src/routes/_view/app/file-transcription.tsx (2)
apps/web/src/functions/transcription.ts (1)
startAudioPipeline(38-70)apps/web/src/utils/restate.ts (1)
startAudioPipeline(31-45)
apps/restate/src/audioPipeline.ts (1)
apps/restate/src/supabase.ts (2)
createSignedUrl(23-58)deleteFile(60-81)
apps/web/src/functions/transcription.ts (1)
apps/restate/src/audioPipeline.ts (1)
StatusStateType(34-34)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Redirect rules - hyprnote
- GitHub Check: Header rules - hyprnote
- GitHub Check: Pages changed - hyprnote
- GitHub Check: ci (macos, macos-14)
- GitHub Check: fmt
- GitHub Check: Database tests
🔇 Additional comments (1)
supabase/tests/004-storage-audio-files-policies.sql (1)
15-37: Tests correctly reflect new private-bucket RLS semanticsThe expectations that the owner sees 1 row and another authenticated user sees 0 rows for
bucket_id = 'audio-files'are consistent with the new owner-only SELECT policy and will guard against regressions.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
apps/web/src/functions/transcription.ts (1)
55-60: Validate thatfileIdbelongs to the authenticated user before starting the pipeline
startAudioPipelinecurrently trustsdata.fileIdfrom the client and passes it into a workflow that uses a Supabase service-role key to access storage. A malicious caller who can guess another user’s object key could have the pipeline process someone else’s audio under their ownuserId.Before submitting the workflow, bind/validate the file against the authenticated user, e.g. assuming keys are
${userId}/${...}:.handler(async ({ data }) => { const supabase = getSupabaseServerClient(); const { data: userData } = await supabase.auth.getUser(); if (!userData.user) { return { error: true, message: "Unauthorized" }; } + // Ensure the fileId belongs to the current user before using it in a service‑role flow + const ownerFromFileId = data.fileId.split("/", 1)[0]; + if (ownerFromFileId !== userData.user.id) { + return { error: true, message: "Invalid file ID for current user" }; + } + const pipelineId = data.pipelineId ?? crypto.randomUUID(); try { const restateClient = getRestateClient(); const handle = await restateClient .workflowClient<AudioPipeline>({ name: "AudioPipeline" }, pipelineId) - .workflowSubmit({ userId: userData.user.id, fileId: data.fileId }); + .workflowSubmit({ userId: userData.user.id, fileId: data.fileId });This keeps the workflow from ever targeting an object outside the caller’s own folder while still allowing the worker to use a service-role key.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
apps/web/src/functions/transcription.ts(3 hunks)apps/web/src/routes/_view/app/file-transcription.tsx(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/web/src/routes/_view/app/file-transcription.tsx
🧰 Additional context used
📓 Path-based instructions (2)
**/*.ts
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.ts: Agent implementations should use TypeScript and follow the established architectural patterns defined in the agent framework
Agent communication should use defined message protocols and interfaces
Files:
apps/web/src/functions/transcription.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.{ts,tsx}: Avoid creating a bunch of types/interfaces if they are not shared. Especially for function props, just inline them instead.
Never do manual state management for form/mutation. Use useForm (from tanstack-form) and useQuery/useMutation (from tanstack-query) instead for 99% of cases. Avoid patterns like setError.
If there are many classNames with conditional logic, usecn(import from@hypr/utils). It is similar toclsx. Always pass an array and split by logical grouping.
Usemotion/reactinstead offramer-motion.
Files:
apps/web/src/functions/transcription.ts
🧬 Code graph analysis (1)
apps/web/src/functions/transcription.ts (2)
apps/restate/src/audioPipeline.ts (2)
StatusStateType(34-34)AudioPipeline(287-287)apps/web/src/utils/restate.ts (1)
StatusState(4-15)
🪛 GitHub Actions: .github/workflows/web_ci.yaml
apps/web/src/functions/transcription.ts
[error] 23-23: src/functions/transcription.ts(23,16): error TS2554: Expected 2-3 arguments, but got 1.
🔇 Additional comments (2)
apps/web/src/functions/transcription.ts (2)
29-32: AudioPipeline run signature change looks consistentThe
AudioPipelinetype update to{ userId: string; fileId: string }matches the new storage model and the payload you submit inworkflowSubmit. No issues here.
38-45: Input schema change tofileIdis aligned, but double‑check callersSwitching the input validator from
audioUrltofileIdis consistent with the new private-bucket approach. Just ensure all callers ofstartAudioPipelinehave been updated to passfileId(and no remaining code relies on the oldaudioUrlfield for this path).
Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
…DK type inference Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
|
The latest updates on your projects. Learn more about Argos notifications ↗︎
|
feat: secure file upload with Supabase storage
Summary
This PR addresses security concerns with the audio file upload flow by making the
audio-filesSupabase storage bucket private and using signed URLs instead of public URLs.Key changes:
audio-filesbucket private and adds owner-only SELECT policyfileId(storage path) instead of public URLapps/restate/src/supabase.tswithcreateSignedUrl()anddeleteFile()functions using Supabase REST APISecurity model:
Updates since last revision
startAudioPipelineto ensure thefileIdbelongs to the authenticated user. Validates that the first path segment matches the user's ID and rejects path traversal attempts (..segments).llmResultfromz.unknown()toz.string()in bothtranscription.tsandrestate.ts.AudioPipelinetype to match Restate SDK's expected signature for proper type inference. This fixes theSendOpts<unknown>TypeScript error.Review & Testing Checklist for Human
startAudioPipeline(lines 55-69 in transcription.ts) correctly prevents access to other users' files/storage/v1/object/sign/audio-files/{fileId}) and response field (signedURL) should be validated against Supabase docsSUPABASE_URLandSUPABASE_SERVICE_ROLE_KEYmust be added to the Restate Cloudflare Worker deploymentRecommended test plan:
other-user-id/file.wav) to verify it's rejectedNotes
transcribeAudiofunction intranscription.tsstill acceptsaudioUrl- this is a separate code path that may need similar treatment if usedsupabase db test)file-transcription.tsxfor thegetAudioPipelineStatusresponse to work around client-side type inference limitationsLink to Devin run: https://app.devin.ai/sessions/711a40f05bfb4228bd0c3eab610c001e
Requested by: yujonglee (@yujonglee)