feat: secure file upload with Supabase storage #1977

Merged
yujonglee merged 5 commits into main from devin/1764335713-secure-file-upload on Nov 29, 2025

Conversation

@yujonglee yujonglee commented Nov 28, 2025

feat: secure file upload with Supabase storage

Summary

This PR addresses security concerns with the audio file upload flow by making the audio-files Supabase storage bucket private and using signed URLs instead of public URLs.

Key changes:

  • Migration: Makes audio-files bucket private and adds owner-only SELECT policy
  • Upload flow: Returns fileId (storage path) instead of public URL
  • Restate pipeline: Creates short-lived signed URLs (1 hour) for Deepgram, then deletes files after processing
  • New helper: apps/restate/src/supabase.ts with createSignedUrl() and deleteFile() functions using Supabase REST API
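The two helper calls above can be sketched as plain request builders. This is a minimal sketch, not the contents of apps/restate/src/supabase.ts: the names buildSignRequest, buildDeleteRequest, SupabaseConfig, and StorageRequest are hypothetical, and the endpoint paths and the expiresIn body field are assumptions to confirm against the Supabase Storage docs (the checklist in this PR asks for the same verification).

```typescript
// Hypothetical shapes for illustration; the real helper module may differ.
type SupabaseConfig = { url: string; serviceRoleKey: string };
type StorageRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body?: string };
};

const AUDIO_BUCKET = "audio-files";

// Encode each path segment separately so the "/" separators are preserved.
function encodeStoragePath(fileId: string): string {
  return fileId
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
}

// Strip trailing slashes so the joined URL never contains "//storage".
function storageBase(config: SupabaseConfig): string {
  return `${config.url.replace(/\/+$/, "")}/storage/v1`;
}

// Assumed endpoint: POST /storage/v1/object/sign/{bucket}/{path} with { expiresIn } in seconds.
function buildSignRequest(
  config: SupabaseConfig,
  fileId: string,
  expiresInSeconds = 3600, // 1 hour, matching the PR description
): StorageRequest {
  return {
    url: `${storageBase(config)}/object/sign/${AUDIO_BUCKET}/${encodeStoragePath(fileId)}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${config.serviceRoleKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ expiresIn: expiresInSeconds }),
    },
  };
}

// Assumed endpoint: DELETE /storage/v1/object/{bucket}/{path}.
function buildDeleteRequest(config: SupabaseConfig, fileId: string): StorageRequest {
  return {
    url: `${storageBase(config)}/object/${AUDIO_BUCKET}/${encodeStoragePath(fileId)}`,
    init: {
      method: "DELETE",
      headers: { Authorization: `Bearer ${config.serviceRoleKey}` },
    },
  };
}
```

A caller would pass the result to fetch, for example `const { url, init } = buildSignRequest(config, fileId); const response = await fetch(url, init);`. Keeping the builders free of network calls makes the URL composition easy to unit test.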

Security model:

  • Bucket is private (no public access)
  • Only authenticated users can upload to their own folder
  • Only owners can view/delete their files
  • Backend service uses service role key to create signed URLs for Deepgram
  • Files are automatically cleaned up after transcription completes

Updates since last revision

  • fileId validation: Added server-side validation in startAudioPipeline to ensure the fileId belongs to the authenticated user. Validates that the first path segment matches the user's ID and rejects path traversal attempts (.. segments).
  • llmResult type fix: Changed llmResult from z.unknown() to z.string() in both transcription.ts and restate.ts.
  • AudioPipeline type fix: Added context argument to local AudioPipeline type to match Restate SDK's expected signature for proper type inference. This fixes the SendOpts<unknown> TypeScript error.
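The fileId validation described in the first item could look roughly like the following. This is a sketch under stated assumptions, not the code in transcription.ts: the function name isFileIdOwnedBy is hypothetical, and rejecting empty and "." segments goes slightly beyond what the description says (first segment equals the user ID, ".." segments rejected).

```typescript
// Returns true only for paths shaped like "<userId>/<rest>" with no traversal segments.
// Hypothetical helper; the PR's actual check lives inline in startAudioPipeline.
function isFileIdOwnedBy(fileId: string, userId: string): boolean {
  const segments = fileId.split("/");

  // A bare "<userId>" with no object name is not a valid storage path.
  if (segments.length < 2) return false;

  // Reject traversal ("..") plus empty and "." segments (the latter two are an extra assumption).
  if (segments.some((s) => s === "" || s === "." || s === "..")) return false;

  // The first path segment must be the authenticated user's ID.
  return segments[0] === userId;
}
```

For example, `isFileIdOwnedBy("user-1/a.wav", "user-1")` is accepted, while `"user-2/a.wav"` and `"user-1/../user-2/a.wav"` are both rejected for `user-1`.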

Review & Testing Checklist for Human

  • Verify fileId validation logic: Check that the path segment validation in startAudioPipeline (lines 55-69 in transcription.ts) correctly prevents access to other users' files
  • Verify Supabase REST API format: The signed URL endpoint (/storage/v1/object/sign/audio-files/{fileId}) and response field (signedURL) should be validated against Supabase docs
  • Configure environment variables: SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY must be added to the Restate Cloudflare Worker deployment
  • Verify llmResult is actually a string: Confirm the LLM response is a string, not an object
  • Test full flow end-to-end: Upload audio file → verify signed URL works with Deepgram → verify file is deleted after processing

Recommended test plan:

  1. Deploy migration to staging Supabase instance
  2. Configure Restate worker with Supabase credentials
  3. Upload an audio file through the file-transcription page
  4. Verify transcription completes successfully
  5. Confirm the file is deleted from storage after processing
  6. Test with a malicious fileId (e.g., other-user-id/file.wav) to verify it's rejected

Notes

  • The transcribeAudio function in transcription.ts still accepts audioUrl - this is a separate code path that may need similar treatment if used
  • Supabase tests pass locally (supabase db test)
  • All CI checks now pass (TypeScript errors fixed)
  • Added explicit type assertion in file-transcription.tsx for the getAudioPipelineStatus response to work around client-side type inference limitations

Link to Devin run: https://app.devin.ai/sessions/711a40f05bfb4228bd0c3eab610c001e
Requested by: yujonglee (@yujonglee)

- Make audio-files bucket private (no public access)
- Return fileId instead of public URL from upload
- Pass fileId through pipeline instead of audioUrl
- Create signed URLs in Restate service for Deepgram
- Delete files from bucket after processing
- Add owner-only SELECT policy for authenticated users
- Update tests for new private bucket behavior

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR that start with 'DevinAI' or '@devin'.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

netlify bot commented Nov 28, 2025

Deploy Preview for hyprnote-storybook ready!

Name Link
🔨 Latest commit 6c1bf1f
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote-storybook/deploys/692a442a21db460008196273
😎 Deploy Preview https://deploy-preview-1977--hyprnote-storybook.netlify.app

netlify bot commented Nov 28, 2025

Deploy Preview for hyprnote ready!

Name Link
🔨 Latest commit 6c1bf1f
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote/deploys/692a442ad278430008990c01
😎 Deploy Preview https://deploy-preview-1977--hyprnote.netlify.app

coderabbitai bot commented Nov 28, 2025

Walkthrough

Switches the audio pipeline to accept a Supabase fileId (signed URL created at runtime), moves secret access to env, adds Supabase helpers to sign and delete files, returns fileId from upload, and makes the audio-files bucket private with owner-scoped policies.

Changes

Cohort / File(s) | Summary
Audio pipeline & Supabase helpers (apps/restate/src/audioPipeline.ts, apps/restate/src/supabase.ts) | Pipeline input changed from audioUrl to fileId; secrets read from env; pipeline calls createSignedUrl(env, fileId) for a temporary audioUrl, uses it for Deepgram/LLM calls, and runs deleteFile(env, fileId) in a finally block; new createSignedUrl and deleteFile helpers with config validation added.
Web functions (upload & transcription) (apps/web/src/functions/upload.ts, apps/web/src/functions/transcription.ts) | Upload now returns { success: true, fileId }; transcription handler, AudioPipeline.run, and validators updated to accept { fileId } and forward fileId instead of audioUrl.
Web UI & client utils (apps/web/src/routes/_view/app/file-transcription.tsx, apps/web/src/utils/restate.ts) | UI and client util startAudioPipeline updated to expect/forward fileId from upload and to send fileId to the Restate API; llmResult typed as string.
Storage migration & tests (supabase/migrations/20251128131538_make_audio_files_private.sql, supabase/tests/004-storage-audio-files-policies.sql) | Migration makes audio-files bucket private, drops public read policy, and adds owner-scoped SELECT policy; tests updated assertions/messages to reflect private-bucket behavior.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Web Client
    participant UploadFn as Upload Handler
    participant SupabaseStorage as Supabase Storage
    participant RestateAPI as Restate API
    participant Pipeline as Audio Pipeline
    participant SupabaseSign as Supabase (Sign URL)
    participant Deepgram as Deepgram API
    participant LLM as LLM API
    participant SupabaseDelete as Supabase (Delete)

    Client->>UploadFn: uploadAudioFile()
    UploadFn->>SupabaseStorage: store file -> returns fileId
    UploadFn-->>Client: { success: true, fileId }

    Client->>RestateAPI: startAudioPipeline({ fileId })
    RestateAPI->>Pipeline: run(ctx, { fileId })

    Pipeline->>SupabaseSign: createSignedUrl(env, fileId)
    SupabaseSign-->>Pipeline: signed audioUrl

    Pipeline->>Deepgram: transcribe(audioUrl)
    Deepgram-->>Pipeline: transcript

    Pipeline->>LLM: process(transcript)
    LLM-->>Pipeline: result

    Pipeline->>SupabaseDelete: deleteFile(env, fileId) (finally)
    SupabaseDelete-->>Pipeline: deletion result

    Pipeline-->>RestateAPI: complete/status update

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Focus areas:
    • Supabase helpers: URL composition, headers, error handling.
    • Pipeline try/finally: ensure deletion always runs and that cleanup errors do not block or mask the pipeline result.
    • Consistency across client/server/UI for the fileId → audioUrl flow.
    • Migration and updated storage test expectations.
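The try/finally focus area can be illustrated with a small dependency-injected sketch. It assumes nothing about the real audioPipeline.ts beyond what the walkthrough states: the names runWithCleanup and PipelineDeps are hypothetical, and the Restate context, durable steps, and the LLM call are left out on purpose.

```typescript
// Hypothetical dependency bundle so the flow can be exercised without Supabase or Deepgram.
type PipelineDeps = {
  createSignedUrl: (fileId: string) => Promise<string>;
  transcribe: (audioUrl: string) => Promise<string>;
  deleteFile: (fileId: string) => Promise<void>;
  onCleanupError?: (err: unknown) => void;
};

async function runWithCleanup(deps: PipelineDeps, fileId: string): Promise<string> {
  try {
    // Sign first, then hand only the short-lived URL to the transcription step.
    const audioUrl = await deps.createSignedUrl(fileId);
    return await deps.transcribe(audioUrl);
  } finally {
    try {
      // Runs on success and on failure of the steps above.
      await deps.deleteFile(fileId);
    } catch (err) {
      // Swallow cleanup failures so they never replace the primary result or error.
      deps.onCleanupError?.(err);
    }
  }
}
```

The inner try/catch is the design choice worth reviewing: without it, a failed delete thrown from the finally block would override a successful transcript or hide the original error.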

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title 'feat: secure file upload with Supabase storage' is clear, concise, and directly summarizes the main security-focused change: making the audio-files bucket private and implementing signed URLs instead of public URLs.
Description check | ✅ Passed | The PR description comprehensively describes the changeset, including security improvements to the audio file upload flow, specific changes to multiple files, new helper functions, and detailed testing/validation steps.

…response

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (6)
supabase/migrations/20251128131538_make_audio_files_private.sql (1)

1-12: Bucket privatization and owner-only SELECT policy look correct

Making audio-files non-public and restricting storage.objects SELECT to authenticated users whose UID matches the first folder segment in name aligns with your upload convention (${userId}/...). Just be aware this assumes all existing objects in audio-files follow that convention; any legacy objects with different paths will no longer be visible without service-role access. If you want extra safety, you could also AND owner = auth.uid() in the policy, but it’s not strictly required given your current path scheme.

apps/web/src/functions/upload.ts (1)

36-36: Return value switch to fileId is consistent; consider basic upload validation

Returning { success: true, fileId: uploadData.path } matches the new fileId-based pipeline and looks good. As a follow-up hardening step for “secure upload”, you might want to enforce simple constraints before Buffer.from/upload (e.g. max byte size and a small MIME-type whitelist for audio) to avoid very large or unexpected uploads being accepted server-side.
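A pre-upload check along the lines suggested here might look like this. It is a sketch only: the 50 MiB limit, the MIME allowlist, and the names checkUpload and UploadCheck are all assumptions for illustration, not values taken from the PR.

```typescript
// Assumed limits; choose values that match the product's real requirements.
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024;
const ALLOWED_AUDIO_TYPES = new Set([
  "audio/mpeg",
  "audio/wav",
  "audio/x-wav",
  "audio/mp4",
  "audio/ogg",
  "audio/webm",
]);

type UploadCheck = { ok: true } | { ok: false; reason: string };

// Validate size and declared MIME type before the bytes are buffered and uploaded.
function checkUpload(byteLength: number, mimeType: string): UploadCheck {
  if (!Number.isInteger(byteLength) || byteLength <= 0) {
    return { ok: false, reason: "Empty or invalid file" };
  }
  if (byteLength > MAX_UPLOAD_BYTES) {
    return { ok: false, reason: "File exceeds the maximum upload size" };
  }
  // Drop parameters such as "; codecs=..." and normalize case before the lookup.
  const baseType = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  if (!ALLOWED_AUDIO_TYPES.has(baseType)) {
    return { ok: false, reason: "Unsupported audio type" };
  }
  return { ok: true };
}
```

The declared MIME type is client-controlled, so this check only filters out obvious mistakes; it does not prove the payload is audio.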

apps/restate/src/supabase.ts (1)

1-81: Supabase admin helpers look solid; consider a small robustness tweak

The env validation + URL normalization in getSupabaseConfig, and the use of /storage/v1/object/sign/audio-files/{fileId} and /storage/v1/object/audio-files/{fileId} with the service-role key, all look correct for generating signed URLs and deleting objects from a private bucket.

To make createSignedUrl more future-proof across Supabase variants, you might consider accepting both signedURL (relative) and signedUrl (absolute) from the response, e.g.:

const data = (await response.json()) as { signedURL?: string; signedUrl?: string };
const raw = data.signedUrl ?? data.signedURL;
if (!raw) throw new Error("Signed URL not returned from Supabase");
if (raw.startsWith("http")) return raw;
return `${url}/storage/v1${raw}`;

This keeps your worker logic resilient even if the storage API response shape changes slightly between hosted vs. self-hosted environments.

apps/web/src/routes/_view/app/file-transcription.tsx (1)

108-116: UI now correctly propagates fileId into the pipeline

The guard on "fileId" in uploadResult and passing fileId: uploadResult.fileId into startAudioPipeline are aligned with the new server contracts. Optionally, you could narrow on a success flag instead of using "fileId" in ... to get slightly stronger TypeScript discrimination, but the current pattern is functionally fine.
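The optional narrowing mentioned above can be sketched with a discriminated union. The success branch matches the `{ success: true, fileId }` shape returned by the upload function; the failure branch (`{ success: false; message }`) and the name fileIdOrThrow are assumptions, since the actual error shape is not shown in this PR.

```typescript
// Success shape from the PR; the failure shape is an assumed placeholder.
type UploadResult =
  | { success: true; fileId: string }
  | { success: false; message: string };

// Narrow on the literal `success` flag instead of probing with `"fileId" in result`.
function fileIdOrThrow(result: UploadResult): string {
  if (!result.success) {
    // In this branch TypeScript knows `message` exists and `fileId` does not.
    throw new Error(result.message);
  }
  return result.fileId;
}
```

With this pattern, adding a third variant to the union produces a compile-time error wherever the flag is not handled, which the `in` check would not catch.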

apps/web/src/utils/restate.ts (1)

31-41: Restate client payload switch to fileId matches the workflow API

Updating startAudioPipeline to require fileId and sending { userId, fileId } in the body is consistent with the StartAudioPipeline schema in the worker. Just keep in mind this helper (and RESTATE_INGRESS_URL) are shipped to the browser, so if you ever want the Restate ingress to be private, you’d route this call through a server function instead.

apps/restate/src/audioPipeline.ts (1)

6-12: Signed-URL workflow and cleanup are well-structured; optional defense-in-depth

Switching StartAudioPipeline to fileId, generating a Supabase signed URL inside the workflow, and then deleting the file in a finally block is a solid pattern: Deepgram sees only a time-limited URL, and storage is cleaned up even on most failures. The explicit env checks with restate.TerminalError also make misconfiguration fail fast.

Once you enforce the fileId–user binding in the web startAudioPipeline handler, this worker code is in good shape. If you want extra defense-in-depth on the server side, you could additionally assert that req.fileId follows your expected ${userId}/... convention here (and fail early) so that even direct calls to the workflow can’t be pointed at another user’s folder.

Also applies to: 36-52, 129-152, 161-186, 191-193, 202-203, 207-207, 231-241

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6bdc793 and 48faf32.

📒 Files selected for processing (8)
  • apps/restate/src/audioPipeline.ts (5 hunks)
  • apps/restate/src/supabase.ts (1 hunks)
  • apps/web/src/functions/transcription.ts (3 hunks)
  • apps/web/src/functions/upload.ts (1 hunks)
  • apps/web/src/routes/_view/app/file-transcription.tsx (1 hunks)
  • apps/web/src/utils/restate.ts (1 hunks)
  • supabase/migrations/20251128131538_make_audio_files_private.sql (1 hunks)
  • supabase/tests/004-storage-audio-files-policies.sql (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.ts: Agent implementations should use TypeScript and follow the established architectural patterns defined in the agent framework
Agent communication should use defined message protocols and interfaces

Files:

  • apps/web/src/utils/restate.ts
  • apps/web/src/functions/upload.ts
  • apps/restate/src/audioPipeline.ts
  • apps/web/src/functions/transcription.ts
  • apps/restate/src/supabase.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Avoid creating a bunch of types/interfaces if they are not shared. Especially for function props, just inline them instead.
Never do manual state management for form/mutation. Use useForm (from tanstack-form) and useQuery/useMutation (from tanstack-query) instead for 99% of cases. Avoid patterns like setError.
If there are many classNames with conditional logic, use cn (import from @hypr/utils). It is similar to clsx. Always pass an array and split by logical grouping.
Use motion/react instead of framer-motion.

Files:

  • apps/web/src/utils/restate.ts
  • apps/web/src/routes/_view/app/file-transcription.tsx
  • apps/web/src/functions/upload.ts
  • apps/restate/src/audioPipeline.ts
  • apps/web/src/functions/transcription.ts
  • apps/restate/src/supabase.ts
🧬 Code graph analysis (3)
apps/web/src/routes/_view/app/file-transcription.tsx (2)
apps/web/src/functions/transcription.ts (1)
  • startAudioPipeline (38-70)
apps/web/src/utils/restate.ts (1)
  • startAudioPipeline (31-45)
apps/restate/src/audioPipeline.ts (1)
apps/restate/src/supabase.ts (2)
  • createSignedUrl (23-58)
  • deleteFile (60-81)
apps/web/src/functions/transcription.ts (1)
apps/restate/src/audioPipeline.ts (1)
  • StatusStateType (34-34)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Redirect rules - hyprnote
  • GitHub Check: Header rules - hyprnote
  • GitHub Check: Pages changed - hyprnote
  • GitHub Check: ci (macos, macos-14)
  • GitHub Check: fmt
  • GitHub Check: Database tests
🔇 Additional comments (1)
supabase/tests/004-storage-audio-files-policies.sql (1)

15-37: Tests correctly reflect new private-bucket RLS semantics

The expectations that the owner sees 1 row and another authenticated user sees 0 rows for bucket_id = 'audio-files' are consistent with the new owner-only SELECT policy and will guard against regressions.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
apps/web/src/functions/transcription.ts (1)

55-60: Validate that fileId belongs to the authenticated user before starting the pipeline

startAudioPipeline currently trusts data.fileId from the client and passes it into a workflow that uses a Supabase service-role key to access storage. A malicious caller who can guess another user’s object key could have the pipeline process someone else’s audio under their own userId.

Before submitting the workflow, bind/validate the file against the authenticated user, e.g. assuming keys are ${userId}/${...}:

   .handler(async ({ data }) => {
     const supabase = getSupabaseServerClient();
     const { data: userData } = await supabase.auth.getUser();

     if (!userData.user) {
       return { error: true, message: "Unauthorized" };
     }

+    // Ensure the fileId belongs to the current user before using it in a service‑role flow
+    const ownerFromFileId = data.fileId.split("/", 1)[0];
+    if (ownerFromFileId !== userData.user.id) {
+      return { error: true, message: "Invalid file ID for current user" };
+    }
+
     const pipelineId = data.pipelineId ?? crypto.randomUUID();

     try {
       const restateClient = getRestateClient();
       const handle = await restateClient
         .workflowClient<AudioPipeline>({ name: "AudioPipeline" }, pipelineId)
         .workflowSubmit({ userId: userData.user.id, fileId: data.fileId });

This keeps the workflow from ever targeting an object outside the caller’s own folder while still allowing the worker to use a service-role key.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 48faf32 and 48a98f4.

📒 Files selected for processing (2)
  • apps/web/src/functions/transcription.ts (3 hunks)
  • apps/web/src/routes/_view/app/file-transcription.tsx (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/web/src/routes/_view/app/file-transcription.tsx
🧰 Additional context used
📓 Path-based instructions (2)
**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.ts: Agent implementations should use TypeScript and follow the established architectural patterns defined in the agent framework
Agent communication should use defined message protocols and interfaces

Files:

  • apps/web/src/functions/transcription.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx}: Avoid creating a bunch of types/interfaces if they are not shared. Especially for function props, just inline them instead.
Never do manual state management for form/mutation. Use useForm (from tanstack-form) and useQuery/useMutation (from tanstack-query) instead for 99% of cases. Avoid patterns like setError.
If there are many classNames with conditional logic, use cn (import from @hypr/utils). It is similar to clsx. Always pass an array and split by logical grouping.
Use motion/react instead of framer-motion.

Files:

  • apps/web/src/functions/transcription.ts
🧬 Code graph analysis (1)
apps/web/src/functions/transcription.ts (2)
apps/restate/src/audioPipeline.ts (2)
  • StatusStateType (34-34)
  • AudioPipeline (287-287)
apps/web/src/utils/restate.ts (1)
  • StatusState (4-15)
🪛 GitHub Actions: .github/workflows/web_ci.yaml
apps/web/src/functions/transcription.ts

[error] 23-23: src/functions/transcription.ts(23,16): error TS2554: Expected 2-3 arguments, but got 1.

🔇 Additional comments (2)
apps/web/src/functions/transcription.ts (2)

29-32: AudioPipeline run signature change looks consistent

The AudioPipeline type update to { userId: string; fileId: string } matches the new storage model and the payload you submit in workflowSubmit. No issues here.


38-45: Input schema change to fileId is aligned, but double‑check callers

Switching the input validator from audioUrl to fileId is consistent with the new private-bucket approach. Just ensure all callers of startAudioPipeline have been updated to pass fileId (and no remaining code relies on the old audioUrl field for this path).

devin-ai-integration bot and others added 3 commits November 28, 2025 13:46
Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>
…DK type inference

Co-Authored-By: yujonglee <yujonglee.dev@gmail.com>

argos-ci bot commented Nov 29, 2025

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build | Status | Details | Updated (UTC)
web (Inspect) | ⚠️ Changes detected (Review) | 2 changed | Nov 29, 2025, 12:57 AM

@yujonglee yujonglee merged commit 3fd5d3b into main Nov 29, 2025
12 of 13 checks passed
@yujonglee yujonglee deleted the devin/1764335713-secure-file-upload branch November 29, 2025 01:01