Fix running e2e backend #2710
base: feature/e2e-backend
Conversation
Summary by CodeRabbit
Walkthrough

Adds E2E configuration and SITE_URL, inserts an IS_E2E_ENVIRONMENT branch in API settings customization, introduces a `dump_data` Django management command to produce masked SQL dumps, updates the Makefile and CI workflows to load SQL after backend start with timeouts, and adjusts env examples, lint rules, and wordlist entries.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning) · ✅ Passed checks (4 passed)
📜 Recent review details: Configuration used: .coderabbit.yaml · Review profile: CHILL · Plan: Pro · 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
🔇 Additional comments (2)
The PR must be linked to an issue assigned to the PR author.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/apps/common/utils.py (1)
71-76: Consider adding type hints for consistency.

Other functions in this file include type hints (e.g., `convert_to_camel_case(text: str) -> str`). Consider adding type hints to match the existing code style.

```diff
-def csrf_decorate(view):
+from typing import Callable
+from django.http import HttpRequest, HttpResponse
+
+
+def csrf_decorate(view: Callable[[HttpRequest], HttpResponse]) -> Callable[[HttpRequest], HttpResponse]:
     """Apply CSRF protection based on settings."""
     if settings.IS_E2E_ENVIRONMENT:
         return csrf_exempt(view)  # NOSONAR
     return csrf_protect(view)
```
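A standalone sketch of the typed-decorator idea, with stand-ins replacing Django's `csrf_exempt`/`csrf_protect` and the settings flag (so this snippet is illustrative rather than the project's actual code):

```python
from typing import Callable, TypeVar

ViewFunc = TypeVar("ViewFunc", bound=Callable[..., object])

IS_E2E_ENVIRONMENT = False  # stand-in for settings.IS_E2E_ENVIRONMENT


def csrf_exempt(view: ViewFunc) -> ViewFunc:
    """Stand-in for django.views.decorators.csrf.csrf_exempt."""
    return view


def csrf_protect(view: ViewFunc) -> ViewFunc:
    """Stand-in for django.views.decorators.csrf.csrf_protect."""
    return view


def csrf_decorate(view: ViewFunc) -> ViewFunc:
    """Apply CSRF protection based on settings."""
    if IS_E2E_ENVIRONMENT:
        return csrf_exempt(view)
    return csrf_protect(view)
```

Typing the decorator with a bound `TypeVar` preserves the wrapped view's callable type for callers, which is the point of the suggested annotation.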
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- backend/apps/common/utils.py (2 hunks)
- backend/settings/urls.py (2 hunks)
- docker-compose/e2e.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-31T13:48:09.830Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2155
File: frontend/graphql-codegen.ts:0-0
Timestamp: 2025-08-31T13:48:09.830Z
Learning: In the OWASP/Nest project, Django's GraphQL endpoint accepts CSRF tokens via 'x-csrftoken' header (lowercase) without requiring a Referer header, working fine in their configuration for GraphQL codegen introspection.
Applied to files:
backend/settings/urls.py
🧬 Code graph analysis (1)
backend/settings/urls.py (1)
backend/apps/common/utils.py (1)
`csrf_decorate` (71-75)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run backend tests
- GitHub Check: Run frontend unit tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (4)
docker-compose/e2e.yaml (1)
15-20: LGTM! DB environment variables properly configured.

Adding explicit database connection environment variables ensures the backend service can connect to the PostgreSQL database. The values correctly reference the `db` service name for the host and align with the database service configuration (lines 38-40). This explicit configuration takes precedence over the env_file and likely resolves the internal server error mentioned in the linked issue.

backend/settings/urls.py (2)
14-14: LGTM! Import correctly references the new utility.

The import is correctly placed and the `csrf_decorate` utility is appropriately used for both the Algolia search and GraphQL endpoints.
24-25: Implementation correctly applies conditional CSRF protection.

The routes now use `csrf_decorate`, which conditionally applies CSRF protection based on the environment. In production, these endpoints remain protected. Security implications of the E2E exemption are addressed in the review of `backend/apps/common/utils.py`.

Based on learnings, the GraphQL endpoint accepts CSRF tokens via the 'x-csrftoken' header, which continues to work with this implementation in non-E2E environments.
backend/apps/common/utils.py (1)
71-76: Verification confirms safeguards already in place—concern is substantially mitigated.

The architecture provides multiple layers of protection:

- Defensive default: `IS_E2E_ENVIRONMENT = False` in `backend/settings/base.py`
- Explicit opt-in: E2E config requires the `DJANGO_CONFIGURATION=E2E` environment variable (not set in production deployment)
- Separate deployment: E2E environments use a distinct `.env.e2e.example` and separate docker-compose files; production uses `production_nest-app-network`

The code pattern itself is sound—settings default to False and require explicit configuration to enable E2E behavior. The residual risk (human misconfiguration of `DJANGO_CONFIGURATION`) is addressed through deployment infrastructure and separate environment configurations.
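The explicit opt-in described above can be reduced to a one-line check; this is a hypothetical sketch, not the project's actual settings code:

```python
def is_e2e(env: dict) -> bool:
    # Defaults to False unless DJANGO_CONFIGURATION is explicitly set to "E2E".
    return env.get("DJANGO_CONFIGURATION") == "E2E"
```

Because the flag is derived only from an explicit environment value, a missing or unrelated configuration always falls through to the safe (protected) branch.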
CONTRIBUTING.md (outdated)

> **Please note that you only need to do these steps once.**
>
> #### Updating e2e Test Data
My data file update comment was not about the instructions for the data loading. It was about how to maintain the `nest-e2e.sql.gz` file.
backend/apps/common/utils.py (outdated)

```python
    return url.strip().rstrip(".,;:!?") or None


def csrf_decorate(view):
```
Is there a way to avoid settings based on the running environment? We want it to be as close to production as possible.
The last time I tested sending fuzzing requests in the fuzzing PR, it didn't work even with the CSRF header set, but that was before there was a dedicated backend instance for e2e. I will remove this for now and try again in the fuzzing PR.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/.env.e2e.example (1)
7-13: Consider ordering `DJANGO_DB_*` variables alphabetically.

The change is functionally correct and aligns with the PR objectives. However, `DJANGO_DB_HOST` at line 9 is not in alphabetical order relative to other keys. Reordering these variables alphabetically would improve consistency and satisfy dotenv-linter's expectations.

Apply this diff to reorder the DB-related variables alphabetically:

```diff
 DJANGO_SETTINGS_MODULE=settings.e2e
 DJANGO_CONFIGURATION=E2E
-DJANGO_DB_HOST=db
 DJANGO_DB_NAME=nest_db_e2e
+DJANGO_DB_HOST=db
 DJANGO_DB_USER=nest_user_e2e
 DJANGO_DB_PASSWORD=nest_user_e2e_password
 DJANGO_DB_PORT=5432
```

Note: If your project uses dotenv-linter or similar tools in CI, you may want to apply this reorganization to avoid future warnings.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/setup-e2e-environment/action.yaml (3 hunks)
- CONTRIBUTING.md (1 hunks)
- backend/.env.e2e.example (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/setup-e2e-environment/action.yaml
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-17T15:25:55.624Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2431
File: infrastructure/providers.tf:1-3
Timestamp: 2025-10-17T15:25:55.624Z
Learning: The infrastructure code in the OWASP/Nest repository (infrastructure/ directory) is intended for quick testing purposes only, not for production deployment.
Applied to files:
CONTRIBUTING.md
📚 Learning: 2025-11-23T11:52:15.463Z
Learnt from: rudransh-shrivastava
Repo: OWASP/Nest PR: 2699
File: backend/wsgi.py:13-13
Timestamp: 2025-11-23T11:52:15.463Z
Learning: In the OWASP Nest project, the SSM parameter store setup in backend/wsgi.py (using boto3 to fetch parameters from AWS Systems Manager) is designed for staging and production environments, not just for testing purposes.
Applied to files:
backend/.env.e2e.example
🪛 dotenv-linter (4.0.0)
backend/.env.e2e.example
[warning] 9-9: [UnorderedKey] The DJANGO_DB_HOST key should go before the DJANGO_SETTINGS_MODULE key
(UnorderedKey)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: Run backend tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (2)
CONTRIBUTING.md (2)
444-470: Ensure the e2e data loading procedure aligns with the linked issue objectives.

The new "Updating e2e Test Data" section documents the procedure for refreshing e2e test data. Per the PR objective to "resolve the e2e backend instance error," verify that these steps fully address the data setup requirements and are consistent with any Docker Compose or backend setup changes made elsewhere in this PR.
A past reviewer flagged concerns about maintaining the `nest-e2e.sql.gz` data file itself. Confirm that the current documentation provides sufficient guidance on how to update the source data file (if applicable) before reloading into the container, rather than just the container lifecycle steps.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/Makefile (1)
69-71: Clarify the relationship between `dump-data-local` and `dump-data-e2e` targets.

Both `dump-data-local` and `dump-data-e2e` write to the same output file (`backend/data/nest-e2e.sql.gz`), which could be confusing for developers. The naming suggests these serve different purposes, but they're actually part of the same workflow (local data → e2e seed data).

Consider renaming either the target or output file to make the intent clearer:

- Option 1: Rename the output to `nest-e2e-local-seed.sql.gz` to indicate it's local data prepared for e2e
- Option 2: Rename the target to `prepare-e2e-data-from-local` to clarify its purpose
- Option 3: Add a brief comment above the target explaining the workflow

The PR description mentions updated documentation on updating e2e DB data—ensure the docs clarify this workflow.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- backend/Makefile (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
Applied to files:
backend/Makefile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run backend tests
- GitHub Check: Run frontend unit tests
- GitHub Check: CodeQL (javascript-typescript)
backend/Makefile
Outdated
| @echo "Dumping Nest e2e data" | ||
| @CMD="pg_dumpall -U nest_user_e2e --clean | gzip -9 > backend/data/nest-e2e.sql.gz" $(MAKE) exec-db-command-e2e | ||
|
|
||
| dump-data-local: |
Let's look at this one more time. Are we going to use our full data snapshot for e2e testing (e.g. the entire data from nest.json.gz)? If so I'd prefer to keep it simple and have just a single data file -- without nest-e2e.sql.gz introduction.
Maybe we can use nest.sql.gz instead of nest.json.gz, as it is much faster to load.
backend/apps/api/decorators/cache.py (outdated)

```python
    def decorator(view_func):
        @wraps(view_func)
        def _wrapper(request, *args, **kwargs):
            if settings.IS_E2E_ENVIRONMENT:
```
Why does e2e need this change?
We didn't set up a cache for e2e, so when it tries to access the cache it returns a 500 internal error. I think there is an option to set up a Redis cache in CI/CD. Maybe we can do that, or keep it simple.
Yes, let's add cache service for e2e instead. The closer to production architecture the better -- for both local and CI/CD cases.
Ok I will add the cache in another PR after this one.
Let's configure the cache backend for e2e via Django settings (locmem for now).
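A minimal sketch of what that could look like in the e2e settings module (the settings path and `LOCATION` value are assumptions; the backend string is Django's built-in locmem backend):

```python
# settings/e2e.py (hypothetical location)
CACHES = {
    "default": {
        # In-memory, per-process cache: no external cache service needed for e2e
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        "LOCATION": "nest-e2e",  # optional; names this cache instance
    }
}
```

This keeps the cache decorator code paths identical to production while avoiding a Redis dependency in CI.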
Actionable comments posted: 3
🧹 Nitpick comments (1)
backend/apps/api/rest/v0/__init__.py (1)
61-71: Consider combining duplicate environment configurations.

The E2E configuration (lines 61-71) is nearly identical to the local configuration (lines 49-59), differing only in the server description. This duplication can be reduced for better maintainability.

Apply this diff to combine the configurations:

```diff
-if settings.IS_LOCAL_ENVIRONMENT:
+if settings.IS_LOCAL_ENVIRONMENT or settings.IS_E2E_ENVIRONMENT:
+    description = "E2E" if settings.IS_E2E_ENVIRONMENT else "Local"
     api_settings_customization = {
         "auth": None,
         "servers": [
             {
-                "description": "Local",
+                "description": description,
                 "url": settings.SITE_URL,
             }
         ],
         "throttle": [],
     }
-
-elif settings.IS_E2E_ENVIRONMENT:
-    api_settings_customization = {
-        "auth": None,
-        "servers": [
-            {
-                "description": "E2E",
-                "url": settings.SITE_URL,
-            }
-        ],
-        "throttle": [],
-    }
-
-elif settings.IS_STAGING_ENVIRONMENT:
+
+elif settings.IS_STAGING_ENVIRONMENT:
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- backend/data/nest-e2e.sql.gz is excluded by !**/*.gz
- backend/data/nest.sql.gz is excluded by !**/*.gz
📒 Files selected for processing (5)
- .github/workflows/setup-e2e-environment/action.yaml (4 hunks)
- CONTRIBUTING.md (1 hunks)
- backend/Makefile (2 hunks)
- backend/apps/api/rest/v0/__init__.py (1 hunks)
- backend/data/dump.sh (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- CONTRIBUTING.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
Applied to files:
- .github/workflows/setup-e2e-environment/action.yaml
- backend/Makefile
🪛 GitHub Actions: Run CI/CD
backend/data/dump.sh
[error] 3-3: CSpell: Unknown word (PGPASSWORD)
[error] 13-13: CSpell: Unknown word (Atqc)
[error] 15-15: CSpell: Unknown word (nspname)
[error] 23-23: CSpell: Unknown word (attisdropped)
[error] 24-24: CSpell: Unknown word (nspname)
[error] CSpell: 5 unknown words found in 1 file. CSpell check failed; verify spelling or update dictionary.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (4)
backend/Makefile (1)
50-52: LGTM! Simplified dump workflow.

The refactored dump-data target now delegates to the dedicated dump.sh script, which is cleaner and more maintainable than the previous inline Django command sequence.
.github/workflows/setup-e2e-environment/action.yaml (3)
10-15: Good addition of timeout safeguards.

Adding 1-minute timeouts to the readiness loops prevents the workflow from hanging indefinitely if services fail to start. This improves CI reliability and provides faster feedback on failures.
Also applies to: 57-62
10-15: Verify the 1-minute timeout is adequate for CI environments.

The 1-minute timeout for both database and backend readiness might be too short in resource-constrained or slow CI runners, potentially causing false failures. Consider whether this duration provides enough margin for typical startup times plus occasional CI slowness.
Monitor the first few CI runs after merging to ensure:
- Services consistently start within the timeout
- No spurious timeout failures occur during slow CI periods
If timeout failures occur, consider increasing to 2-3 minutes or making it configurable via environment variable.
Also applies to: 57-62
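One way to make the duration configurable is a small POSIX helper; `E2E_READY_TIMEOUT` is a hypothetical variable name, and the probe command would be whatever the workflow already uses (`pg_isready`, `wget --spider`, etc.):

```shell
# wait_for <timeout_seconds> <command...>
# Retries the command until it succeeds or the timeout elapses.
wait_for() {
  timeout="$1"; shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Example invocation (hypothetical env var with a 60s default):
# wait_for "${E2E_READY_TIMEOUT:-60}" wget --spider http://localhost:9000/
wait_for 5 true && echo "ready"
```

Defaulting the timeout through an environment variable lets slow runners raise it without editing the workflow file.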
26-26: Unable to verify file existence due to repository access restrictions.

The review comment requests verification that `backend/data/nest.sql.gz` exists, but I cannot access the OWASP Nest repository to confirm this. The PostgreSQL import command syntax is correct (`gunzip -c ... | psql`), and the filename `nest.sql.gz` follows standard PostgreSQL dump conventions.

The verification should be completed by:

- Running the proposed script locally to confirm the file exists and is valid
- Ensuring the file path is updated consistently across any related Makefile or documentation changes
- Verifying the data file contains appropriate E2E test fixtures
Actionable comments posted: 0
🧹 Nitpick comments (2)
backend/data/dump.sh (2)
14-15: Add environment variable validation at script entry.

The script doesn't validate that required environment variables are set before use. If any variable is empty or unset, PostgreSQL commands fail with cryptic errors downstream. Building on prior feedback, add explicit validation.

Apply this diff to validate required variables early:

```diff
 #!/bin/sh
+# cspell:ignore PGPASSWORD Atqc nspname attisdropped
 set -e

+# Validate required environment variables
+: "${DJANGO_DB_PASSWORD:?DJANGO_DB_PASSWORD is required}"
+: "${DJANGO_DB_HOST:?DJANGO_DB_HOST is required}"
+: "${DJANGO_DB_USER:?DJANGO_DB_USER is required}"
+: "${DJANGO_DB_NAME:?DJANGO_DB_NAME is required}"
+
 # Cleanup function to ensure temp DB is dropped even on error
 cleanup() {
   if [ -n "$TEMP_DB" ]; then
     echo "Cleaning up temporary database $TEMP_DB..."
     psql -h "$DJANGO_DB_HOST" -U "$DJANGO_DB_USER" -d postgres -c "DROP DATABASE IF EXISTS $TEMP_DB;" 2>/dev/null || true
   fi
 }
 trap cleanup EXIT

 export PGPASSWORD="$DJANGO_DB_PASSWORD"
```

This adds the CSpell configuration (which was flagged in prior CI runs) and validates all critical variables before any database operations.
6-11: Add an explicit return statement to the cleanup function for SonarCloud compliance.

SonarCloud flags shell functions without explicit `return` statements. While this is often a false positive for shell cleanup functions, adding an explicit return improves linter compliance.

Apply this diff:

```diff
 cleanup() {
   if [ -n "$TEMP_DB" ]; then
     echo "Cleaning up temporary database $TEMP_DB..."
     psql -h "$DJANGO_DB_HOST" -U "$DJANGO_DB_USER" -d postgres -c "DROP DATABASE IF EXISTS $TEMP_DB;" 2>/dev/null || true
   fi
+  return 0
 }
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/data/dump.sh(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
🪛 GitHub Check: SonarCloud Code Analysis
backend/data/dump.sh
[warning] 6-6: Add an explicit return statement at the end of the function.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run CI Denendencies Scan
- GitHub Check: Run Code Scan
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (1)
backend/data/dump.sh (1)
38-43: LGTM.

The conditional check before applying UPDATE statements is sound—it gracefully handles cases where no email columns exist. Combined with the error handling (`set -e`, trap cleanup), the script is robust and properly addresses the prior feedback on error handling and cleanup safeguards.
Actionable comments posted: 1
🧹 Nitpick comments (7)
.github/workflows/setup-e2e-environment/action.yaml (2)
51-56: Use a known health endpoint for the backend readiness check instead of `/a`.

The readiness check probes `/a`, which is a non-standard, arbitrary path. If this endpoint doesn't exist or is removed, the check will silently pass but the API may not be fully ready. Use a dedicated health or status endpoint (e.g., `/api/health`, `/healthz`, or a known public endpoint) to ensure the backend is genuinely operational.

```diff
-      until wget --spider http://localhost:9000/a; do
+      until wget --spider http://localhost:9000/api/health; do
```

Verify that the chosen endpoint exists, is publicly accessible (no auth required), and reflects backend readiness.
60-64: Add error handling for the data load step.

If the `psql` command fails (e.g., due to connection issues, SQL errors, or permission problems), the workflow silently continues. This can mask data load failures and cause misleading E2E test results.

```diff
 - name: Load Postgres data
   env:
     PGPASSWORD: nest_user_e2e_password
   run: |
     gunzip -c backend/data/nest.sql.gz | psql -h localhost -U nest_user_e2e -d nest_db_e2e
+    if [ $? -ne 0 ]; then
+      echo "Failed to load Postgres data"
+      exit 1
+    fi
   shell: bash
```

Or, use `set -e` to fail fast on any command error.

backend/pyproject.toml (1)
97-100: Scoped Ruff ignore looks fine; consider narrowing if more subprocess usage is added later.

Limiting `S603`/`S607` ignores to just `dump_data.py` is a reasonable compromise given the trusted, ops-only nature of that command. If this file grows more subprocess logic over time, consider moving to line-level ignores (`# noqa: S603,S607`) around the specific `psql`/`pg_dump` invocations so future additions in the same file still benefit from those checks.

backend/Makefile (1)
50-76: Make the SQL-based load robust against missing/corrupt dump files.

Switching `load-data`/`load-data-e2e` to pipe `nest.sql.gz` directly into `psql` is a good fit for the new dump command and avoids the previous schema-drop behavior. To harden this a bit, consider explicitly checking that the dump file exists (and is non-empty) before piping it, so a missing/corrupt dump can fail the make target instead of just emitting a `gunzip` warning:

```makefile
load-data:
	@echo "Loading Nest data"
	@test -s backend/data/nest.sql.gz || { echo "Missing or empty backend/data/nest.sql.gz"; exit 1; }
	@gunzip -c backend/data/nest.sql.gz | docker exec -i nest-db psql -U nest_user_dev -d nest_db_dev

load-data-e2e:
	@echo "Loading Nest e2e data"
	@test -s backend/data/nest.sql.gz || { echo "Missing or empty backend/data/nest.sql.gz"; exit 1; }
	@gunzip -c backend/data/nest.sql.gz | docker exec -i e2e-nest-db psql -U nest_user_e2e -d nest_db_e2e
```

This keeps behavior the same on the happy path while surfacing issues earlier when the dump file is unavailable.

backend/apps/common/management/commands/dump_data.py (3)
backend/apps/common/management/commands/dump_data.py (3)
20-30: CLI--tablearguments are additive to the defaults (not replacing them).Because
tablesusesaction="append"with a non‑empty default list, passing--tablewill append to the default patterns instead of replacing them:
- No
--table→["public.owasp_*", "public.github_*", "public.slack_*"]--table public.foo→["public.owasp_*", "public.github_*", "public.slack_*", "public.foo"]If that additive behavior is intentional, it might be worth clarifying it in the help text (e.g., “may be specified multiple times to add tables; defaults are always included”). If you intended CLI tables to override the defaults, you can instead:
Use
default=Noneinadd_arguments, andIn
handle, do something like:tables = options["tables"] or ["public.owasp_*", "public.github_*", "public.slack_*"]Also applies to: 40-41
44-60: Temp database name can cause collisions; consider making it unique per run.

`temp_db = f"temp_{name}"` combined with an unconditional `DROP DATABASE IF EXISTS {temp_db}` in `finally` means:

- Two concurrent runs against the same source DB will race on the same temp DB name; one run can drop the other's temp DB in the middle of its dump.
- If the main DB name ever contains characters invalid in unquoted identifiers, `CREATE DATABASE {temp_db} TEMPLATE {name};` will fail.

For this ops-only command it's not a blocker, but you can make it more robust by incorporating a unique suffix (PID, timestamp, or both) and quoting via `psql`:

```python
import os
import time

temp_db = f"temp_{name}_{os.getpid()}_{int(time.time())}"
```

and then using a small helper to generate safe SQL (e.g. via `format('CREATE DATABASE %I TEMPLATE %I', ...)` in a DO block executed with `\gexec`), if you want to fully guard against unusual DB names.

Also applies to: 88-104
134-158: Subprocess wrapper is consistent; consider minor enhancements if you extend it later.

The `_psql` helper centralizes all `psql` calls, uses `check=True`, and passes credentials via the environment (`PGPASSWORD`), which is a solid, simple pattern for this internal command. If this grows, a couple of optional tweaks you might consider:

- Add `text=True` and pass strings instead of bytes to avoid manual `.encode()` for stdin cases.
- Capture stderr on failures to include more context in the `CommandError` message (e.g., via `run(..., capture_output=True)` and surfacing `e.stderr`).

Not required for this PR, but these can make debugging dump failures easier when operating this command.
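The suggested tweaks could be sketched as a generic wrapper (the name `run_cli` and its signature are hypothetical; the project's real helper hardcodes `psql`):

```python
import os
import subprocess


def run_cli(cmd, env_extra=None, stdin_data=None):
    """Run a command in text mode, raising with captured stderr on failure.

    Sketches the suggested pattern: text=True avoids manual .encode(),
    and capture_output=True lets failures surface stderr for debugging.
    """
    env = {**os.environ, **(env_extra or {})}
    try:
        return subprocess.run(
            cmd,
            input=stdin_data,  # a str, thanks to text=True
            text=True,
            capture_output=True,
            check=True,
            env=env,
        )
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(f"{cmd[0]} failed ({exc.returncode}): {exc.stderr}") from exc
```

For the dump command this would be invoked as, e.g., `run_cli(["psql", ...], env_extra={"PGPASSWORD": password}, stdin_data=sql)` (hypothetical usage).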
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- backend/data/nest.sql.gz is excluded by !**/*.gz
📒 Files selected for processing (4)
- .github/workflows/setup-e2e-environment/action.yaml (2 hunks)
- backend/Makefile (2 hunks)
- backend/apps/common/management/commands/dump_data.py (1 hunks)
- backend/pyproject.toml (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
Applied to files:
- backend/Makefile
- .github/workflows/setup-e2e-environment/action.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run backend tests
- GitHub Check: Run frontend unit tests
🔇 Additional comments (2)
.github/workflows/setup-e2e-environment/action.yaml (1)
backend/apps/common/management/commands/dump_data.py (1)
106-132: Email masking strategy is broad but appropriate for e2e; just be aware of side‑effects.

The DO block walks all non‑system schemas and unconditionally runs `UPDATE <schema>.<table> SET email = '';` for every column named `email`:
- Any “email” column used for non‑PII purposes will also be blanked.
- Very large tables will get full‑table updates, which may be slow on big dumps.
Given this is an explicit data‑masking step for generating test/e2e dumps, this trade‑off is generally fine; just something to keep in mind if performance or more granular masking ever becomes a concern.
backend/apps/api/decorators/cache.py
Outdated
| def decorator(view_func): | ||
| @wraps(view_func) | ||
| def _wrapper(request, *args, **kwargs): | ||
| if settings.IS_E2E_ENVIRONMENT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's configure the cache backend for e2e via Django settings (locmem for now).
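A minimal sketch of what that could look like in an e2e settings module — the module name, `LOCATION`, and `TIMEOUT` values below are illustrative assumptions, not the project's actual configuration. Django's built-in local-memory backend needs no external service, which suits an e2e container:

```python
# Hypothetical fragment of an e2e settings module (e.g. settings/e2e.py).
# Values are illustrative assumptions, not taken from the Nest repository.
IS_E2E_ENVIRONMENT = True

CACHES = {
    "default": {
        # In-process cache: no Redis/Memcached service needed for e2e runs.
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
        # LOCATION separates multiple LocMemCache instances in one process.
        "LOCATION": "nest-e2e",
        # A short TTL keeps repeated e2e runs from seeing stale entries.
        "TIMEOUT": 60,
    }
}
```

With this in place, view-level caching keeps working in e2e without any extra infrastructure, while every container start begins from an empty cache.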
| "--table", | ||
| action="append", | ||
| dest="tables", | ||
| default=["public.owasp_*", "public.github_*", "public.slack_*"], |
We don't dump the entire Slack app data atm.
This AI-generated code is still difficult to maintain. Also, there are no tests for this file.
| def _hide_emails(self) -> str: | ||
| # Uses a DO block to UPDATE every column named 'email' in non-system schemas | ||
| return """ | ||
| DO $$ |
Is there a more readable/maintainable way to get rid of email addresses?
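One hedged alternative (a sketch under assumptions, not the project's code): keep the catalog query as plain readable SQL against `information_schema.columns` and move the looping into Python inside the management command, emitting one ordinary `UPDATE` per table instead of a PL/pgSQL DO block. The helper name and the sample table names below are hypothetical:

```python
# Sketch only: function and table names are hypothetical, not Nest's code.
# Readable catalog query: find every user table that has an 'email' column.
FIND_EMAIL_COLUMNS = """
    SELECT table_schema, table_name
    FROM information_schema.columns
    WHERE column_name = 'email'
      AND table_schema NOT IN ('pg_catalog', 'information_schema')
"""


def build_email_masking_statements(rows):
    """Turn (schema, table) pairs into one UPDATE statement per table."""
    return [
        f'UPDATE "{schema}"."{table}" SET email = \'\';'
        for schema, table in rows
    ]


# The command would run FIND_EMAIL_COLUMNS through a DB cursor and execute
# each generated statement; here we show only the pure string-building part.
statements = build_email_masking_statements(
    [("public", "github_user"), ("public", "owasp_member")]
)
```

The string-building part is trivially unit-testable, which also addresses the "no tests for this file" concern above.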
| WHERE a.attname = 'email' | ||
| AND a.attnum > 0 | ||
| AND NOT a.attisdropped | ||
| AND n.nspname NOT IN ('pg_catalog','information_schema','pg_toast') |
These magic names and pretty much the entire PL/pgSQL spell don't fit our clean and readable code approach. Yes, it probably performs the best, but that's not the only goal we have.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/apps/api/rest/v0/__init__.py (1)
60-70: E2E env behavior looks correct; consider DRYing up shared config.

The E2E branch cleanly mirrors the local environment (no auth, no throttling, same SITE_URL), which matches the intent for a test-only backend and keeps staging/prod untouched. To avoid duplication between the local and E2E branches, you could extract a small helper:

```diff
 @@
-api_settings = {
-    "auth": ApiKey(),  # The `api_key` param name is based on the ApiKey class name.
-    "default_router": RouterPaginated(),
-    "description": "Open Worldwide Application Security Project API",
-    "docs": Swagger(settings={"persistAuthorization": True}),
-    "throttle": [AuthRateThrottle("10/s")],
-    "title": "OWASP Nest",
-    "version": "0.3.1",
-}
-
-api_settings_customization = {}
-if settings.IS_LOCAL_ENVIRONMENT:
-    api_settings_customization = {
-        "auth": None,
-        "servers": [
-            {
-                "description": "Local",
-                "url": settings.SITE_URL,
-            }
-        ],
-        "throttle": [],
-    }
-elif settings.IS_E2E_ENVIRONMENT:
-    api_settings_customization = {
-        "auth": None,
-        "servers": [
-            {
-                "description": "E2E",
-                "url": settings.SITE_URL,
-            }
-        ],
-        "throttle": [],
-    }
+api_settings = {
+    "auth": ApiKey(),  # The `api_key` param name is based on the ApiKey class name.
+    "default_router": RouterPaginated(),
+    "description": "Open Worldwide Application Security Project API",
+    "docs": Swagger(settings={"persistAuthorization": True}),
+    "throttle": [AuthRateThrottle("10/s")],
+    "title": "OWASP Nest",
+    "version": "0.3.1",
+}
+
+
+def _no_auth_env_server(description: str) -> dict:
+    return {
+        "auth": None,
+        "servers": [
+            {
+                "description": description,
+                "url": settings.SITE_URL,
+            }
+        ],
+        "throttle": [],
+    }
+
+
+api_settings_customization = {}
+if settings.IS_LOCAL_ENVIRONMENT:
+    api_settings_customization = _no_auth_env_server("Local")
+elif settings.IS_E2E_ENVIRONMENT:
+    api_settings_customization = _no_auth_env_server("E2E")
 @@
 elif settings.IS_STAGING_ENVIRONMENT:
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- .github/workflows/setup-e2e-environment/action.yaml (2 hunks)
- backend/apps/api/rest/v0/__init__.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-26T12:50:50.512Z
Learnt from: ahmedxgouda
Repo: OWASP/Nest PR: 2429
File: backend/Makefile:30-32
Timestamp: 2025-10-26T12:50:50.512Z
Learning: The `exec-backend-e2e-command` and `exec-db-e2e-command` Makefile targets in the backend/Makefile are intended for local development and debugging only, not for CI/CD execution, so the `-it` flags are appropriate.
Applied to files:
.github/workflows/setup-e2e-environment/action.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Run frontend e2e tests
- GitHub Check: Run frontend unit tests
- GitHub Check: Run backend tests
- GitHub Check: CodeQL (javascript-typescript)
🔇 Additional comments (4)
.github/workflows/setup-e2e-environment/action.yaml (4)
50-58: Verify the health check endpoint /a exists and is accessible.

The workflow checks for backend readiness by hitting http://localhost:9000/a, but this endpoint path appears arbitrary. If this endpoint doesn't exist, returns an error, or requires authentication, the workflow could time out or fail even when the backend is functionally ready. Verify that /a is intentional, or consider using a standard health check endpoint (e.g., /health, /, or /api/).
60-64: Verify that loading data after backend startup is correct.

Data loading has been moved to occur after backend startup and migrations (line 60), whereas previously it ran before backend startup. Since migrations execute before gunicorn starts (line 44), the database schema is initialized first, but confirm this sequence doesn't break expected behavior or require data to be present during any initialization steps.
35-47: Security: Verify --network host is necessary.

Using --network host exposes the container to all host network interfaces, which is a permissive configuration. Confirm this is required for your E2E testing, or consider using an explicit Docker network bridge for better isolation.
8-16: Timeout pattern is well-structured.

The timeout 5m wrapper around the pg_isready retry loop is a solid defensive measure to prevent indefinite hangs. The same pattern is applied to the backend readiness check (lines 51–56).




Proposed change
Resolves #2709
- `csrf_decorate` that skips the CSRF protection if the environment is `e2e`.

Checklist
- Ran `make check-test` locally; all checks and tests passed.