Skip to content

Bug/silent failures caused by swallowed errors#247

Open
Devnil434 wants to merge 7 commits intoAOSSIE-Org:mainfrom
Devnil434:bug/silent-failures-caused-by-swallowed-errors
Open

Bug/silent failures caused by swallowed errors#247
Devnil434 wants to merge 7 commits intoAOSSIE-Org:mainfrom
Devnil434:bug/silent-failures-caused-by-swallowed-errors

Conversation

@Devnil434
Copy link

@Devnil434 Devnil434 commented Jan 17, 2026

closes #239

Summary

This PR fixes multiple silent failures in the DebateAI backend where errors were ignored or swallowed in critical execution paths. The changes improve system reliability and debuggability by ensuring failures in database writes, matchmaking, and real-time communication are logged.


Changes

Debate Persistence & Outcomes

File: backend/controllers/debatevsbot_controller.go

  • Added logging for failures in db.SaveDebateVsBot
  • Added logging for failures in db.UpdateDebateVsBotOutcome
  • Added logging for services.SaveDebateTranscript failures in win/loss and concession flows

Profile & Elo Updates

File: backend/controllers/profile_controller.go

  • Added logging for default rating initialization failures
  • Added proper error handling and logging for user lookup failures during Elo updates
  • Added logging for Elo update failures and debate history insertion failures

Matchmaking & Room Watchers

File: backend/services/matchmaking.go

  • Added logging for unexpected WebSocket close errors
  • Added logging for JSON marshaling failures during pool status broadcasts
  • Added logging for MongoDB change stream initialization and decoding failures

Auth & Real-Time Communication

File: backend/controllers/auth.go

  • Added logging for persistUserStats failures during login and token verification

File: backend/websocket.go

  • Added logging for WebSocket write failures (participant updates, typing indicators, phase changes, broadcasts)

Why

Previously, several critical errors were silently ignored, making production issues difficult to detect and debug. These silent failures could lead to unnoticed data loss, incomplete updates, or inconsistent real-time state.


Impact

  • Improves observability and debuggability
  • Prevents silent data loss in critical flows
  • No behavior changes beyond improved error visibility
  • No new dependencies introduced

Verification

  • Backend code reviewed for correctness and logging coverage
  • go build ./... run; build failures observed only in pre-existing test_server.go (unrelated to this PR)

Notes

  • Uses Go’s standard log package, consistent with existing server-side logging practices
  • Changes are intentionally scoped to high-impact paths only

Summary by CodeRabbit

  • User Experience Updates

    • Minimum password length reduced from 8 to 6 across login, signup, and reset.
  • New Features

    • Multi-factor authentication (MFA): enable MFA, receive QR/secret, finalize setup, and verify TOTP during login.
    • Per-route rate limiting on auth endpoints to curb excessive attempts (may return HTTP 429).
  • System Improvements

    • Stronger encryption key management for stored secrets, improved error logging, and transactional integrity for rating updates.
  • Chores

    • Removed sample production config and trimmed example environment placeholders.
  • Tests

    • Added tests covering user MFA secret generation, encryption, and decryption.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough

Walkthrough

Adds TOTP-based MFA (enable/finalize/verify) with encrypted secrets and HKDF-derived encryption key; introduces AES‑GCM encryption utilities; adds Redis-backed rate limiter and per-route limits; refactors Elo updates into MongoDB transactions; increases error logging across controllers and websockets; removes sample prod config and some env placeholders.

Changes

Cohort / File(s) Summary
Configuration & Env
\.env\.example, backend/config/config.prod.sample.yml
Removed sample production config file and several env placeholders (GEMINI_API_KEY, JWT_SECRET, GOOGLE_CLIENT_ID, SMTP_PASSWORD (comment), VITE_GOOGLE_CLIENT_ID, VITE_BASE_URL).
Auth & MFA
backend/controllers/auth.go, backend/routes/auth.go, backend/cmd/server/main.go
Added MFA endpoints (Enable/Finalize/Verify), MFA flow integration in Login/SignUp (returns 202 when MFA required), redacted email logging, and server HKDF key derivation + route wiring for MFA.
User Model & MFA Helpers
backend/models/user.go, backend/models/user_test.go
Added MFA fields (MFAEnabled, MFAType, MFASecret), methods to generate/verify TOTP and to encrypt/decrypt stored secret; added unit test exercising generation/encryption/decryption.
Encryption / Security
backend/security/encryption.go, backend/utils/auth.go, backend/go.mod
New AES‑GCM Encrypt/Decrypt and SetEncryptionKey; moved/added encryption helpers; HKDF usage added in server; dependencies updated (pquerna/otp, barcode, x/crypto, etc.).
Rate Limiting Middleware
backend/middlewares/rate_limit.go, backend/routes/auth.go
New Redis-backed RateLimit(limit, window) middleware and per-route application (SignUp/Login/ForgotPassword/VerifyTOTP) with fail-open behavior.
Transactional Elo / Profile
backend/controllers/profile_controller.go
UpdateEloAfterDebate refactored to use MongoDB sessions/transactions with validated IDs, transactional updates, and debate logs.
Debate vs Bot & Persistence Logging
backend/controllers/debatevsbot_controller.go
Replaced swallowed errors with explicit logging and error handling for save/update/transcript persistence paths.
WebSocket Logging
backend/websocket/matchmaking.go, backend/websocket/websocket.go
Added logging for read/write/encode errors and message forwarding failures across websocket code; no control-flow changes.
Admin Structs / Controllers
backend/structs/auth.go, backend/controllers/admin_controller.go
Moved AdminSignupRequest/AdminLoginRequest types to structs package and updated admin controller to use them.
Frontend Password Validation
frontend/src/Pages/Authentication/forms.tsx
Centralized MIN_PASSWORD_LENGTH = 6 and applied consistent client-side validation.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant API as Auth API
  participant DB as User DB
  participant Sec as Security (encryption)
  participant JWT as JWT Service
  participant Redis as Redis

  Client->>API: POST /login (email,password)
  API->>DB: Find user by email
  DB-->>API: user (includes MFAEnabled)
  alt MFAEnabled == true
    API-->>Client: 202 Accepted { mfaType, pendingToken }
    Client->>API: POST /login/mfa/verify (pendingToken, code)
    API->>DB: Load user by id
    API->>Sec: Decrypt user's MFA secret
    Sec-->>API: secret
    API->>API: Verify TOTP code
    alt valid
      API->>JWT: generate access token
      JWT-->>API: token
      API->>DB: Persist login stats (log errors)
      API-->>Client: 200 OK { token, user }
    else invalid
      API-->>Client: 401 Unauthorized
    end
  else
    API->>JWT: generate access token
    JWT-->>API: token
    API->>DB: Persist login stats (log errors)
    API-->>Client: 200 OK { token, user }
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hide a secret, stitch it tight,
I hop and hum a TOTP at night,
I spin a QR, I guard the key,
I log the hops so you can see,
A carrot cheers for safer bytes 🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR includes significant out-of-scope changes: MFA implementation (controllers, models, routes, security/encryption modules), admin signup/login endpoints, rate limiting middleware, and configuration file removal are unrelated to the stated 'swallowed errors' objective. Separate MFA features, admin authentication, rate limiting, and configuration removals into dedicated PRs focused on those specific features rather than bundling them with error handling fixes.
Docstring Coverage ⚠️ Warning Docstring coverage is 45.61% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Bug/silent failures caused by swallowed errors' directly describes the main change focus: adding error logging and handling for previously ignored errors across multiple backend components.
Linked Issues check ✅ Passed The PR successfully addresses issue #239 by adding error logging across critical backend paths (auth, debate, profile, WebSocket, matchmaking) and introducing MFA security features with proper encryption, fully meeting stated objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/controllers/profile_controller.go (1)

47-85: Redact or gate full query logging to avoid leaking sensitive data.

Logging Request URL, Raw Query, and full query map can capture tokens or PII if additional parameters are ever added. Consider whitelisting only userId or gating the verbose logs behind a debug flag. (Line 50–53)

🛡️ Proposed safer logging
- log.Printf("GetProfile: Request URL = '%s'", c.Request.URL.String())
- log.Printf("GetProfile: Raw Query = '%s'", c.Request.URL.RawQuery)
- log.Printf("GetProfile: Query params map = %v", c.Request.URL.Query())
- log.Printf("GetProfile: userId from c.Query() = '%s'", userIDParam)
+ log.Printf("GetProfile: userId from query = '%s'", userIDParam)
🤖 Fix all issues with AI agents
In `@backend/controllers/profile_controller.go`:
- Around line 387-392: The handler currently logs UpdateOne failures for the
winner/loser but still returns 200; change the logic around the two
db.MongoDatabase.Collection("users").UpdateOne calls (the winnerID/loserID
updates using dbCtx and req.WinnerID/req.LoserID) to propagate errors: if either
UpdateOne returns an error, abort the handler with a non-200 response (return
the error to caller), and ideally perform both updates inside a MongoDB
transaction/session so you can rollback on failure; ensure you return after
sending the error response so the handler does not continue to report success.

In `@backend/structs/auth.go`:
- Around line 3-6: Update the password policy and compensating controls: change
the SignUpRequest password binding from min=6 to min=8 (and apply the same
validation to AdminSignupRequest and AdminLoginRequest) to enforce a consistent
minimum; add or wire up a rate-limiting middleware on auth endpoints (signup,
login, password reset) and/or enable MFA hooks for account creation and login
flows; and add a short documented password policy comment near the structs
(SignUpRequest, AdminSignupRequest, AdminLoginRequest) describing length,
complexity, and where to get security approval for exceptions.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@backend/controllers/profile_controller.go`:
- Around line 376-377: The code is discarding errors from
primitive.ObjectIDFromHex for req.WinnerID and req.LoserID (variables winnerID,
loserID); validate both conversions and if either returns an error return a 400
Bad Request with a clear message (e.g., "invalid winner_id" or "invalid
loser_id") instead of proceeding to the FindOne calls. Update the handler in
profile_controller.go to check the error values from ObjectIDFromHex, produce a
JSON/HTTP 400 response when malformed, and only use the ObjectIDs when
conversion succeeds.

In `@backend/middlewares/rate_limit.go`:
- Around line 15-19: The code currently skips rate limiting when db.RedisClient
== nil but only comments "log a warning" without actually logging; update the
rate limit middleware to emit a warning before calling c.Next() (e.g., use the
standard log package or your existing app logger) so operators see "Redis
unavailable, skipping rate limiting" (include db context if available) and add
the "log" import (or the appropriate logger import) so the warning is compiled
and logged; target the db.RedisClient nil branch around the rate-limit
middleware where c.Next() is invoked.
- Around line 26-35: The Incr/Expire pair is non-atomic and can leave a key
without TTL if the process dies between calls; replace the two calls
(db.RedisClient.Incr and db.RedisClient.Expire on key/window) with a single
atomic Redis operation (either EVAL a small Lua script that does local =
redis.call("INCR", KEYS[1]); if local == 1 then redis.call("EXPIRE", KEYS[1],
ARGV[1]) end; return local, or use a pipeline/transaction that does SET NX EX
semantics combined with INCR) and update the rateLimit logic to use the script's
return value for the count; also ensure any Redis errors (the err returned from
the Redis call) are logged via the package's logger instead of being silently
ignored before calling c.Next(), so replace the current error branch to log the
error and then fail-open.

In `@backend/models/user.go`:
- Around line 35-38: The model currently exposes MFA fields (MFAEnabled,
MFAType, MFASecret) with a misleading comment; either remove these fields or
implement full TOTP support—if you implement, add a clear flow: use a TOTP
library like pquerna/otp to generate and validate secrets (create functions such
as GenerateMFASecret/EnableMFA and VerifyTOTP on the User model), encrypt
MFASecret before persisting using the project’s existing crypto utilities or a
secure AES/GCM wrapper (add SetEncryptedMFASecret/GetDecryptedMFASecret
helpers), and ensure MFASecret is never logged or returned in JSON by keeping
the json tag omitted; update any create/update handlers to call these helpers
and add unit tests for generation, encryption, and verification instead of
leaving the current placeholder comment.
🧹 Nitpick comments (3)
backend/controllers/auth.go (1)

77-101: Missing MFA field initialization for Google OAuth users.

New users created via Google OAuth do not have MFAEnabled and MFAType fields initialized, unlike SignUp (lines 181-183). While MFAEnabled defaults to false (Go zero value), explicitly initializing these fields ensures consistency and prevents potential issues if the default MFA type logic changes.

♻️ Suggested fix
 		newUser := models.User{
 			Email:            email,
 			DisplayName:      nickname,
 			Nickname:         nickname,
 			Bio:              "",
 			Rating:           1200.0,
 			RD:               350.0,
 			Volatility:       0.06,
 			LastRatingUpdate: now,
 			AvatarURL:        avatarURL,
 			IsVerified:       true,
 			Score:            0,          // Initialize gamification score
 			Badges:           []string{}, // Initialize badges array
 			CurrentStreak:    0,          // Initialize streak
 			CreatedAt:        now,
 			UpdatedAt:        now,
+			MFAEnabled:       false,
+			MFAType:          "totp",
 		}
backend/routes/auth.go (1)

15-22: Inefficient middleware instantiation pattern.

Calling middlewares.RateLimit(5, time.Minute)(c) inside the handler creates a new middleware closure on every request. The idiomatic approach is to register middleware once on the route. This applies to all three rate-limited handlers.

♻️ Suggested refactor for route registration

In the route registration file (likely main.go or a routes setup file), register the middleware on the route:

// Instead of calling middleware inside handlers, register on routes:
authGroup := router.Group("/auth")
authGroup.POST("/signup", middlewares.RateLimit(5, time.Minute), controllers.SignUp)
authGroup.POST("/login", middlewares.RateLimit(10, time.Minute), controllers.Login)
authGroup.POST("/forgot-password", middlewares.RateLimit(3, time.Minute), controllers.ForgotPassword)

Then simplify the handlers:

 func SignUpRouteHandler(c *gin.Context) {
-	// Apply rate limiting: 5 requests per minute
-	middlewares.RateLimit(5, time.Minute)(c)
-	if c.IsAborted() {
-		return
-	}
 	controllers.SignUp(c)
 }
backend/controllers/profile_controller.go (1)

49-86: Consider reducing verbose debug logging.

The extensive log.Printf calls for userId extraction (lines 51-54, 61, 76, 82, 85, 89, 100, 105, 134) are helpful during development but may be excessive in production. Consider using a debug log level or removing after the feature is stable.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/controllers/profile_controller.go (1)

50-105: Redact PII in GetProfile logging.

These logs include raw query strings and user email (Line 52, Line 105), which can leak PII or tokens. Please remove or redact sensitive fields, or gate them behind a debug-only logger.

🔒 Suggested redaction
- log.Printf("GetProfile: Raw Query = '%s'", c.Request.URL.RawQuery)
- log.Printf("GetProfile: Query params map = %v", c.Request.URL.Query())
- log.Printf("GetProfile: userId from c.Query() = '%s'", userIDParam)
+ log.Printf("GetProfile: userId param present=%t", userIDParam != "")
...
- log.Printf("GetProfile: Found user - ID: %s, Email: %s, DisplayName: %s", user.ID.Hex(), user.Email, user.DisplayName)
+ log.Printf("GetProfile: Found user - ID: %s, DisplayName: %s", user.ID.Hex(), user.DisplayName)
backend/go.mod (1)

29-60: github.com/pquerna/otp should be classified as a direct dependency.

The module github.com/pquerna/otp is marked as // indirect in go.mod (line 60), but github.com/pquerna/otp/totp is directly imported in backend/models/user.go:8. Since importing a subpackage means importing the module itself, github.com/pquerna/otp should be a direct dependency. Run go mod tidy to correct this classification. The github.com/boombuler/barcode dependency remains correctly marked as indirect (it's only needed transitively through pquerna/otp).

🤖 Fix all issues with AI agents
In `@backend/cmd/server/main.go`:
- Around line 64-65: You are currently reusing cfg.JWT.Secret for both signing
and encryption via utils.SetJWTSecret and security.SetEncryptionKey; instead,
stop passing the JWT secret directly to security.SetEncryptionKey and either
read a dedicated encryption key from configuration/environment (e.g.,
cfg.Encryption.Key) or derive a separate key from cfg.JWT.Secret using HKDF with
a distinct context/info label (e.g., "mfa-encryption") before calling
security.SetEncryptionKey so signing and encryption keys remain distinct and
rotatable.

In `@backend/controllers/auth.go`:
- Around line 626-649: The current TOTP verification allows exchanging
email+code for a JWT without ensuring the password step completed; modify the
flow so VerifyTOTP (in this handler) only runs when the user has MFA enabled
(check user.MFAEnabled) and there is a valid short‑lived pending MFA context
issued at login (e.g., require and validate a pendingMFA token or server‑side
pending state created by /login) that ties this verification to a completed
password authentication; if the pendingMFA token is missing/expired or
user.MFAEnabled is false, return an appropriate 401/400 response and do not
issue a JWT.
- Line 106: Replace raw email exposure in log statements by logging a non-PII
identifier: change occurrences that use the variable email (e.g., the log.Printf
call "Error persisting user stats for %s: %v", email, err and the other
log.Printf lines around 298-299 and 315-326) to use a redacted or hashed form
(e.g., call a small helper like RedactEmail(email) or HashIdentifier(email) and
log that instead), or log the internal user ID if available; apply the same
replacement for all log.Printf/error log sites in this file that currently print
email.
- Around line 651-657: In VerifyTOTP, load the config once into a local variable
(e.g., cfg := loadConfig(ctx)) and check for nil; if nil return an error
response instead of proceeding. Call generateJWT with cfg.JWT.Secret and
cfg.JWT.Expiry and capture its error (e.g., token, err := generateJWT(...)); if
generateJWT returns an error, return a 500 JSON error instead of sending a 200
with an empty token. Ensure you still call buildUserResponse(user) only after
successful token creation and include the valid token in the "accessToken"
field.

In `@backend/routes/auth.go`:
- Around line 58-60: The VerifyTOTPRouteHandler for the /login/mfa/verify
endpoint is missing rate limiting; wrap the handler with middlewares.RateLimit
so MFA verification can't be brute-forced. Update the route registration to
apply middlewares.RateLimit (use a stricter limit than the regular login) before
calling controllers.VerifyTOTP, or modify VerifyTOTPRouteHandler to call
middlewares.RateLimit(...) -> controllers.VerifyTOTP, ensuring you reference
VerifyTOTPRouteHandler, controllers.VerifyTOTP and middlewares.RateLimit when
making the change.

In `@backend/security/encryption.go`:
- Around line 13-32: SetEncryptionKey currently allows an empty string which
results in an all‑zero encryptionKey; change SetEncryptionKey(key string) so it
validates key != "" and fails fast by returning an error (e.g., func
SetEncryptionKey(key string) error) rather than silently accepting it; inside
the keyOnce.Do wrapper check if key == "" and set the returned error (or panic
only if you cannot change the signature), and update callers to handle the
error; reference SetEncryptionKey, encryptionKey and keyOnce when making the
change to ensure you prevent setting a zeroed 32‑byte key.
🧹 Nitpick comments (1)
backend/utils/auth.go (1)

4-12: Avoid duplicate encryption helpers across packages.

Encrypt/Decrypt here mirrors backend/security/encryption.go. Consider consolidating the logic in one place to prevent drift and keep key handling consistent.

Also applies to: 168-237

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/go.mod (1)

1-81: Inconsistency between PR objectives and dependency changes.

The PR objectives state this is a bug fix focused on adding logging and error handling to prevent silent failures. However, the dependency changes introduce MFA-related libraries (github.com/pquerna/otp, github.com/boombuler/barcode) and significant cryptography updates that are unrelated to logging/error handling.

This suggests either:

  1. Scope creep—MFA features were bundled into a logging/error-handling PR
  2. The PR objectives are incomplete or inaccurate

Consider splitting unrelated feature work (MFA) from the bug fix (logging) to simplify review, reduce merge risk, and maintain clear change history.

🤖 Fix all issues with AI agents
In `@backend/cmd/server/main.go`:
- Around line 66-81: Validate cfg.JWT.Secret before deriving keys: ensure
cfg.JWT.Secret is non-empty and meets a minimum entropy/length (e.g., at least
32 bytes/characters) immediately after loading config and before calling
utils.SetJWTSecret / hkdf.New; if the secret is empty or too short, fail fast
with a clear log.Fatalf error mentioning cfg.JWT.Secret validation so hkdf.New
and security.SetEncryptionKey are never called with a weak secret. You can
alternatively add this check in config.LoadConfig() and expose validation there,
but include the guard in main() around the existing use of cfg.JWT.Secret,
hkdf.New, and security.SetEncryptionKey.

@Devnil434
Copy link
Author

@bhavik-mangla plz review & merge it ...........

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Silent Failures Due to Swallowed Errors in Critical Backend Paths

1 participant