Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: truncate useragent #1143

Merged
merged 5 commits into from
Dec 6, 2024
Merged

fix: truncate useragent #1143

merged 5 commits into from
Dec 6, 2024

Conversation

markusahlstrand
Copy link

@markusahlstrand markusahlstrand commented Dec 6, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new utility function, getClientInfo, to streamline the gathering of client information (IP address and user agent) across various authentication flows.
  • Improvements

    • Enhanced clarity in variable naming, such as renaming session to loginSession in multiple authentication routes.
    • Updated login session creation processes to incorporate additional client information, improving session management.
    • Refactored login session handling in routes to utilize centralized client information retrieval.
  • Bug Fixes

    • Maintained consistent error handling and control flow across authentication processes, ensuring reliability during user login and verification.

Copy link

coderabbitai bot commented Dec 6, 2024

Walkthrough

The changes in this pull request primarily involve the introduction of a new utility function, getClientInfo, which centralizes the extraction of client information (user agent and IP address) from request headers. This function is integrated into various authentication flows, including OAuth2, passwordless, and SAML routes, enhancing the clarity and consistency of how client information is handled. Additionally, several variables have been renamed for improved clarity, particularly changing instances of session to loginSession. The overall logic and error handling in the affected functions remain unchanged.

Changes

File Path Change Summary
src/authentication-flows/oauth2.tsx Added getClientInfo import; renamed session to loginSession; updated login session creation logic.
src/authentication-flows/password.ts Added getClientInfo import; replaced direct header access with getClientInfo in requestPasswordReset.
src/authentication-flows/passwordless.ts Added getClientInfo import; refactored sendEmailVerificationEmail to use getClientInfo.
src/authentication-flows/universal.ts Added getClientInfo import; renamed login to loginSession; updated login session creation logic.
src/helpers/generate-auth-response.ts Added getClientInfo import; replaced direct header access with getClientInfo in generateCode.
src/routes/oauth2/authenticate.ts Added getClientInfo import; replaced direct header access with getClientInfo in authenticateRoutes.
src/routes/oauth2/dbconnections.ts Added getClientInfo import; replaced direct header access with getClientInfo in change_password.
src/routes/oauth2/passwordless.ts Removed nanoid import; renamed login to loginSession; updated login session creation to use getClientInfo.
src/routes/saml/samlp.ts Added getClientInfo import; renamed login to loginSession; updated login session creation logic.
src/utils/client-info.ts Added new function getClientInfo to extract client information from headers.

Possibly related PRs

  • fix: use arctic for google login #1125: This PR modifies the socialAuth and oauth2Callback functions in the same file as the main PR, enhancing the OAuth2 flow to support Google authentication, which is closely related to the changes made in the main PR.
  • fix: use connection name for single connection #1129: This PR also updates the socialAuth and oauth2Callback functions in src/authentication-flows/oauth2.tsx, adding logging and maintaining the existing logic for handling different authentication strategies, which aligns with the changes in the main PR.
  • fix: write useragent and ip to login #1142: This PR introduces modifications to the login session by adding useragent and ip properties in various functions, including those in src/authentication-flows/password.ts, which is relevant to the changes in the main PR that also enhance session data.

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
src/utils/client-info.ts (1)

1-7: Consider adding input sanitization

While truncation prevents overflow, consider adding sanitization to protect against malicious input in headers.

Example implementation:

+function sanitizeHeader(value: string | null, maxLength: number): string | undefined {
+  if (!value) return undefined;
+  return value.replace(/[^\x20-\x7E]/g, '').slice(0, maxLength);
+}
+
 export function getClientInfo(headers: Headers) {
   return {
-    auth0Client: headers.get("auth0-client")?.slice(0, 256),
-    ip: headers.get("x-real-ip")?.slice(0, 29),
-    useragent: headers.get("user-agent")?.slice(0, 256),
+    auth0Client: sanitizeHeader(headers.get("auth0-client"), 256),
+    ip: sanitizeHeader(headers.get("x-real-ip"), 45),
+    useragent: sanitizeHeader(headers.get("user-agent"), 256),
   };
 }
src/authentication-flows/password.ts (1)

18-18: Consider enhancing getClientInfo implementation

The getClientInfo utility function should:

  1. Explicitly document its return type and property names
  2. Implement user agent truncation (as per PR title)
  3. Validate and sanitize client information
  4. Consider using TypeScript interfaces for better type safety

This will help prevent potential property conflicts when using spread operator and ensure consistent handling of client information across the application.

src/helpers/generate-auth-response.ts (1)

Line range hint 127-134: Consider consolidating IP address handling.

There appears to be duplicate IP address handling logic. The code updates last_ip using cf-connecting-ip header, while getClientInfo might also handle IP extraction. Consider consolidating this logic to use getClientInfo consistently.

- last_ip: ctx.req.header("cf-connecting-ip"),
+ ...getClientInfo(ctx.req.raw.headers),
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2191e82 and 4baf3d3.

📒 Files selected for processing (10)
  • src/authentication-flows/oauth2.tsx (2 hunks)
  • src/authentication-flows/password.ts (2 hunks)
  • src/authentication-flows/passwordless.ts (2 hunks)
  • src/authentication-flows/universal.ts (4 hunks)
  • src/helpers/generate-auth-response.ts (2 hunks)
  • src/routes/oauth2/authenticate.ts (2 hunks)
  • src/routes/oauth2/dbconnections.ts (2 hunks)
  • src/routes/oauth2/passwordless.ts (2 hunks)
  • src/routes/saml/samlp.ts (3 hunks)
  • src/utils/client-info.ts (1 hunks)
🔇 Additional comments (14)
src/authentication-flows/universal.ts (1)

29-34: LGTM! Clean implementation of client info collection

The changes effectively integrate the new getClientInfo utility while improving code clarity through better variable naming.

src/routes/oauth2/authenticate.ts (1)

113-113: LGTM! Verify consistent implementation across authentication flows

The integration of getClientInfo is correct. However, let's ensure consistent implementation across all authentication paths.

✅ Verification successful

Implementation is consistent across authentication flows

The verification shows that:

  • getClientInfo is properly centralized in src/utils/client-info.ts and is the only place handling direct header access
  • It's consistently used across all relevant authentication flows:
    • OAuth2 authentication (src/routes/oauth2/authenticate.ts)
    • SAML flows (src/routes/saml/samlp.ts)
    • Database connections (src/routes/oauth2/dbconnections.ts)
    • Passwordless authentication (src/routes/oauth2/passwordless.ts)
    • Universal login flows (src/authentication-flows/universal.ts)
    • Password flows (src/authentication-flows/password.ts)

The implementation properly sanitizes headers with length limits and is used consistently throughout the authentication system.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistent usage of getClientInfo across authentication flows

# Check for any direct header access that might bypass getClientInfo
echo "Checking for direct header access patterns..."
rg -l "headers\.get\(['\"]user-agent['\"]\)" --type ts
rg -l "headers\.get\(['\"]x-real-ip['\"]\)" --type ts
rg -l "headers\.get\(['\"]auth0-client['\"]\)" --type ts

# Check for consistent import and usage of getClientInfo
echo "Checking for getClientInfo usage patterns..."
rg -l "import.*getClientInfo.*from.*['\"]\.\.\/utils\/client-info['\"]" --type ts

Length of output: 711


Script:

#!/bin/bash
# Description: Verify consistent usage of getClientInfo across authentication flows

# Check for any direct header access patterns
echo "Checking for direct header access patterns..."
rg -l "headers\.(get|\\[)" --type ts

# Check for getClientInfo usage patterns with fixed regex
echo "Checking for getClientInfo usage patterns..."
rg -l "getClientInfo" --type ts

# Show implementation of getClientInfo for verification
echo "Showing getClientInfo implementation..."
rg -A 5 "function getClientInfo" --type ts

# Check for other authentication flows
echo "Checking authentication-related files..."
fd -e ts -p "auth|login|sign"

Length of output: 4684

src/routes/saml/samlp.ts (1)

Line range hint 122-141: Verify spread operator safety with getClientInfo

The spread operator usage with getClientInfo could potentially override other properties in the login session object. Consider destructuring the specific properties needed instead.

Consider this safer alternative:

- ...getClientInfo(ctx.req.raw.headers),
+ const { useragent, ip, auth0Client } = getClientInfo(ctx.req.raw.headers);
+ useragent,
+ ip,
+ auth0Client,
✅ Verification successful

Let me verify what properties are expected in the login session object to ensure there are no conflicts.


Let me check the actual login session type/interface to see all its properties.


Let me try a different approach to find the login session type and its usage.


Let me check the migration files to understand the login session schema structure.


Based on the evidence gathered, I can now generate the final response:

Spreading getClientInfo properties is safe in this context

The spread of getClientInfo properties (auth0Client, ip, useragent) into the login session is safe as these properties don't conflict with any of the core login session fields (authParams, expires_at, id, tenant_id, client_id, etc.) defined in the schema. The properties are also properly sanitized with length limits in the getClientInfo function.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if getClientInfo returns any properties that could conflict with login session
ast-grep --pattern 'export function getClientInfo($_) {
  return {
    $$$
  }
}'

Length of output: 506


Script:

#!/bin/bash
# Check the login session interface/type definition
ast-grep --pattern 'interface $_ {
  $$$
  authParams: $_
  $$$
}'

# Also check for type definitions
ast-grep --pattern 'type $_ = {
  $$$
  authParams: $_
  $$$
}'

Length of output: 2117


Script:

#!/bin/bash
# Search for login session type definition
ast-grep --pattern 'type LoginSession = {
  $$$
}'

# Also check interface definition
ast-grep --pattern 'interface LoginSession {
  $$$
}'

# Check the create method signature to understand expected properties
ast-grep --pattern 'create($_, $_: {
  $$$
})'

Length of output: 157


Script:

#!/bin/bash
# Search for any type definitions or interfaces containing 'LoginSession'
rg -A 10 "type.*LoginSession|interface.*LoginSession"

# Search for the logins.create method usage to understand the type
rg -A 5 "logins\.create"

# Look for any files that might contain the login session definition
fd -e ts -e tsx "login.*session" --type f

Length of output: 9810


Script:

#!/bin/bash
# Check the universal login session migration files for schema
rg -A 10 "createTable|addColumn" migrate/migrations/2023-11-19T20:53:00_univeral-login-session.ts migrate/migrations/2024-05-16T10:45:00_vendor_id_in_universal_login_session.ts migrate/migrations/2024-05-23T15:53:00_auth0client_in_universal_login_session.ts

Length of output: 5078

src/authentication-flows/password.ts (1)

67-67: Verify user agent truncation implementation

The PR title mentions truncating user agent, but the truncation implementation isn't visible here. Need to verify if getClientInfo handles this.

✅ Verification successful

User agent truncation is properly implemented

The getClientInfo function in src/utils/client-info.ts correctly truncates the user agent to 256 characters using slice(0, 256).

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check getClientInfo implementation for user agent truncation
ast-grep --pattern 'export function getClientInfo($_) {
  $$$
  useragent = $$$
  $$$
}'

# Check for any string length or substring operations on user agent
rg -A 3 'useragent.*(?:substring|slice|substr|length)'

Length of output: 286

src/routes/oauth2/dbconnections.ts (2)

14-14: LGTM! Consistent with the centralization approach.

The import of getClientInfo aligns with the PR's objective to standardize client information extraction.


169-169: Consider adding error handling for missing headers.

While the implementation is correct, it might be good to ensure the function handles cases where headers are undefined or malformed.

src/authentication-flows/passwordless.ts (2)

26-26: LGTM! Consistent import placement.

The import follows the established pattern and is grouped logically with other utility imports.


125-131: LGTM! Clean implementation of login creation.

The implementation correctly uses getClientInfo to extract and include client information in the login creation process. The code maintains a clear separation of concerns and follows the established pattern.

src/helpers/generate-auth-response.ts (2)

30-30: LGTM! Consistent import.

The import follows the established pattern and is grouped logically with other utility imports.


88-88: LGTM! Clean implementation.

The implementation correctly uses getClientInfo to extract client information.

src/authentication-flows/oauth2.tsx (4)

24-24: LGTM: Clean import of client info utility

The addition of the getClientInfo utility import aligns well with the PR's objective of handling user agent information.


50-50: LGTM: Improved variable naming clarity

The renaming from session to loginSession improves code clarity by being more specific about the variable's purpose. The change is applied consistently throughout the file.

Also applies to: 55-55, 67-67


56-62: Verify user agent truncation implementation

While the integration of client info looks good, we should verify that the getClientInfo utility properly implements the user agent truncation as per the PR's objective.

Let's check the implementation of the getClientInfo utility:

✅ Verification successful

User agent truncation is properly implemented

The getClientInfo utility in src/utils/client-info.ts correctly implements user agent truncation by using slice(0, 256) on the user-agent header value, which ensures the string won't exceed 256 characters.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of user agent truncation in getClientInfo utility

# Search for the getClientInfo implementation
rg -A 10 "export.*getClientInfo" "src/utils/client-info"

# Look for any user agent length constraints or truncation logic
rg -i "useragent.*length|truncate.*useragent" "src/utils/client-info"

Length of output: 250


Script:

#!/bin/bash
# Let's first find the correct location of the getClientInfo utility
fd -t f "client-info"

# Then search for getClientInfo implementation across the codebase
rg -A 10 "export.*getClientInfo"

# Also search for any user agent related code
rg -i "useragent|user.?agent"

Length of output: 2858


56-62: Review logging of client information

Since we're collecting and storing client information in the login session, ensure that:

  1. No sensitive PII is being logged
  2. User agent string is properly sanitized
  3. IP addresses are handled according to privacy requirements

Let's check for any sensitive information logging:

✅ Verification successful

Client information handling appears secure

The implementation shows appropriate security measures:

  • User agent is truncated to 256 characters
  • IP addresses are truncated to 29 characters
  • Auth0 client information is truncated to 256 characters
  • No sensitive data is being logged (only stored with proper truncation)
  • Headers are accessed via safe getter methods
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential sensitive information logging

# Look for logging statements that might contain client info
rg "console.log.*client|ctx.set.*client" "src/authentication-flows/"

# Check for any debug logging of headers or client info
rg "log.*headers|console.*headers" "src/authentication-flows/"

Length of output: 440


Script:

#!/bin/bash
# Let's check the implementation of getClientInfo function to understand what data is being collected

# First, find the file containing getClientInfo
fd "getClientInfo"

# Then look for its implementation
ast-grep --pattern 'function getClientInfo($_) {
  $$$
}'

# Also check for any const/let/var declaration
ast-grep --pattern 'const getClientInfo = $_'

Length of output: 549

src/utils/client-info.ts Outdated Show resolved Hide resolved
src/routes/oauth2/passwordless.ts Outdated Show resolved Hide resolved
src/routes/oauth2/passwordless.ts Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (1)
src/utils/client-info.ts (1)

3-3: Add JSDoc documentation

Please add comprehensive documentation for this utility function:

  • Purpose and usage
  • Parameter description
  • Return value description
  • Security considerations
+/**
+ * Extracts and sanitizes client information from the request.
+ * 
+ * @param req - The Hono request object
+ * @returns ClientInfo object containing sanitized auth0Client, IP, and user agent
+ * @security Ensures proper truncation of potentially sensitive information
+ */
 export function getClientInfo(req: HonoRequest): ClientInfo {
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 4baf3d3 and cc2320e.

📒 Files selected for processing (10)
  • src/authentication-flows/oauth2.tsx (2 hunks)
  • src/authentication-flows/password.ts (2 hunks)
  • src/authentication-flows/passwordless.ts (2 hunks)
  • src/authentication-flows/universal.ts (4 hunks)
  • src/helpers/generate-auth-response.ts (2 hunks)
  • src/routes/oauth2/authenticate.ts (2 hunks)
  • src/routes/oauth2/dbconnections.ts (2 hunks)
  • src/routes/oauth2/passwordless.ts (2 hunks)
  • src/routes/saml/samlp.ts (3 hunks)
  • src/utils/client-info.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • src/routes/oauth2/passwordless.ts
  • src/helpers/generate-auth-response.ts
  • src/routes/oauth2/authenticate.ts
  • src/authentication-flows/universal.ts
  • src/routes/oauth2/dbconnections.ts
  • src/routes/saml/samlp.ts
  • src/authentication-flows/password.ts
  • src/authentication-flows/passwordless.ts
  • src/authentication-flows/oauth2.tsx
🔇 Additional comments (1)
src/utils/client-info.ts (1)

6-6: Review IP address truncation length

The current implementation truncates IP addresses to 29 characters, which might be too restrictive for IPv6 addresses that can be up to 45 characters long.

src/utils/client-info.ts Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
src/routes/oauth2/authorize.ts (2)

Line range hint 189-193: Consider adding type safety to universalAuth parameters

The updated parameter structure with named properties improves clarity. However, consider defining an interface for these parameters to ensure type safety and prevent potential missing properties.

interface UniversalAuthParams {
  ctx: Context;
  client: Client;
  auth0Client?: string;
  authParams: AuthParams;
  session?: Session;
  connection?: string;
  login_hint?: string;
}

Line range hint 1-195: Solid architectural decisions in auth flow refactoring

The changes maintain a clean separation of concerns while improving the consistency of client information handling across different authentication flows. The existing security measures (origin verification, redirect URI validation) remain intact.

Consider documenting the authentication flow decision tree (UI vs. non-UI strategies, single connection vs. multiple connections) in the codebase to help future maintainers understand the routing logic.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 81a530a and 45ecc17.

📒 Files selected for processing (4)
  • src/authentication-flows/oauth2.tsx (2 hunks)
  • src/routes/oauth2/authorize.ts (2 hunks)
  • src/routes/oauth2/passwordless.ts (2 hunks)
  • src/utils/client-info.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/utils/client-info.ts
  • src/authentication-flows/oauth2.tsx
  • src/routes/oauth2/passwordless.ts
🔇 Additional comments (2)
src/routes/oauth2/authorize.ts (2)

174-174: LGTM! Consistent with previous socialAuth call

The removal of auth0Client parameter is consistent with the previous instance, maintaining uniformity in the codebase.


169-169: Verify the socialAuth function signature change

The removal of the auth0Client parameter needs verification to ensure it matches the updated function signature in the authentication flows.

✅ Verification successful

Function signature change is correct

The socialAuth function signature in src/authentication-flows/oauth2.tsx matches the updated calls in src/routes/oauth2/authorize.ts. The function expects exactly four parameters: ctx, client, connectionName, and authParams. The removal of auth0Client parameter is consistent with the implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the socialAuth function signature in authentication flows
# Expected: Function should not require auth0Client parameter

ast-grep --pattern 'function socialAuth($_) {
  $$$
}'

Length of output: 56


Script:

#!/bin/bash
# Let's try a different approach to find socialAuth function
# First, let's use ripgrep to find the function definition
rg "socialAuth" -A 5

# Also search for any exports of socialAuth
rg "export.*socialAuth"

Length of output: 1622


Script:

#!/bin/bash
# Let's examine the full socialAuth function implementation
rg -A 20 "^export async function socialAuth"

Length of output: 1455

@markusahlstrand markusahlstrand merged commit ab01ce6 into main Dec 6, 2024
2 checks passed
@markusahlstrand markusahlstrand deleted the ma/refactor-useragent branch December 6, 2024 09:56
@coderabbitai coderabbitai bot mentioned this pull request Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant