Feature/ai module #124
Walkthrough

This update removes the legacy Python-based AI summarization module and introduces a new TypeScript/Node.js Gemini-powered summarization service. It adds new controllers, services, types, configuration files, and validation middleware, while updating dependencies and project scripts. Test suites and environment configuration are adjusted to support the new implementation.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ExpressRouter
    participant SummarizationController
    participant SummarizationService
    participant GeminiAPI
    Client->>ExpressRouter: POST /ai/summarize (with text)
    ExpressRouter->>SummarizationController: summarize(req, res, next)
    SummarizationController->>SummarizationService: summarize(request)
    SummarizationService->>GeminiAPI: generateContent(prompt, params)
    GeminiAPI-->>SummarizationService: summary result
    SummarizationService-->>SummarizationController: SummarizationResponse
    SummarizationController-->>ExpressRouter: success response
    ExpressRouter-->>Client: summary + metadata
```
Actionable comments posted: 9
🧹 Nitpick comments (10)
backend/src/modules/ai/routes/summarization.routes.ts (1)
1-15: Well-structured Express route with proper validation.

The route implementation follows Express best practices with proper middleware ordering and type safety. The validation middleware is correctly applied before the controller method.
Consider using dependency injection for the controller to improve testability:
```diff
-const summarizationController = new SummarizationController();
+const createController = () => new SummarizationController();

 router.post(
   '/summarize',
   validateRequest(SummarizationRequestSchema),
-  summarizationController.summarize
+  createController().summarize
 );
```

backend/src/modules/ai/types/models.types.ts (2)
1-4: Add JSDoc documentation for better type clarity.

Consider adding JSDoc comments to document the purpose and usage of the AIModel enum and its values.
```diff
+/**
+ * Supported AI models for text summarization.
+ */
 export enum AIModel {
+  /** Google's Gemini 2.0 Flash model */
   Gemini20Flash = 'gemini-2.0-flash',
+  /** Google's Gemini 1.5 Flash model */
   Gemini15Flash = 'gemini-1.5-flash',
 }
```
6-12: Add JSDoc documentation and consider validation ranges.

The ModelConfig interface would benefit from documentation explaining the purpose of each parameter and their typical value ranges.
```diff
+/**
+ * Configuration parameters for AI model generation.
+ */
 export interface ModelConfig {
+  /** Controls randomness in generation (0.0 to 2.0) */
   temperature: number;
+  /** Maximum number of tokens to generate */
   maxOutputTokens: number;
+  /** Controls diversity via nucleus sampling (0.0 to 1.0) */
   topP: number;
+  /** Reduces likelihood of repeating token sequences (0.0 to 2.0) */
   frequencyPenalty: number;
+  /** Reduces likelihood of repeating topics (0.0 to 2.0) */
   presencePenalty: number;
 }
```

backend/src/shared/middleware/validation.middleware.ts (1)
5-26: Consider improving error message formatting and security.

The current implementation has a few potential improvements:
- Security: JSON.stringify in error messages might expose sensitive data
- Performance: Large validation errors could create unwieldy JSON strings
- Readability: Error messages could be more user-friendly
```diff
 export const validateRequest = (schema: AnyZodObject) => {
   return async (req: Request, res: Response, next: NextFunction) => {
     try {
       await schema.parseAsync({
         body: req.body,
         query: req.query,
         params: req.params,
       });
       next();
     } catch (error) {
       if (error instanceof ZodError) {
         const validationErrors = error.errors.map((e) => ({
           path: e.path.join('.'),
           message: e.message,
         }));
-        next(new AppError(400, `Validation failed: ${JSON.stringify(validationErrors)}`, 'validation.middleware'));
+        const errorMessage = `Validation failed: ${validationErrors.map(e => `${e.path}: ${e.message}`).join(', ')}`;
+        next(new AppError(400, errorMessage, 'validation.middleware'));
       } else {
         next(error);
       }
     }
   };
 };
```

backend/src/modules/ai/config/models.ts (1)
3-18: Consider differentiating model configurations and adding environment-based customization.

Currently, both Gemini models have identical configurations. Consider:
- Model-specific tuning: Different models may benefit from different parameters
- Environment-based configs: Development vs production might need different settings
```diff
+// Base configuration that can be overridden per model
+const BASE_CONFIG: ModelConfig = {
+  temperature: 0.2,
+  maxOutputTokens: 1500,
+  topP: 1,
+  frequencyPenalty: 0.1,
+  presencePenalty: 0.1,
+};
+
 export const MODEL_CONFIGS: Record<AIModel, ModelConfig> = {
   [AIModel.Gemini20Flash]: {
-    temperature: 0.2,
-    maxOutputTokens: 1500,
-    topP: 1,
-    frequencyPenalty: 0.1,
-    presencePenalty: 0.1,
+    ...BASE_CONFIG,
+    // Gemini 2.0 might benefit from slightly higher creativity
+    temperature: 0.3,
   },
   [AIModel.Gemini15Flash]: {
-    temperature: 0.2,
-    maxOutputTokens: 1500,
-    topP: 1,
-    frequencyPenalty: 0.1,
-    presencePenalty: 0.1,
+    ...BASE_CONFIG,
+    // Keep more conservative settings for 1.5
   },
 };
```

Alternatively, consider making some parameters environment-configurable:
```typescript
const getEnvironmentConfig = (): Partial<ModelConfig> => ({
  maxOutputTokens: parseInt(process.env.AI_MAX_TOKENS || '1500'),
  temperature: parseFloat(process.env.AI_TEMPERATURE || '0.2'),
});
```

backend/src/modules/ai/controllers/summarization.controller.ts (2)
7-12: Consider implementing proper dependency injection.

The current implementation creates a service instance directly in the constructor. For better testability and flexibility, consider dependency injection.
```diff
 export class SummarizationController {
   private service: SummarizationService;

-  constructor() {
-    this.service = new SummarizationService();
+  constructor(service?: SummarizationService) {
+    this.service = service || new SummarizationService();
   }
```

This approach allows for easier unit testing by injecting mock services.
14-30: Enhance request validation and error context.

The controller implementation is solid, but consider these improvements:
- Type safety: The request data casting could be more explicit
- Error context: Add more context to error logging
```diff
 summarize = async (req: Request, res: Response, next: NextFunction) => {
   try {
     // Validation is handled by middleware
-    const requestData: SummarizationRequest = req.body;
+    const requestData = req.body as SummarizationRequest;
+
+    logger.debug('Processing summarization request', {
+      textLength: requestData.text.length,
+      model: requestData.options?.model,
+      format: requestData.options?.format,
+    });
+
     const result = await this.service.summarize(requestData);

     logger.info('Successfully generated summary', {
       textLength: requestData.text.length,
       processingTime: result.metadata.processingTime,
       model: result.metadata.model,
+      summaryLength: result.summary.length,
     });

     res.json(createSuccessResponse(result));
   } catch (error) {
+    logger.error('Summarization controller error', {
+      error: error instanceof Error ? error.message : 'Unknown error',
+      textLength: req.body?.text?.length || 0
+    });
     next(error);
   }
 };
```

backend/tests/user-module.spec.js (1)
70-82: Improve safety with optional chaining.

The cookie checking logic is comprehensive, but can be made safer and more concise.
Apply this diff to use optional chaining as suggested by static analysis:
```diff
-    if (response.headers && response.headers['set-cookie']) {
-      const cookies = response.headers['set-cookie'];
+    if (response.headers?.['set-cookie']) {
+      const cookies = response.headers['set-cookie'];
```

🧰 Tools
🪛 Biome (1.9.4)
[error] 71-71: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
backend/tests/user-module.spec.ts (2)
15-15: Consider removing the unused variable instead of disabling the linter.

The `authToken` variable is assigned but never used. Consider either removing it or implementing token-based authentication in the tests.

```diff
-// eslint-disable-next-line @typescript-eslint/no-unused-vars
-let authToken: string | null = null;
+let authToken: string | null = null; // TODO: Implement token-based auth tests
```
52-87: Excellent improvements to login test flexibility and debugging.

The changes make the test more robust by:
- Avoiding direct userID comparison (more flexible for different test environments)
- Supporting backward compatibility with optional properties
- Adding helpful cookie debugging without failing tests
However, consider using optional chaining as suggested by the static analysis tool.
Apply the static analysis suggestion for cleaner code:
```diff
-    if (response.headers && response.headers['set-cookie']) {
+    if (response.headers?.['set-cookie']) {
```

🧰 Tools
🪛 Biome (1.9.4)
[error] 74-74: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`backend/src/modules/ai/poetry.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (37)
- `backend/.env.sample` (1 hunks)
- `backend/jest.config.js` (1 hunks)
- `backend/package.json` (3 hunks)
- `backend/src/app.ts` (2 hunks)
- `backend/src/modules/ai/.env.sample` (0 hunks)
- `backend/src/modules/ai/Dockerfile` (0 hunks)
- `backend/src/modules/ai/Makefile` (0 hunks)
- `backend/src/modules/ai/app.py` (0 hunks)
- `backend/src/modules/ai/blueprints/general.py` (0 hunks)
- `backend/src/modules/ai/blueprints/summarize.py` (0 hunks)
- `backend/src/modules/ai/clear_pycache.sh` (0 hunks)
- `backend/src/modules/ai/config/langchain_ai.py` (0 hunks)
- `backend/src/modules/ai/config/models.ts` (1 hunks)
- `backend/src/modules/ai/config/prompts.ts` (1 hunks)
- `backend/src/modules/ai/config/rich_logging.py` (0 hunks)
- `backend/src/modules/ai/controllers/summarization.controller.ts` (1 hunks)
- `backend/src/modules/ai/lib/constants.py` (0 hunks)
- `backend/src/modules/ai/lib/sanitizer.py` (0 hunks)
- `backend/src/modules/ai/models/mai.py` (0 hunks)
- `backend/src/modules/ai/models/prompt.py` (0 hunks)
- `backend/src/modules/ai/podep.ps1` (0 hunks)
- `backend/src/modules/ai/podep.sh` (0 hunks)
- `backend/src/modules/ai/pyproject.toml` (0 hunks)
- `backend/src/modules/ai/requirements.txt` (0 hunks)
- `backend/src/modules/ai/routes/summarization.routes.ts` (1 hunks)
- `backend/src/modules/ai/run.ps1` (0 hunks)
- `backend/src/modules/ai/run.sh` (0 hunks)
- `backend/src/modules/ai/services/summarization.service.ts` (1 hunks)
- `backend/src/modules/ai/types/index.ts` (1 hunks)
- `backend/src/modules/ai/types/models.types.ts` (1 hunks)
- `backend/src/modules/ai/types/summarization.types.ts` (1 hunks)
- `backend/src/shared/middleware/validation.middleware.ts` (1 hunks)
- `backend/src/shared/utils/logger.ts` (1 hunks)
- `backend/tests/ai-module.spec.ts` (1 hunks)
- `backend/tests/user-module.spec.js` (2 hunks)
- `backend/tests/user-module.spec.ts` (6 hunks)
- `backend/tsconfig.json` (1 hunks)
💤 Files with no reviewable changes (19)
- backend/src/modules/ai/run.sh
- backend/src/modules/ai/run.ps1
- backend/src/modules/ai/.env.sample
- backend/src/modules/ai/clear_pycache.sh
- backend/src/modules/ai/Dockerfile
- backend/src/modules/ai/pyproject.toml
- backend/src/modules/ai/requirements.txt
- backend/src/modules/ai/lib/constants.py
- backend/src/modules/ai/config/rich_logging.py
- backend/src/modules/ai/podep.sh
- backend/src/modules/ai/Makefile
- backend/src/modules/ai/blueprints/general.py
- backend/src/modules/ai/lib/sanitizer.py
- backend/src/modules/ai/app.py
- backend/src/modules/ai/podep.ps1
- backend/src/modules/ai/models/mai.py
- backend/src/modules/ai/config/langchain_ai.py
- backend/src/modules/ai/models/prompt.py
- backend/src/modules/ai/blueprints/summarize.py
🧰 Additional context used
🧬 Code Graph Analysis (5)
backend/src/modules/ai/routes/summarization.routes.ts (3)
- backend/src/modules/ai/controllers/summarization.controller.ts (1): `SummarizationController` (7-33)
- backend/src/shared/middleware/validation.middleware.ts (1): `validateRequest` (5-26)
- backend/src/modules/ai/types/summarization.types.ts (1): `SummarizationRequestSchema` (5-16)

backend/src/shared/middleware/validation.middleware.ts (1)
- backend/src/shared/errors/index.ts (1): `AppError` (1-11)

backend/src/modules/ai/controllers/summarization.controller.ts (4)
- backend/src/modules/ai/services/summarization.service.ts (1): `SummarizationService` (11-73)
- backend/src/modules/ai/types/summarization.types.ts (1): `SummarizationRequest` (18-25)
- backend/src/shared/utils/logger.ts (1): `logger` (28-49)
- backend/src/shared/utils/response.ts (1): `createSuccessResponse` (12-24)

backend/src/modules/ai/config/models.ts (1)
- backend/src/modules/ai/types/models.types.ts (1): `ModelConfig` (6-12)

backend/src/modules/ai/services/summarization.service.ts (6)
- backend/src/shared/config/environment.ts (1): `env` (28-64)
- backend/src/shared/errors/index.ts (1): `AppError` (1-11)
- backend/src/modules/ai/config/prompts.ts (2): `SUMMARY_SYSTEM_PROMPT` (1-26), `SUMMARY_USER_PROMPT` (28-32)
- backend/src/modules/ai/config/models.ts (1): `MODEL_CONFIGS` (3-18)
- backend/src/modules/ai/types/summarization.types.ts (2): `SummarizationRequest` (18-25), `SummarizationResponse` (27-34)
- backend/src/shared/utils/logger.ts (1): `logger` (28-49)
🪛 Biome (1.9.4)
backend/tests/user-module.spec.js
[error] 71-71: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
backend/tests/user-module.spec.ts
[error] 74-74: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
🔇 Additional comments (32)
backend/.env.sample (1)
36-39: LGTM! Environment configuration properly structured.

The addition of the `GEMINI_API_KEY` environment variable is correctly placed and follows the established formatting pattern. This supports the new AI summarization functionality.

backend/src/shared/utils/logger.ts (2)
14-26: Excellent logging format separation!

The separation of console and file logging formats is a great improvement:
- Console logs remain colorized for better developer experience
- File logs are uncolorized to prevent ANSI escape codes in log files
- Clean separation of concerns with dedicated format objects
30-48: Proper format assignment to transports.

The format assignment is correctly implemented with `consoleFormat` for terminal output and `fileFormat` for file outputs. Moving `exitOnError` above the transports array also improves readability.

backend/src/app.ts (2)
18-18: AI module integration enabled.

The uncommented import properly integrates the new TypeScript-based AI summarization module.
63-63:❓ Verification inconclusive
Verify AI endpoint functionality.
The AI routes are now mounted at the `/ai` path. Ensure the new summarization endpoint `/ai/summarize` functions correctly with the Gemini API integration.

Run the following script to verify the AI module structure and endpoint registration:
🏁 Script executed:
```shell
#!/bin/bash
# Description: Verify AI module structure and route registration

# Check if AI module files exist
echo "=== Checking AI module structure ==="
fd -t f -e ts . backend/src/modules/ai

# Check for summarization route definition
echo -e "\n=== Checking summarization routes ==="
cat backend/src/modules/ai/routes/summarization.routes.ts

# Verify Gemini service implementation
echo -e "\n=== Checking Gemini service ==="
rg -A 10 "class.*Service" backend/src/modules/ai/services/
```

Length of output: 2274
Verify AI Summarization Endpoint Registration and Integration
The AI summarization route is correctly wired up:
- `app.use('/ai', summarizationRoutes);` in `backend/src/app.ts`
- `POST /summarize` defined in `backend/src/modules/ai/routes/summarization.routes.ts`
- `SummarizationService` in `backend/src/modules/ai/services/summarization.service.ts` checks for `GEMINI_API_KEY` and initializes the Gemini model

Please perform an end-to-end test of `POST /ai/summarize` with a valid payload to confirm:

- Request validation against `SummarizationRequestSchema`
- Successful invocation of the Gemini API
- Proper error handling when the API key is missing or invalid
backend/src/modules/ai/types/index.ts (1)
1-2: LGTM! Clean barrel export pattern.

The re-export structure is well-organized and follows TypeScript best practices for centralizing type definitions. This makes imports cleaner throughout the AI module.
backend/jest.config.js (1)
14-17: Good addition for test performance.

Adding `testPathIgnorePatterns` to exclude `/node_modules/` and `/dist/` is a best practice that prevents Jest from discovering tests in build artifacts and dependencies, improving test performance.

backend/src/modules/ai/controllers/summarization.controller.ts (1)
20-24:✅ Verification successful
Review logging for potential PII exposure.
The current logging includes the full text length, which is generally safe. However, ensure that no actual text content or other sensitive information gets logged inadvertently in future modifications.
Current logging looks secure, but verify that the `SummarizationService` referenced in line 2 and the `logger` utility don't log request content:
🏁 Script executed:
```shell
#!/bin/bash
# Search for potential PII logging in related service and logger files
echo "Checking for potential PII exposure in logging..."

# Check the summarization service for any text logging
rg -A 3 -B 3 "logger\.(info|debug|warn|error)" backend/src/modules/ai/services/
rg -A 3 -B 3 "console\.(log|info|debug|warn|error)" backend/src/modules/ai/services/

# Check logger configuration for sensitive data exposure
rg -A 5 -B 5 "(req\.body|request\.text|requestData\.text)" backend/src/shared/utils/logger.ts
```

Length of output: 946
Logging PII Check Complete
I’ve verified that:
- The controller only logs metadata (`textLength`, `processingTime`, `model`), not the actual text.
- The `SummarizationService` only logs errors (`logger.error('Error in summarization service', { error })`) without including request content.
- The shared `logger` utility doesn't reference `req.body`, `request.text`, or `requestData.text`.

No PII exposure risk detected; no further changes needed.
backend/tests/user-module.spec.js (3)
52-54: Good addition for cookie-based authentication.

Adding `withCredentials: true` enables proper cookie handling for authentication flows, which aligns well with the enhanced cookie checking logic below.

56-69: Excellent backward compatibility approach.

The flexible assertions make the test more robust by supporting both token-based and cookie-based authentication responses. This approach ensures the test works across different authentication implementations.

147-152: Good practice for test cleanup.

The `afterAll` cleanup hook with a delay ensures all asynchronous operations complete properly, which helps prevent test suite issues like hanging handles.

backend/src/modules/ai/config/prompts.ts (2)
1-26: Well-structured system prompt with comprehensive guidelines.

The system prompt provides clear, detailed instructions covering format, style, content restrictions, and quality requirements. The four-section structure (FORMAT, STYLE, AVOID, ENSURE) makes it easy to understand and maintain.

28-32: Clean user prompt template.

The user prompt effectively references the system guidelines and uses a clear placeholder for text substitution. The reminder about clarity and conciseness reinforces the system prompt instructions.
backend/package.json (4)
7-7: Good addition of dedicated build script.

Adding a separate `build` script improves the development workflow by allowing explicit TypeScript compilation without running the application.

12-12: Useful verbose testing option.

The `test:verbose` script provides better debugging capabilities for test development and troubleshooting.
37-38:✅ Verification successful
Verify compatibility and security of new AI dependencies.
The addition of `@google/generative-ai` and `@langchain/core` supports the new TypeScript-based AI summarization functionality. Please ensure these versions are compatible and free from known vulnerabilities.
🌐 Web query:
Are there any known security vulnerabilities in @google/generative-ai version 0.24.1 and @langchain/core version 0.3.57?

💡 Result:
Security Vulnerabilities in @google/generative-ai v0.24.1
Based on the available security databases and package repositories, there are no known direct security vulnerabilities reported for `@google/generative-ai` version 0.24.1 as of the current date[1][2]. Both Snyk and jsDelivr list this version as having zero reported vulnerabilities. The changelog for the package also does not mention any security-related patches or advisories for this version[3].

Security Vulnerabilities in @langchain/core v0.3.57
There are currently no publicly disclosed vulnerabilities for `@langchain/core` version 0.3.57 in the provided search results or major vulnerability databases[9]. Known critical vulnerabilities affecting the broader LangChain ecosystem (such as CVE-2023-46229 and CVE-2023-44467) affect Python versions prior to 0.0.317 and do not apply to the `@langchain/core` JavaScript/TypeScript package, especially not to the 0.3.x series[10].

Importantly, a recently disclosed moderate-severity file disclosure vulnerability (CVE-2024-10940) affects the Python package `langchain-core` in versions `>=0.3.0,<0.3.15`[7]. However, your queried version (0.3.57) is well outside the affected range, and is therefore not vulnerable to this issue[7].

Summary Table
| Package | Version | Known Vulnerabilities? | Notes |
|---|---|---|---|
| @google/generative-ai | 0.24.1 | No | No reported vulnerabilities as of this date[1][2][3]. |
| @langchain/core (JavaScript) | 0.3.57 | No | No reported vulnerabilities for this version[9]; Python issues do not apply. |

Conclusion
- @google/generative-ai@0.24.1: No known security vulnerabilities.
- @langchain/core@0.3.57: No known security vulnerabilities for this JavaScript/TypeScript package and version; all known critical vulnerabilities apply either to older Python versions or versions outside your range.
It is always recommended to regularly check the official advisories and security databases for updates related to the packages in use.
Citations:
- 1: https://security.snyk.io/package/npm/@google%2Fgenerative-ai/0.24.1
- 2: https://www.jsdelivr.com/package/npm/@google/generative-ai
- 3: https://github.com/google-gemini/generative-ai-js/blob/main/CHANGELOG.md
- 4: https://cloud.google.com/vertex-ai/generative-ai/docs/security-bulletins
- 5: https://security.googleblog.com/2024/11/
- 6: https://vulert.com/vuln-db/pypi-langchain-124988
- 7: GHSA-5chr-fjjv-38qv
- 8: vercel/ai#3205
- 9: https://www.npmjs.com/package/@langchain/core
- 10: https://www.csa.gov.sg/alerts-and-advisories/alerts/al-2024-092
Security Check Passed for New AI Dependencies
No known security vulnerabilities have been reported for `@google/generative-ai@0.24.1` or `@langchain/core@0.3.57`. These versions are considered safe to use; no further action is required. Keep monitoring official advisories for future updates.
41-41:❓ Verification inconclusive
Verify updated dependency versions.
The axios upgrade from 1.7.7 to 1.9.0 and stripe update should be verified for breaking changes and security improvements.
Also applies to: 53-53
🌐 Web query:
What are the breaking changes and security improvements in axios 1.9.0 compared to 1.7.7?

💡 Result:
Axios version 1.9.0 introduces several important security improvements compared to version 1.7.7. Notably, it addresses the Server-Side Request Forgery (SSRF) vulnerability identified as CVE-2025-27152. This vulnerability allowed attackers to send requests to unintended internal hosts, potentially leading to unauthorized data access. The fix involves setting the `allowAbsoluteUrls` attribute to `false` by default when processing requested URLs, thereby mitigating the risk of SSRF attacks. (security.snyk.io)

Additionally, Axios 1.9.0 resolves the Cross-Site Request Forgery (CSRF) vulnerability (CVE-2023-45857), which previously allowed the `X-XSRF-TOKEN` header to be inserted using the `XSRF-TOKEN` cookie value in all requests when the `XSRF-TOKEN` cookie was available and the `withCredentials` setting was enabled. This could potentially lead to the bypass of CSRF defenses. The fix ensures that the `X-XSRF-TOKEN` header is only included in requests to the same origin, enhancing the security against CSRF attacks. (github.com)

Regarding breaking changes, the available information does not specify any introduced in version 1.9.0. However, it's important to note that Axios has previously addressed vulnerabilities in earlier versions, such as the Inefficient Regular Expression Complexity vulnerability (CVE-2021-3749) in version 0.19.0. Users are encouraged to review the Axios changelog for detailed information on all updates and potential breaking changes. (github.com)
For a comprehensive list of security improvements and any breaking changes, please refer to the Axios changelog.
Citations:
- 1: https://security.snyk.io/package/npm/axios/0.29.0?utm_source=openai
- 2: https://github.com/axios/axios/blob/v1.x/CHANGELOG.md?utm_source=openai
- 3: https://github.com/axios/axios/blob/v1.x/CHANGELOG.md?utm_source=openai
Verify Dependency Upgrades
Please confirm that both the Axios and Stripe version bumps do not introduce unintended behavior or regressions:
Axios 1.9.0
- Security fixes:
  - SSRF (CVE-2025-27152): `allowAbsoluteUrls` now defaults to `false`.
  - CSRF (CVE-2023-45857): `X-XSRF-TOKEN` header only sent to same-origin requests.
- No breaking changes are explicitly documented, but review the full changelog for any minor behavior adjustments: https://github.com/axios/axios/blob/v1.x/CHANGELOG.md

Stripe

- Please review the Stripe package's changelog or release notes for any breaking changes and security improvements introduced by the version bump.
backend/src/modules/ai/types/summarization.types.ts (4)
5-16: Robust validation schema with appropriate constraints.

The Zod schema provides comprehensive validation with:
- Text length limits (1-10,000 characters) to prevent abuse
- Optional configuration with sensible defaults
- Proper enum validation for format and model
- Express-compatible structure with body/query/params
18-25: Clean and consistent type definitions.

The TypeScript types align well with the Zod schema and provide clear contracts for the API consumers.

27-40: Well-designed response and error interfaces.

The response interface includes useful metadata (processing time, token count) for monitoring and debugging. The error interface follows standard error handling patterns with optional details for flexibility.
2-2:✅ Verification successful
Verify the models.types import exists.
The import references a relative path to models.types, which should contain the AIModel enum definition.
🏁 Script executed:
```shell
#!/bin/bash
# Description: Verify that the AIModel type exists in the models.types file
# Expected: Find the AIModel definition in the models.types file

fd -e ts -x grep -l "AIModel" {} \; | head -10
ast-grep --pattern 'export enum AIModel { $$$ }'
```

Length of output: 590
Import of AIModel Verified

The `AIModel` enum is exported from `backend/src/modules/ai/types/models.types.ts`, so the import path `./models.types` in `summarization.types.ts` is correct.

backend/tests/user-module.spec.ts (9)
2-2: Good addition of afterAll import for test cleanup.

The import of `afterAll` enables proper test cleanup, which is used later in the file.

46-49: Good improvement for cookie-based authentication.

Adding `withCredentials: true` properly enables cookie handling for authentication, which aligns with modern authentication patterns.

112-116: Smart guard clause to prevent cascading test failures.

This prevents the profile update test from failing if the registration test didn't succeed, making the test suite more resilient.

128-129: Good timeout increase for database propagation.

Increasing the timeout from 500ms to 1000ms allows more time for database changes to propagate, reducing flaky test failures.

140-141: Improved error handling for test continuity.

Converting throwing errors to logged warnings allows subsequent tests to continue even if profile updates fail, making the test suite more resilient.

147-151: Consistent guard clause implementation.

Applying the same guard pattern as the profile update test maintains consistency and prevents cascading failures.

163-164: Consistent timeout handling.

Using the same 1000ms timeout as the profile update test ensures consistent database propagation time.

174-175: Consistent error handling pattern.

Maintaining the same non-throwing error handling pattern as the profile update test ensures test suite continuity.
179-184: Proper test cleanup implementation.

The `afterAll` hook with delay ensures all asynchronous operations complete before test teardown, preventing resource leaks.

backend/src/modules/ai/services/summarization.service.ts (2)
15-32: Good initialization with proper API key validation.

The constructor properly validates the Gemini API key and initializes the model with system instructions and configuration. The error handling provides clear guidance when the API key is missing.
65-71: Good error handling with proper logging.

The error handling properly logs errors, preserves `AppError` instances, and wraps unknown errors appropriately. This provides good debugging capabilities while maintaining consistent error interfaces.
```typescript
it('should return 400 for invalid input', async () => {
  try {
    await axios.post(`${API_URL}/ai/summarize`, {}); // Missing required content
    // If we reach here, the test should fail
    expect(true).toBe(false); // This line should not be reached
  } catch (error: any) {
    expect(error.response.status).toBe(400);
    expect(error.response.data).toHaveProperty('error');
  }
});
```
🛠️ Refactor suggestion
Improve negative test structure.
The current pattern with unreachable code (line 28) is confusing and not a best practice for testing expected failures.
```diff
 it('should return 400 for invalid input', async () => {
-  try {
-    await axios.post(`${API_URL}/ai/summarize`, {}); // Missing required content
-    // If we reach here, the test should fail
-    expect(true).toBe(false); // This line should not be reached
-  } catch (error: any) {
-    expect(error.response.status).toBe(400);
-    expect(error.response.data).toHaveProperty('error');
-  }
+  await expect(
+    axios.post(`${API_URL}/ai/summarize`, {}) // Missing required content
+  ).rejects.toMatchObject({
+    response: {
+      status: 400,
+      data: expect.objectContaining({
+        error: expect.any(String)
+      })
+    }
+  });
 });
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```typescript
it('should return 400 for invalid input', async () => {
  await expect(
    axios.post(`${API_URL}/ai/summarize`, {}) // Missing required content
  ).rejects.toMatchObject({
    response: {
      status: 400,
      data: expect.objectContaining({
        error: expect.any(String)
      })
    }
  });
});
```
🤖 Prompt for AI Agents
In backend/tests/ai-module.spec.ts around lines 24 to 33, the test for invalid
input uses a try-catch block with an unreachable assertion that forces failure
if no error is thrown, which is confusing. Refactor the test to use the testing
framework's built-in methods for asserting promise rejections, such as expecting
the axios.post call to reject with a 400 status, removing the need for manual
try-catch and unreachable code.
```typescript
describe('Summarization Route Tests', () => {
  it('should return a summary for valid input', async () => {
    try {
      const response = await axios.post(`${API_URL}/ai/summarize`, {
        text: 'This is a long text that needs summarizing. It contains multiple sentences and should be long enough to be summarized effectively by the AI model. The summary should capture the main points while being concise and clear.',
        options: { model: 'gemini-2.0-flash' }
      });

      expect(response.status).toBe(200);
      expect(response.data).toHaveProperty('data');
      expect(response.data.data).toHaveProperty('summary');
      expect(typeof response.data.data.summary).toBe('string');
    } catch (error: any) {
      console.error('Test failed with error:', error.response?.data || error.message);
      throw error;
    }
  });

  it('should return 400 for invalid input', async () => {
    try {
      await axios.post(`${API_URL}/ai/summarize`, {}); // Missing required content
      // If we reach here, the test should fail
      expect(true).toBe(false); // This line should not be reached
    } catch (error: any) {
      expect(error.response.status).toBe(400);
      expect(error.response.data).toHaveProperty('error');
    }
  });
});
```
💡 Verification agent
🧩 Analysis chain
Consider adding server lifecycle management.
These integration tests assume the server is running, which may not be reliable in CI/CD environments. Consider adding setup/teardown to start/stop the server or use supertest for more reliable testing.
Would you like me to verify if there's existing server lifecycle management in the test setup, or if supertest is being used elsewhere in the project?
🏁 Script executed:
```bash
#!/bin/bash
# Check for existing test setup patterns and supertest usage
echo "=== Checking for test setup files ==="
fd -e js -e ts -x grep -l "beforeAll\|afterAll\|setupFiles" {} \;
echo -e "\n=== Checking for supertest usage ==="
rg -A 3 "import.*supertest|require.*supertest"
echo -e "\n=== Checking package.json for supertest dependency ==="
fd package.json -x grep -A 5 -B 5 "supertest" {} \;
```

Length of output: 545
🏁 Script executed:
```bash
#!/bin/bash
# Display the first 200 lines of the user-module spec to check server lifecycle management
sed -n '1,200p' backend/tests/user-module.spec.ts
```

Length of output: 7767
Add server lifecycle management to your integration tests
The AI summarization tests in backend/tests/ai-module.spec.ts (and similarly the user-module tests in backend/tests/user-module.spec.ts) currently assume an external server is already running at http://localhost:3000. This can lead to flaky CI builds. To make your tests self-contained, either:
- Import and start your app server in a `beforeAll` hook and close it in `afterAll`; or
- Switch to using `supertest` so you can invoke routes directly without relying on an external process.
Example using supertest:
```typescript
import request from 'supertest';
import app from '../../src/app'; // your Express/Fastify/Koa instance

let server: any;
beforeAll(() => { server = app.listen(0); });
afterAll(async () => { await server.close(); });

describe('Summarization Route Tests', () => {
  it('should return a summary for valid input', async () => {
    const response = await request(server)
      .post('/ai/summarize')
      .send({ text: '…', options: { model: 'gemini-2.0-flash' } });

    expect(response.status).toBe(200);
    expect(response.body.data.summary).toEqual(expect.any(String));
  });

  it('should return 400 for invalid input', async () => {
    const response = await request(server)
      .post('/ai/summarize')
      .send({});

    expect(response.status).toBe(400);
    expect(response.body).toHaveProperty('error');
  });
});
```

• File(s) needing updates:
- backend/tests/ai-module.spec.ts
- backend/tests/user-module.spec.ts
🤖 Prompt for AI Agents
In backend/tests/ai-module.spec.ts lines 6 to 34, the tests currently assume the
server is already running at a fixed URL, which can cause flaky CI builds. To
fix this, add server lifecycle management by importing your app instance and
starting the server in a beforeAll hook, then closing it in afterAll.
Alternatively, replace axios calls with supertest requests directly to the app
server instance, removing reliance on an external process. This makes tests
self-contained and more reliable.
```typescript
async summarize(request: SummarizationRequest): Promise<SummarizationResponse> {
  const startTime = Date.now();
  const model = request.options?.model || AIModel.Gemini15Flash;

  try {
    // Format the prompt with system and user messages combined
    const formattedPrompt = await this.prompt.format({
      text: request.text,
    });
```
🛠️ Refactor suggestion
Add input validation for text length.
The service should validate the input text length to prevent API errors and excessive costs from very large inputs.
```diff
 async summarize(request: SummarizationRequest): Promise<SummarizationResponse> {
   const startTime = Date.now();
   const modelType = request.options?.model || AIModel.Gemini15Flash;
+
+  // Validate input text length
+  if (!request.text || request.text.trim().length === 0) {
+    throw new AppError(400, 'Text to summarize cannot be empty', 'summarization.service');
+  }
+
+  // Check for reasonable text length limits (e.g., 100KB)
+  if (request.text.length > 100000) {
+    throw new AppError(400, 'Text too long for summarization (max 100KB)', 'summarization.service');
+  }
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/src/modules/ai/services/summarization.service.ts around lines 34 to
42, add validation to check the length of request.text before processing. If the
text exceeds a defined maximum length, throw an error or return a validation
response to prevent API errors and control costs. Implement this check early in
the summarize method to avoid unnecessary processing of overly large inputs.
```typescript
const encoding = getEncoding('cl100k_base');
const promptTokens = encoding.encode(formattedPrompt).length;
const summaryTokens = encoding.encode(summary).length;
const tokenCount = promptTokens + summaryTokens;
```
💡 Verification agent
🧩 Analysis chain
Consider using Gemini-specific tokenizer for accurate token counting.
Using OpenAI's cl100k_base tokenizer for Gemini models may provide inaccurate token counts since different models use different tokenization strategies.
🌐 Web query:
Does Google Gemini have its own tokenizer library for accurate token counting in JavaScript/TypeScript?
💡 Result:
As of May 2025, Google has not released a dedicated tokenizer library for JavaScript or TypeScript specifically tailored for the Gemini model. However, there are alternative solutions available:
- `@lenml/tokenizer-gemini`: This npm package offers a tokenizer based on `@lenml/tokenizers`. It provides functions like `encode()` and `_encode_text()` for tokenization tasks. (npmjs.com)
- `gpt-tokenizer`: A pure JavaScript implementation of a Byte Pair Encoding (BPE) tokenizer compatible with OpenAI's models, including GPT-2, GPT-3, and GPT-4. It's designed for efficient tokenization in JavaScript environments. (jspm-packages.deno.dev)
Additionally, Google's official Node.js/TypeScript library for the Gemini API, @google/genai, is available. While it doesn't include a tokenizer, it offers functionalities to interact with Gemini models. (googleapis.github.io)
For a tokenizer specifically designed for Google's models, you might consider SentencePiece, a language-independent subword tokenizer and detokenizer. Although primarily implemented in C++ and Python, it can be integrated into JavaScript/TypeScript projects through bindings or by using its pre-trained models. (arxiv.org)
In summary, while there's no official Google Gemini tokenizer for JavaScript/TypeScript, the above alternatives can assist with tokenization tasks in your projects.
Citations:
- 1: https://www.npmjs.com/package/%40lenml/tokenizer-gemini?utm_source=openai
- 2: https://jspm-packages.deno.dev/package/gpt-tokenizer%402.9.0?utm_source=openai
- 3: https://googleapis.github.io/js-genai/index.html?utm_source=openai
- 4: https://arxiv.org/abs/1808.06226?utm_source=openai
Use a Gemini-compatible tokenizer for accurate token counts
OpenAI’s `cl100k_base` encoding can miscount tokens for Gemini models since they use a different BPE scheme. As of May 2025, there’s no official Google-released JS/TS tokenizer for Gemini, but you can choose one of these community-maintained options:
- `@lenml/tokenizer-gemini` (npm)
- `gpt-tokenizer` (pure-JS BPE)
- SentencePiece via JS bindings
Please update backend/src/modules/ai/services/summarization.service.ts to use one of these instead of getEncoding('cl100k_base'). For example:
```diff
-import { getEncoding } from 'some-tokenizer-lib';
+import { encode as geminiEncode } from '@lenml/tokenizer-gemini';

-const promptTokens = getEncoding('cl100k_base').encode(formattedPrompt).length;
-const summaryTokens = getEncoding('cl100k_base').encode(summary).length;
+const promptTokens = geminiEncode(formattedPrompt).length;
+const summaryTokens = geminiEncode(summary).length;
 const tokenCount = promptTokens + summaryTokens;
```

Run a few samples through the Gemini API to validate your counts.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/src/modules/ai/services/summarization.service.ts around lines 52 to
55, replace the use of OpenAI's 'cl100k_base' tokenizer with a Gemini-compatible
tokenizer to ensure accurate token counting for Gemini models. Choose a
community-supported tokenizer such as '@lenml/tokenizer-gemini' or
'gpt-tokenizer', install it as a dependency, and update the code to use its
encode function for tokenizing the prompt and summary. After implementing,
validate token counts by running sample inputs through the Gemini API to confirm
accuracy.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/tests/user-module.spec.ts (1)
52-87: Improve cookie checking logic with optional chaining. The test improvements make it more flexible and robust. However, the cookie checking logic can be simplified using optional chaining.
Apply this diff to simplify the cookie checking logic:

```diff
-      // Check for cookies if they're being set
-      if (response.headers && response.headers['set-cookie']) {
-        const cookies = response.headers['set-cookie'];
+      // Check for cookies if they're being set
+      if (response.headers?.['set-cookie']) {
+        const cookies = response.headers['set-cookie'];
```
- // Check for cookies if they're being set - if (response.headers && response.headers['set-cookie']) { - const cookies = response.headers['set-cookie']; + // Check for cookies if they're being set + if (response.headers?.['set-cookie']) { + const cookies = response.headers['set-cookie'];🧰 Tools
🪛 Biome (1.9.4)
[error] 74-74: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- backend/src/modules/ai/services/summarization.service.ts (1 hunks)
- backend/tests/ai-module.spec.ts (1 hunks)
- backend/tests/user-module.spec.ts (6 hunks)
✅ Files skipped from review due to trivial changes (1)
- backend/tests/ai-module.spec.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/src/modules/ai/services/summarization.service.ts
🧰 Additional context used
🪛 Biome (1.9.4)
backend/tests/user-module.spec.ts
[error] 74-74: Change to an optional chain.
Unsafe fix: Change to an optional chain.
(lint/complexity/useOptionalChain)
🔇 Additional comments (9)
backend/tests/user-module.spec.ts (9)
2-2: Good addition for test cleanup. Adding the `afterAll` import enables proper test cleanup, which is used later in the file.

5-5: Excellent configuration flexibility. Using an environment variable for the API URL makes the tests more flexible across different environments.

15-15: Appropriate ESLint suppression. The eslint disable comment is justified since `authToken` is intentionally unused but may be needed for future test enhancements.

46-49: Good addition for cookie-based authentication. Adding `withCredentials: true` properly enables cookie handling for authentication, which aligns with the subsequent cookie verification logic.

112-116: Excellent defensive programming. Adding the null check for `userId` prevents test failures when registration doesn't succeed, making the test suite more robust.

128-142: Good timeout and error handling improvements. Increasing the timeout to 1000ms and logging errors without failing helps address test flakiness while maintaining visibility into issues.

147-151: Consistent defensive programming. The null check pattern is consistently applied across update tests, which is good practice.

163-176: Consistent improvements in error handling. The same timeout and error handling pattern is applied consistently across similar test scenarios.

179-184: Proper test cleanup implementation. The `afterAll` hook with a delay ensures asynchronous operations complete before test teardown, which is a good practice for preventing resource leaks.
LGTM 👍 & GTM
*note - did not integrate with the rest of the site because caching (save to pg db in content) needs to be done on generated summaries to not destroy our AI bill on every reload.
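The caching concern in the note above can be sketched as a cache-first lookup wrapped around the summarizer. This is a minimal in-memory illustration under assumptions: the `summarize` callback and `summarizeWithCache` helper are hypothetical names, the note itself proposes persisting to the Postgres `content` table rather than a `Map`, and a production version would hash the text for the cache key instead of using it verbatim.

```typescript
// Minimal cache-first sketch: return a stored summary when one exists,
// otherwise call the (paid) summarizer once and remember the result.
const summaryCache = new Map<string, string>();

async function summarizeWithCache(
  text: string,
  summarize: (text: string) => Promise<string>,
): Promise<string> {
  const cached = summaryCache.get(text);
  if (cached !== undefined) {
    return cached; // cache hit: no AI call, no cost
  }
  const summary = await summarize(text);
  summaryCache.set(text, summary);
  return summary;
}
```

With a lookup like this in front of the Gemini call, reloading a page re-serves the stored summary instead of re-billing an API request for identical input.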
Summary by CodeRabbit
New Features
- AI-powered text summarization exposed via the `/ai/summarize` endpoint.

Improvements
Chores
Bug Fixes