feat: support PostgreSQL #733

Merged · 15 commits · Nov 30, 2024

Conversation


@fengmk2 fengmk2 commented Nov 24, 2024

closes #731

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for PostgreSQL alongside MySQL, including a new database setup script and comprehensive documentation for local development.
    • Introduced a new CI job for PostgreSQL testing in the GitHub Actions workflow.
    • Enhanced the README.md and DEVELOPER.md files to provide clearer instructions for using both database systems.
    • Added new environment variable configurations for PostgreSQL in the Docker deployment documentation.
  • Bug Fixes

    • Improved error handling in tests for duplicate entries to accommodate both MySQL and PostgreSQL error messages.
  • Documentation

    • Updated setup instructions for PostgreSQL and clarified MySQL setup in the documentation.
    • Enhanced contributor information in the README.
    • Expanded instructions for setting up Elasticsearch and Kibana, including environment variable configurations.
  • Chores

    • Updated package dependencies to include PostgreSQL client libraries and modified scripts to support both databases.
    • Changed the base image in the Dockerfile to a newer Node.js version.

@fengmk2 fengmk2 added the enhancement New feature or request label Nov 24, 2024
@fengmk2 fengmk2 requested a review from elrrrrrrr November 24, 2024 09:30

coderabbitai bot commented Nov 24, 2024

Walkthrough

The pull request introduces significant enhancements to the project's database capabilities by adding support for PostgreSQL alongside existing MySQL functionality. A new GitHub Actions job for testing with PostgreSQL is included, along with updates to documentation and configuration files to guide developers in setting up their environments for either database. The codebase has been modified to improve error handling and streamline database interactions, ensuring compatibility with PostgreSQL. New scripts for database preparation and comprehensive SQL schemas for both MySQL and PostgreSQL have also been added.

Changes

File change summary:

  • .github/workflows/nodejs.yml: Added a new job test-postgresql-fs-nfs for PostgreSQL testing in CI.
  • DEVELOPER.md: Updated documentation to include PostgreSQL setup instructions and clarify dependencies.
  • README.md: Modified contributor section and updated text for clarity.
  • app/core/service/PackageManagerService.ts: Enhanced error handling in the publish method and updated package publishing logic.
  • app/core/service/PackageVersionFileService.ts: Improved error handling for duplicate entries using a utility function.
  • app/port/config.ts: Added a database property to the CnpmcoreConfig type definition.
  • app/repository/PackageRepository.ts: Replaced getCountSql with getTotalCountByModel for better database handling.
  • app/repository/TaskRepository.ts: Updated error handling and data manipulation using utility functions.
  • app/repository/util/ErrorUtil.ts: Introduced the isDuplicateKeyError function for better error handling.
  • config/config.default.ts: Updated database configuration to use a centralized object for connection parameters.
  • config/config.unittest.ts: Adjusted database configuration for unit tests to use the new database settings.
  • config/database.ts: Added a DATABASE_TYPE enum and improved database connection setup.
  • docs/setup-with-postgresql.md: New documentation for setting up PostgreSQL and Redis locally.
  • docs/setup.md: Updated setup documentation to specify MySQL as the database.
  • package.json: Added PostgreSQL-related scripts and dependencies; modified existing scripts for compatibility.
  • prepare-database-mysql.sh: Updated environment variable prefixes for MySQL database preparation.
  • prepare-database-postgresql.sh: New script for automating PostgreSQL database preparation.
  • sql/ddl_mysql.sql: Created multiple new tables for the MySQL schema.
  • sql/ddl_postgresql.sql: Created a comprehensive PostgreSQL schema with multiple tables and indexes.
  • sql/postgresql/3.68.0.sql: Added detailed PostgreSQL schema definitions for version 3.68.0.
  • test/TestUtil.ts: Refactored to support multiple database types in testing utilities.
  • test/port/controller/PackageVersionFileController/listFiles.test.ts: Updated tests to handle database type conditions.
  • test/port/controller/TokenController/createToken.test.ts: Improved assertions for duplicate token names to accommodate both MySQL and PostgreSQL.
  • test/repository/ChangeRepository.test.ts: Adjusted ID handling in tests to account for PostgreSQL behavior.

Assessment against linked issues

Objective Addressed Explanation
Support for PostgreSQL and SQLite (731)
Enhance local deployment ease (731)

Possibly related PRs

  • chore: brew install mysql v9 #713: Updates MySQL installation instructions, relevant to the overall database setup context but not directly connected to the PostgreSQL testing job.

Poem

🐰 In the garden where databases bloom,
PostgreSQL joins the MySQL room.
With scripts and docs, we pave the way,
For bunnies to code and play all day!
New tables sprout, and errors fade,
A joyful leap in the code parade!



socket-security bot commented Nov 24, 2024

New and removed dependencies detected. Learn more about Socket for GitHub ↗︎

  • npm/@types/pg@8.11.10: new capabilities: none; transitives: +4; size: 2.32 MB; publisher: types
  • npm/pg@8.13.1: new capabilities: environment, network; transitives: +2; size: 283 kB; publisher: brianc

View full report↗︎

README.md: review comment (outdated, resolved)

fengmk2 commented Nov 24, 2024

@elrrrrrrr Haha, pg is even easier to learn than mysql


codecov bot commented Nov 24, 2024

Codecov Report

Attention: Patch coverage is 96.29630% with 4 lines in your changes missing coverage. Please review.

Project coverage is 96.39%. Comparing base (6de0876) to head (dae1814).
Report is 2 commits behind head on master.

Files with missing lines Patch % Lines
config/database.ts 90.90% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #733      +/-   ##
==========================================
- Coverage   96.40%   96.39%   -0.01%     
==========================================
  Files         189      191       +2     
  Lines       18919    18995      +76     
  Branches     2485     2498      +13     
==========================================
+ Hits        18238    18310      +72     
- Misses        681      685       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

config/database.ts: dismissed alert
test/TestUtil.ts: dismissed alert

fengmk2 commented Nov 26, 2024

We still need to add a docker file that supports the pg environment.

@fengmk2 fengmk2 marked this pull request as ready for review November 30, 2024 08:35

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 25

🧹 Outside diff range and nitpick comments (33)
test/TestUtil.ts (4)

71-74: Ensure consistent handling of query results between MySQL and PostgreSQL.

The added conditional check for rows.rows is a good approach to accommodate PostgreSQL's result format. However, consider explicitly checking the database type for clarity.

For example:

 if (err) {
   return reject(err);
 }
-if (rows.rows) {
+if (config.type === DATABASE_TYPE.PostgreSQL) {
   // pg: { rows }
   return resolve(rows.rows);
 }
 return resolve(rows);
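For illustration, the two driver result shapes could also be normalized behind a small helper (hypothetical names, not code from this PR):

```typescript
// Hypothetical helper, not part of this PR: mysql2 callbacks yield the row
// array directly, while the pg driver resolves an object with a `rows` field.
function normalizeRows(result: unknown): unknown[] {
  if (Array.isArray(result)) {
    return result; // mysql2: rows come back as a plain array
  }
  const maybePg = result as { rows?: unknown[] } | null;
  if (Array.isArray(maybePg?.rows)) {
    return maybePg.rows; // pg: { rows: [...], rowCount, ... }
  }
  return [];
}
```

This keeps the database-type branching out of the query callback entirely, at the cost of one extra indirection.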

86-89: Ensure type safety when casting configuration objects.

Casting config to any may bypass TypeScript's type checks. Consider defining appropriate interface types for the database configurations to maintain type safety.

For example, define interfaces for MySQL and PostgreSQL configurations and use them accordingly:

interface MySqlConfig {
  // define properties
}

interface PostgreSqlConfig {
  // define properties
}
🧰 Tools
🪛 Biome (1.9.4)

[error] 88-88: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


122-124: Adjust truncateDatabase method to use class name instead of this.

Continue the practice of referencing the class name in static methods for consistency.

Update the method as follows:

-static async truncateDatabase() {
-  const tables = await this.getTableNames();
+static async truncateDatabase() {
+  const tables = await TestUtil.getTableNames();
   await Promise.all(tables.map(async (table: string) => {
     await TestUtil.query(`TRUNCATE TABLE ${table};`);
   }));
 }
🧰 Tools
🪛 Biome (1.9.4)

[error] 122-123: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


Line range hint 67-124: Consider defining types for database query results.

Currently, the methods are using any for query results, which can lead to runtime errors. Defining types or interfaces for the expected query results enhances type safety and code readability.

For example:

type QueryResult = {
  // define expected properties
};
🧰 Tools
🪛 Biome (1.9.4)

[error] 67-67: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 83-83: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 84-84: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 88-88: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 91-91: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)

app/repository/util/ErrorUtil.ts (2)

1-10: Add JSDoc documentation and improve error handling

The function would benefit from proper documentation and more robust error handling.

+/**
+ * Checks if an error is a duplicate key violation error from MySQL or PostgreSQL
+ * @param err - The error object to check
+ * @returns true if the error is a duplicate key violation, false otherwise
+ */
 export function isDuplicateKeyError(err: Error & { code?: string }) {
+  if (!err) return false;
+
   if (err.code === 'ER_DUP_ENTRY') {
     return true;
   }
-  if (err.message.includes('duplicate key value violates unique constraint')) {
+  if (err.message?.includes('duplicate key value violates unique constraint')) {
     // pg: duplicate key value violates unique constraint "tasks_uk_task_id"
     // code: '23505'
     return true;
   }
+  return false;
 }
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 3-4: app/repository/util/ErrorUtil.ts#L3-L4
Added lines #L3 - L4 were not covered by tests


5-9: Consider adding PostgreSQL error code check

The PostgreSQL error handling could be more specific by also checking the error code '23505'.

   if (err.message?.includes('duplicate key value violates unique constraint')) {
     // pg: duplicate key value violates unique constraint "tasks_uk_task_id"
     // code: '23505'
+    if (err.code === '23505') {
       return true;
+    }
   }
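Combining both suggestions, the utility might end up along these lines (a sketch under the review's assumptions, not the exact code merged in the PR):

```typescript
/**
 * Checks whether an error is a duplicate-key violation from MySQL or PostgreSQL.
 * @param err - the error object thrown by the database driver
 * @returns true if the error is a duplicate key violation, false otherwise
 */
function isDuplicateKeyError(err?: Error & { code?: string }): boolean {
  if (!err) return false;
  // MySQL reports ER_DUP_ENTRY; PostgreSQL uses SQLSTATE 23505 (unique_violation)
  if (err.code === 'ER_DUP_ENTRY' || err.code === '23505') {
    return true;
  }
  // Message fallback, e.g.
  // pg: duplicate key value violates unique constraint "tasks_uk_task_id"
  if (err.message?.includes('duplicate key value violates unique constraint')) {
    return true;
  }
  return false;
}
```

Checking the SQLSTATE code before the message keeps the function robust if a future pg release rewords the error text.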
docs/setup.md (3)

Line range hint 1-8: Update documentation to reflect multi-database support

The documentation needs to be updated to reflect the newly added PostgreSQL support. The title and dependency section should not be MySQL-specific anymore.

Apply this diff to update the documentation:

-# 本地开发环境 - MySQL
+# 本地开发环境
 
 ## 依赖列表
 
 本项目的外部服务依赖有:
 
-数据库:MySQL
+数据库:支持以下选项
+  - MySQL
+  - PostgreSQL
+  - SQLite(适用于本地开发)
 缓存:Redis

Line range hint 10-19: Clarify database choice in Docker setup

The Docker setup instructions should guide users on how to choose and configure their preferred database.

Add instructions for database selection:

 ## 通过 Docker 初始化
 
+选择数据库类型(设置 `DB_TYPE` 环境变量):
+```bash
+# MySQL
+$ export DB_TYPE=mysql
+
+# PostgreSQL
+$ export DB_TYPE=postgres
+
+# SQLite
+$ export DB_TYPE=sqlite
+```
+
 ```bash
 # 启动
 $ docker-compose up -d

Line range hint 21-35: Fix MySQL version and add PostgreSQL manual setup

There are several issues in the manual setup section:

  1. MySQL 9 doesn't exist (latest version is 8.x)
  2. Missing PostgreSQL setup instructions
  3. Missing SQLite setup instructions

Replace the manual setup section with:

 ## 手动初始化
 
 假设大家使用 macOS 开发,Linux 和 Windows 环境自行参考。
 
-### MySQL 9
+### 选择数据库
+
+#### MySQL 8
 
 ```bash
 brew install mysql
 brew services start mysql

+#### PostgreSQL
+
+```bash
+brew install postgresql@14
+brew services start postgresql@14
+```
+
+#### SQLite
+
+```bash
+brew install sqlite
+```
+
+### 数据库初始化
+
+根据选择的数据库类型,运行相应的初始化脚本:
+
+```bash
+# MySQL
+npm run db:init:mysql
+
+# PostgreSQL
+npm run db:init:postgres
+
+# SQLite
+npm run db:init:sqlite
+```


prepare-database-mysql.sh (3)

4-8: Consider adding environment variable validation

While the variable renaming aligns well with the PostgreSQL support objective, consider adding validation for required variables to prevent silent failures.

```diff
 #!/bin/bash
 
+# Validate required environment variables
+required_vars=("CNPMCORE_DATABASE_HOST" "CNPMCORE_DATABASE_PORT" "CNPMCORE_DATABASE_USER" "CNPMCORE_DATABASE_NAME")
+for var in "${required_vars[@]}"; do
+  if [ -z "${!var}" ]; then
+    echo "Error: $var is not set"
+    exit 1
+  fi
+done
+
 # read variables from environment

20-23: Add privilege check before database operations

Consider adding a check for necessary privileges before attempting database operations to prevent script failures.

 if [ "$CI" ]; then
   echo "⛷️ Skipping database creation in CI environment."
 else
+  # Check if user has necessary privileges
+  if ! mysql $param -e "SHOW GRANTS" | grep -q "ALL PRIVILEGES"; then
+    echo "Error: Insufficient privileges for database operations"
+    exit 1
+  fi
+
   # reset database
   echo "️😈 Reset database $db_name in local"

27-27: Add directory existence check

Consider validating the SQL directory exists before attempting to list files.

+# Check if SQL directory exists
+if [ ! -d "sql/mysql" ]; then
+  echo "Error: sql/mysql directory not found"
+  exit 1
+fi
+
 # find all sql files and sort
 sql_files=$(ls sql/mysql/*.sql | sort)
prepare-database-postgresql.sh (2)

45-47: Enhance completion verification.

The current verification step only lists tables but doesn't validate the expected state.

Apply this diff to add more comprehensive verification:

 echo "🎉 prepare database $db_name done"
-# psql $param --dbname=$db_name -c "SELECT * FROM pg_catalog.pg_tables where schemaname = 'public';"
-psql $param --dbname=$db_name -c "\dt"
+echo "Verifying database setup..."
+
+# Check if essential tables exist
+essential_tables=$(psql $param --dbname=$db_name -t -c "\dt" | wc -l)
+if [ "$essential_tables" -eq 0 ]; then
+  echo "Error: No tables found in database"
+  exit 1
+fi
+
+# Display detailed schema information
+psql $param --dbname=$db_name -c "\dt"
+echo "✅ Database verification complete"

1-47: Add security warning and documentation.

This script handles sensitive information and database operations. Consider adding:

  1. A warning header about production usage
  2. Documentation about required permissions
  3. Security implications of SQL file execution

Add this documentation at the start of the script:

 #!/bin/bash
 
+# WARNING: This script is intended for development/testing environments.
+# For production use:
+# - Ensure proper access controls on SQL files
+# - Use least-privilege database users
+# - Consider using database migration tools
+# - Remove default passwords
+
 set -e
config/database.ts (3)

7-12: Consider adding environment variable validation and using const declarations.

  1. Add validation for required environment variables to fail fast if critical configuration is missing.
  2. Consider using const declarations since these variables are only assigned once.

Example implementation:

const validateEnv = () => {
  const required = ['CNPMCORE_DATABASE_NAME'];
  const missing = required.filter(key => !process.env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
};

// Call this at startup
validateEnv();

29-33: Track incomplete SQLite implementation.

The SQLite configuration is incomplete with only basic settings.

Would you like me to:

  1. Create a GitHub issue to track the SQLite implementation?
  2. Help implement the missing SQLite configuration?

Key points to consider for implementation:

  • File path configuration
  • Migration support
  • Connection pooling settings
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 30-33: config/database.ts#L30-L33
Added lines #L30 - L33 were not covered by tests


35-44: Add type safety and secure password handling.

  1. Define an interface for the database configuration.
  2. Consider masking the password in logs/debugging output.

Example implementation:

interface DatabaseConfig {
  type: DATABASE_TYPE;
  dialect: string;
  client: string;
  name?: string;
  host?: string;
  port?: string;
  user?: string;
  password?: string;
}

export const database: DatabaseConfig = {
  ...config,
  toJSON() {
    const clone = { ...this };
    clone.password = '<redacted>';
    return clone;
  },
};
package.json (1)

39-39: Consider improving script organization and test coverage.

While the PostgreSQL scripts are well-structured, consider these improvements:

  1. Extract common database preparation logic to reduce duplication between MySQL and PostgreSQL scripts
  2. Add a combined test script to verify against both databases
 "scripts": {
+  "test:all": "npm run test && npm run test:postgresql",
+  "prepare-database": "bash prepare-database.sh",
   ...
 }

Also applies to: 42-44, 52-53, 55-55

app/repository/TaskRepository.ts (1)

Line range hint 31-44: Good database-agnostic error handling, but consider transaction safety.

The switch to isDuplicateKeyError makes the code more portable across different databases. However, there's a potential race condition in the error handling flow.

Consider wrapping the duplicate key error handling in a transaction to prevent race conditions. When taskModel is not found after a duplicate key error, another process might have deleted the record between your check and update.

Example approach:

if (isDuplicateKeyError(e)) {
  this.logger.warn(e);
  // Start transaction
  const transaction = await this.Task.startTransaction();
  try {
    const taskModel = await this.Task.findOne({ bizId: task.bizId }, { transaction });
    if (taskModel) {
      task.id = taskModel.id;
      task.taskId = taskModel.taskId;
      await transaction.commit();
      return;
    }
    await transaction.rollback();
    throw e;
  } catch (err) {
    await transaction.rollback();
    throw err;
  }
}
DEVELOPER.md (1)

Line range hint 1-294: Consider adding database adapter documentation

While the documentation thoroughly covers both MySQL and PostgreSQL setup, it would be beneficial to document the database adapter architecture that enables this dual database support. This would help future contributors understand how to add support for additional databases.

Consider adding a new section under "架构分层依赖图" (Architecture Layer Dependency Diagram) that explains:

  • Database adapter interface
  • Implementation specifics for MySQL and PostgreSQL
  • Guidelines for adding new database adapters
config/config.default.ts (2)

78-79: Consider explicit database property mapping

While using the spread operator provides flexibility, consider explicitly mapping required database properties to prevent potential conflicts or inclusion of unnecessary properties. This would make the configuration more maintainable and type-safe.

-    ...database,
-    database: database.name ?? 'cnpmcore',
+    client: database.client,
+    host: database.host,
+    port: database.port,
+    user: database.user,
+    password: database.password,
+    database: database.name ?? 'cnpmcore',

85-87: Remove commented-out code

The commented-out error logging code should be removed as it adds noise to the codebase. If this logging is needed for debugging, consider adding it to the documentation instead.

-      // logQueryError(...args: any[]) {
-      //   console.log(args);
-      // },
test/port/controller/TokenController/createToken.test.ts (1)

235-235: LGTM! Error handling updated for cross-database support.

The modification correctly handles unique constraint violation errors for both MySQL and PostgreSQL databases. However, consider extracting these error patterns into constants for better maintainability.

Consider refactoring like this:

+ const DB_DUPLICATE_ENTRY_ERRORS = {
+   MYSQL: 'ER_DUP_ENTRY',
+   POSTGRESQL: 'duplicate key value violates unique constraint'
+ };
+ 
- assert.match(res.body.error, /ER_DUP_ENTRY|duplicate key value violates unique constraint/);
+ assert.match(res.body.error, new RegExp(`${DB_DUPLICATE_ENTRY_ERRORS.MYSQL}|${DB_DUPLICATE_ENTRY_ERRORS.POSTGRESQL}`));
app/repository/PackageRepository.ts (1)

410-428: Consider performance optimization for large tables

The current implementation might be inefficient for large tables:

  1. MySQL: TABLE_ROWS from information_schema might be inaccurate as it's based on statistics
  2. PostgreSQL: COUNT(*) without conditions can be slow on large tables

Consider these alternatives:

  1. Use approximate row count for PostgreSQL: SELECT reltuples::bigint FROM pg_class WHERE relname = $1;
  2. Cache the counts with a reasonable TTL
  3. Use materialized views for frequently accessed counts
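Alternative 1 could be sketched as follows; the `query` callback, its signature, and the table names are assumptions for illustration, not this PR's actual repository API:

```typescript
// Hypothetical sketch: cheap, approximate row counts per database type.
// Both sources are statistics-based, so the numbers can lag behind reality.
type DatabaseType = 'mysql' | 'postgresql';
type QueryFn = (sql: string, values?: unknown[]) => Promise<Array<Record<string, unknown>>>;

async function approximateRowCount(query: QueryFn, type: DatabaseType, table: string): Promise<number> {
  if (type === 'postgresql') {
    // reltuples is refreshed by VACUUM/ANALYZE and avoids a full table scan
    const rows = await query(
      'SELECT reltuples::bigint AS total FROM pg_class WHERE relname = $1;',
      [ table ],
    );
    return Number(rows[0]?.total ?? 0);
  }
  // MySQL: TABLE_ROWS in information_schema is likewise an estimate for InnoDB
  const rows = await query(
    'SELECT TABLE_ROWS AS total FROM information_schema.TABLES WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = ?;',
    [ table ],
  );
  return Number(rows[0]?.total ?? 0);
}
```

Either estimate is fine for a dashboard-style total count; an exact COUNT(*) would still be needed wherever correctness matters.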
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 412-424: app/repository/PackageRepository.ts#L412-L424
Added lines #L412 - L424 were not covered by tests

sql/ddl_mysql.sql (2)

252-268: Consider optimizing version field length and adding foreign key constraints

The version field is defined as varchar(256) which seems excessive for semantic versioning strings. Most semver strings are under 50 characters. Consider reducing the field length to improve index efficiency.

Additionally, there are implicit relationships between tables that should be enforced with foreign key constraints:

  • package_versions.package_idpackages.package_id
  • package_deps.package_version_idpackage_versions.package_version_id

Also applies to: 230-249


316-339: Consider optimizing log and error storage

A few suggestions for the task tables:

  1. The log_store_position field is limited to 10 characters which might be restrictive for future storage locations
  2. The error field uses longtext which could impact performance. Consider:
    • Moving error details to a separate table if they're frequently large
    • Using varchar with a reasonable limit if errors are typically small

Also applies to: 50-68

sql/postgresql/3.68.0.sql (3)

237-275: Consider using a normalized design for download statistics

The current design with 31 columns (d01-d31) for daily downloads:

  1. Makes it difficult to query download trends
  2. Requires complex queries for aggregating statistics
  3. Makes it hard to handle months with different number of days

Consider using a separate table with a date column instead.

Alternative schema:

CREATE TABLE package_downloads (
  id BIGSERIAL PRIMARY KEY,
  gmt_create timestamp(3) NOT NULL,
  gmt_modified timestamp(3) NOT NULL,
  package_id varchar(214) NOT NULL,
  version varchar(256) NOT NULL,
  download_date date NOT NULL,
  download_count integer NOT NULL DEFAULT 0,
  UNIQUE(package_id, version, download_date)
);

406-406: Fix incorrect comment for padding_version

The comment for padding_version incorrectly states "token name" instead of describing its actual purpose.

-COMMENT ON COLUMN package_versions.padding_version IS 'token name';
+COMMENT ON COLUMN package_versions.padding_version IS 'padded version string for correct version sorting';
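To make the corrected comment concrete, padding works roughly like this; an illustrative sketch for numeric-only version parts, not the PR's actual algorithm:

```typescript
// Zero-pad each numeric part of a version string so that plain lexicographic
// (string) ordering matches numeric version ordering. Without padding,
// '2.10.0' sorts before '2.9.0' because '1' < '9' character-wise.
function paddingSemVer(version: string): string {
  return version
    .split('.')
    .map(part => part.padStart(10, '0'))
    .join('.');
}
```

A real implementation would also need a rule for prerelease tags such as `1.0.0-beta.1`, which this sketch ignores.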

517-518: Document valid enum values

Tables like tasks use varchar columns for enum-like values (type, state) but lack documentation about valid values.

Add comments documenting valid values:

-COMMENT ON COLUMN tasks.type IS 'task type';
+COMMENT ON COLUMN tasks.type IS 'task type: sync, publish, unpublish, etc.';
-COMMENT ON COLUMN tasks.state IS 'task state';
+COMMENT ON COLUMN tasks.state IS 'task state: pending, processing, success, failure';
sql/ddl_postgresql.sql (3)

651-664: Consider additional security measures for authentication-related tables

While the current security setup is solid, consider these enhancements:

  1. Add CHECK constraints on password_integrity length to ensure proper hash storage
  2. Add an index on tokens.expired_at to optimize token validation queries
  3. Consider adding a last_password_change timestamp to the users table for password rotation policies
 CREATE TABLE users (
   -- existing columns --
+  last_password_change timestamp(3) DEFAULT NULL,
   scopes json DEFAULT NULL
 );

+ALTER TABLE users ADD CONSTRAINT chk_password_integrity_length 
+  CHECK (length(password_integrity) = 128);

+CREATE INDEX tokens_idx_expired_at ON tokens (expired_at);

Also applies to: 574-591, 682-691


237-275: Consider optimizing the download tracking table structure

The current per-day column approach in package_version_downloads might lead to:

  1. Row-level contention during concurrent updates
  2. Inefficient storage due to the wide table structure

Consider these alternatives:

  1. Partition the table by year_month for better performance
  2. Use a normalized structure with daily records
+CREATE TABLE package_version_downloads (
+  id BIGSERIAL PRIMARY KEY,
+  gmt_create timestamp(3) NOT NULL,
+  gmt_modified timestamp(3) NOT NULL,
+  year_month integer NOT NULL,
+  download_date date NOT NULL,
+  package_id varchar(214) NOT NULL,
+  version varchar(256) NOT NULL,
+  download_count integer NOT NULL DEFAULT 0
+) PARTITION BY RANGE (year_month);

+CREATE UNIQUE INDEX package_version_downloads_uk_date_package 
+  ON package_version_downloads (download_date, package_id, version);

Also applies to: 277-280


512-528: Consider adding maintenance strategies for task tables

The task management structure is well-designed, but consider these improvements:

  1. Add a CHECK constraint to validate task states
  2. Implement table partitioning on history_tasks for efficient archival
 CREATE TABLE tasks (
   -- existing columns --
+  CONSTRAINT chk_task_state 
+    CHECK (state IN ('pending', 'processing', 'success', 'error'))
 );

+-- Partition history_tasks by month for better maintenance
+ALTER TABLE history_tasks 
+  PARTITION BY RANGE (EXTRACT(YEAR FROM gmt_create) * 100 + 
+                     EXTRACT(MONTH FROM gmt_create));

Also applies to: 79-94

app/core/service/PackageManagerService.ts (1)

Line range hint 275-279: LGTM! Improved error handling for database independence.

The change from direct error code checking to using isDuplicateKeyError utility:

  • Makes the code database-agnostic, supporting both PostgreSQL and MySQL
  • Improves maintainability by centralizing error handling logic
  • Maintains consistent error messages across different database backends

Consider applying similar error handling abstraction to other database operations in the service for complete database independence.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between dd15b08 and ac0d525.

📒 Files selected for processing (24)
  • .github/workflows/nodejs.yml (1 hunks)
  • DEVELOPER.md (5 hunks)
  • README.md (2 hunks)
  • app/core/service/PackageManagerService.ts (3 hunks)
  • app/core/service/PackageVersionFileService.ts (2 hunks)
  • app/port/config.ts (3 hunks)
  • app/repository/PackageRepository.ts (4 hunks)
  • app/repository/TaskRepository.ts (2 hunks)
  • app/repository/util/ErrorUtil.ts (1 hunks)
  • config/config.default.ts (3 hunks)
  • config/config.unittest.ts (2 hunks)
  • config/database.ts (1 hunks)
  • docs/setup-with-postgresql.md (1 hunks)
  • docs/setup.md (1 hunks)
  • package.json (4 hunks)
  • prepare-database-mysql.sh (3 hunks)
  • prepare-database-postgresql.sh (1 hunks)
  • sql/ddl_mysql.sql (1 hunks)
  • sql/ddl_postgresql.sql (1 hunks)
  • sql/postgresql/3.68.0.sql (1 hunks)
  • test/TestUtil.ts (3 hunks)
  • test/port/controller/PackageVersionFileController/listFiles.test.ts (4 hunks)
  • test/port/controller/TokenController/createToken.test.ts (1 hunks)
  • test/repository/ChangeRepository.test.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • README.md
🧰 Additional context used
🪛 GitHub Check: codecov/patch
app/repository/util/ErrorUtil.ts

[warning] 3-4: app/repository/util/ErrorUtil.ts#L3-L4
Added lines #L3 - L4 were not covered by tests

app/repository/PackageRepository.ts

[warning] 412-424: app/repository/PackageRepository.ts#L412-L424
Added lines #L412 - L424 were not covered by tests

config/database.ts

[warning] 16-21: config/database.ts#L16-L21
Added lines #L16 - L21 were not covered by tests


[warning] 30-33: config/database.ts#L30-L33
Added lines #L30 - L33 were not covered by tests

🪛 actionlint (1.7.4)
.github/workflows/nodejs.yml

49-49: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)


52-52: the runner of "actions/setup-node@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)


70-70: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 Biome (1.9.4)
test/TestUtil.ts

[error] 67-67: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 83-83: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 84-84: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 88-88: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 106-106: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 107-107: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 111-111: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 111-111: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 116-116: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 117-117: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 121-121: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


[error] 122-123: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)

🪛 Markdownlint (0.35.0)
docs/setup-with-postgresql.md

16-16: null
Bare URL used

(MD034, no-bare-urls)

🔇 Additional comments (46)
test/TestUtil.ts (5)

8-8: Importing pg module for PostgreSQL support.

The addition of import pg from 'pg'; is appropriate for enabling PostgreSQL functionality.


14-14: Importing database configuration and types.

The import of { database, DATABASE_TYPE } aligns with the need to handle multiple database types.


49-55: Refactored getDatabaseConfig method to support multiple databases.

Renaming getMySqlConfig to getDatabaseConfig and updating its implementation enhances flexibility for handling different database configurations.


84-84: Ensure compliance with security best practices for CI environments.

When running in a continuous integration (CI) environment, be cautious about logging configurations. Ensure that sensitive information is masked or omitted.

Run the following script to check for any logs that may output sensitive information:

#!/bin/bash
# Description: Search for console.log statements that may log sensitive data.

# Find all console.log statements in the codebase.
rg 'console\.log\(.+\);' -g '*.ts' -A 2

# Verify that no sensitive information is being logged.
🧰 Tools
🪛 Biome (1.9.4)

[error] 84-84: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)


84-84: ⚠️ Potential issue

Avoid logging sensitive information in the console.

Logging the database configuration may expose sensitive data such as passwords. Consider removing or sanitizing any sensitive information from log statements.

🧰 Tools
🪛 Biome (1.9.4)

[error] 84-84: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)

docs/setup.md (1)

Line range hint 1-35: Verify database initialization scripts

Let's verify if the database initialization scripts exist in the project.

✅ Verification successful

Database initialization setup is properly configured

The setup documentation accurately reflects the project's database configuration:

  • Docker setup is properly configured with docker-compose.yml that includes MySQL, Redis, and phpMyAdmin services
  • SQL initialization files are present in the repository:
    • Main DDL files: sql/ddl_mysql.sql and sql/ddl_postgresql.sql
    • Version-specific migration files in sql/mysql/ and sql/postgresql/
  • The manual setup instructions using Homebrew for macOS are valid alternatives
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the database initialization scripts are defined in package.json

# Check package.json for database initialization scripts
echo "Checking package.json for database scripts..."
rg '"db:init:(mysql|postgres|sqlite)"' -A 1

Length of output: 146


Script:

#!/bin/bash
# Let's check for any database-related scripts in package.json and docker-compose.yml existence
echo "Checking package.json for any database-related scripts..."
rg '"(db|database|mysql|redis)"' package.json

echo -e "\nChecking for docker-compose.yml..."
fd -t f "docker-compose.yml"

echo -e "\nChecking for any SQL initialization scripts..."
fd -e sql

Length of output: 703


Script:

#!/bin/bash
# Let's check the content of docker-compose.yml and the main DDL file
echo "Checking docker-compose.yml content..."
cat docker-compose.yml

echo -e "\nChecking main MySQL DDL content..."
head -n 20 sql/ddl_mysql.sql

Length of output: 2600

config/config.unittest.ts (2)

4-4: LGTM! Clean import of database configuration.

The import follows TypeScript best practices and properly separates database configuration concerns.


13-13: Verify the database module implementation.

The change looks good and aligns with PostgreSQL support objectives. The use of the nullish coalescing operator ensures a safe fallback to 'cnpmcore_unittest'.

Let's verify the database module implementation:

✅ Verification successful

Let me verify the database module implementation by checking for any potential issues or inconsistencies.


Database module implementation is correctly configured and safe

The implementation is well-structured and secure:

  • Database type handling is properly implemented with clear enum definitions
  • Default fallback to 'cnpmcore_unittest' is consistently used across test configurations
  • PostgreSQL support is properly configured with correct client and dialect settings
  • Environment variable fallbacks are implemented safely using nullish coalescing operators
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the database configuration module structure and usage

# Check the database configuration module
echo "Checking database configuration module:"
cat config/database.ts || echo "database.ts not found in config directory"

# Look for other potential database name references that might need updating
echo -e "\nChecking for other database name references:"
rg -g '!*.{log,md,lock}' -i "cnpmcore.*unittest" .

# Verify test configurations for both PostgreSQL and MySQL
echo -e "\nChecking for database-specific test configurations:"
rg -g '*.{ts,js,json}' -i "(mysql|postgres|postgresql).*test"

Length of output: 137459


Script:

#!/bin/bash
# Check database configuration and related files
echo "Checking database configuration module:"
cat config/database.ts

echo -e "\nChecking test database configurations:"
rg -g '*.{ts,js}' -A 5 'cnpmcore_unittest'

echo -e "\nChecking database type handling:"
ast-grep --pattern 'DATABASE_TYPE = {
  $$$
}'

Length of output: 2778
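The nullish coalescing fallback noted above matters because, unlike `||`, it falls back only on `null`/`undefined`, so an intentionally empty environment value is preserved. A minimal illustration (the function is invented for this example):

```typescript
// `||` treats every falsy value ('' or 0) as missing; `??` only treats
// null/undefined as missing, which is what env-var fallbacks usually want.
function dbName(env: { CNPMCORE_DATABASE_NAME?: string }): string {
  return env.CNPMCORE_DATABASE_NAME ?? 'cnpmcore_unittest';
}

console.log(dbName({}));                                 // unset -> fallback
console.log(dbName({ CNPMCORE_DATABASE_NAME: 'mydb' })); // set -> kept
console.log(dbName({ CNPMCORE_DATABASE_NAME: '' }));     // empty string -> kept by ??
```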

prepare-database-mysql.sh (3)

36-37: LGTM! Clear completion feedback

The completion message with table listing provides good verification of successful execution.


Line range hint 1-37: Verify corresponding PostgreSQL script exists

Since this PR adds PostgreSQL support, ensure there's a corresponding prepare-database-postgresql.sh script with similar functionality.

✅ Verification successful

PostgreSQL preparation script exists with appropriate database-specific adaptations

The verification confirms that a corresponding PostgreSQL script exists with proper database-specific changes:

  • Uses PostgreSQL-specific environment variables (POSTGRES_HOST, POSTGRES_PORT, etc.)
  • Implements PostgreSQL-specific commands (dropdb, createdb, psql)
  • Correctly references PostgreSQL-specific SQL files from sql/postgresql/ directory
  • Includes appropriate PostgreSQL connection parameter handling
  • Contains equivalent functionality for database creation and schema setup
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for PostgreSQL preparation script
if [ -f "prepare-database-postgresql.sh" ]; then
  echo "PostgreSQL script exists"
  diff -u prepare-database-mysql.sh prepare-database-postgresql.sh || echo "Scripts differ as expected"
else
  echo "Warning: PostgreSQL preparation script not found"
fi

Length of output: 2562


7-7: Review security implications of empty password default

An empty default password could pose a security risk if accidentally deployed to production.

✅ Verification successful

Let me gather more context about the database configuration and usage patterns.


Let me check the database configuration and setup files to understand the context better.


Empty database password default is acceptable for local development

The empty password default in prepare-database-mysql.sh is specifically designed for local development environments, as evidenced by:

  • The script is a development setup tool, documented in DEVELOPER.md for local environment initialization
  • Production configurations are handled separately through environment variables (CNPMCORE_DATABASE_PASSWORD)
  • The script contains safety checks and clear documentation for development purposes
  • Docker compose is recommended for local development, which sets its own secure credentials
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if password is required in other environments
rg -l "CNPMCORE_DATABASE_PASSWORD" | xargs grep -l "required" || echo "No password requirement found in configuration files"

Length of output: 174


Script:

#!/bin/bash
# Check for database configuration patterns and documentation
echo "=== Checking database configuration files ==="
rg -l "CNPMCORE_DATABASE" --type-add 'config:*.{json,yaml,yml,env}' -g '!{dist,build}/*'

echo -e "\n=== Checking README and documentation ==="
fd "README" --exec cat {} \; || true
fd "CONTRIBUTING" --exec cat {} \; || true

echo -e "\n=== Checking for environment setup examples ==="
fd ".env" --exec cat {} \; || true

echo -e "\n=== Checking for database connection handling ==="
rg -A 5 "createConnection|createPool" --type ts

Length of output: 2432


Script:

#!/bin/bash
# Check database configuration and setup files
echo "=== Checking DEVELOPER.md ==="
cat DEVELOPER.md

echo -e "\n=== Checking database configuration ==="
cat config/database.ts

echo -e "\n=== Checking database setup scripts ==="
cat prepare-database-mysql.sh
cat prepare-database-postgresql.sh

Length of output: 12006

test/repository/ChangeRepository.test.ts (2)

33-34: LGTM! Good database-agnostic approach.

The dynamic calculation of firstId makes the test more robust by accommodating both MySQL (where IDs reset after truncation) and PostgreSQL (where they don't).
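The relative-offset idea can be sketched in isolation like this (the row shape and helper are illustrative, not the test file's actual code):

```typescript
// Absolute IDs differ between MySQL (reset after TRUNCATE) and PostgreSQL
// (sequences keep counting), so assertions use offsets from the first row.
interface ChangeRow { id: number; type: string; }

function assertRelativeIds(rows: ChangeRow[]): void {
  const firstId = rows[0].id; // baseline, whatever the database assigned
  rows.forEach((row, index) => {
    if (row.id !== firstId + index) {
      throw new Error(`expected id ${firstId + index}, got ${row.id}`);
    }
  });
}

// Works for MySQL-style ids starting at 1 …
assertRelativeIds([{ id: 1, type: 'PACKAGE_VERSION_ADDED' }, { id: 2, type: 'PACKAGE_TAG_ADDED' }]);
// … and PostgreSQL-style ids continuing from a previous run.
assertRelativeIds([{ id: 101, type: 'PACKAGE_VERSION_ADDED' }, { id: 102, type: 'PACKAGE_TAG_ADDED' }]);
console.log('relative-id assertions passed');
```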


36-36: Verify test coverage with both databases.

The assertion correctly uses relative ID offsets, making it database-agnostic. However, we should ensure this test runs against both MySQL and PostgreSQL.

✅ Verification successful

Tests are properly configured for both MySQL and PostgreSQL databases

The GitHub Actions workflow shows comprehensive test coverage:

  • PostgreSQL tests run via the test-postgresql-fs-nfs job using npm run ci:postgresql
  • MySQL tests run via both test-mysql57-fs-nfs and test-mysql57-oss-nfs jobs using npm run ci
  • The database type is properly configured through the CNPMCORE_DATABASE_TYPE environment variable
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that tests are configured to run against both databases
# Expected: Find GitHub Actions workflow configurations that run tests with both MySQL and PostgreSQL

# Check for database-specific test configurations in GitHub Actions
fd -e yml -e yaml . .github/workflows/ --exec grep -l -E "mysql|postgresql|postgres" {}

Length of output: 120


Script:

#!/bin/bash
# Let's check the content of the workflow file to see how databases are configured
cat .github/workflows/nodejs.yml

Length of output: 5114


Script:

#!/bin/bash
# Let's check the npm scripts to understand how tests are run for different databases
cat package.json | jq -r '.scripts | .["ci:postgresql"], .ci'

Length of output: 170


Script:

#!/bin/bash
# Let's check the coverage scripts to see how tests are executed
cat package.json | jq -r '.scripts | .["cov:postgresql"], .cov'

Length of output: 123

prepare-database-postgresql.sh (1)

12-26: LGTM! Parameter handling follows best practices.

The code correctly builds PostgreSQL parameters and follows the recommended approach for password handling using PGPASSWORD environment variable.

config/database.ts (3)

1-5: LGTM! Well-structured enum definition.

The enum clearly defines the supported database types, aligning with the PR objectives.


15-21: ⚠️ Potential issue

Security: Remove hardcoded root credential.

The hardcoded "root" username is a security concern as flagged by the security scanner.

Suggested fix:

- dbUser = dbUser ?? process.env.CNPMCORE_MYSQL_USER ?? process.env.MYSQL_USER ?? 'root';
+ dbUser = dbUser ?? process.env.CNPMCORE_MYSQL_USER ?? process.env.MYSQL_USER;
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 16-21: config/database.ts#L16-L21
Added lines #L16 - L21 were not covered by tests


15-21: Add test coverage for MySQL configuration.

The MySQL configuration block lacks test coverage according to codecov.

Let's verify existing tests:

Would you like me to help generate test cases for the MySQL configuration scenarios?

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 16-21: config/database.ts#L16-L21
Added lines #L16 - L21 were not covered by tests
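A hedged sketch of how the fallback chain could fail fast instead of silently defaulting to a privileged account (the resolveEnv helper is invented for illustration; it is not the PR's code):

```typescript
// Resolve a credential from an ordered list of env vars; throw a
// descriptive error instead of falling back to a hardcoded 'root'.
function resolveEnv(env: Record<string, string | undefined>, keys: string[]): string {
  for (const key of keys) {
    const value = env[key];
    if (value !== undefined) return value;
  }
  throw new Error(`database user not configured; set one of: ${keys.join(', ')}`);
}

const user = resolveEnv(
  { CNPMCORE_MYSQL_USER: undefined, MYSQL_USER: 'app_user' },
  [ 'CNPMCORE_MYSQL_USER', 'MYSQL_USER' ],
);
console.log(user);
```

Making the missing-credential case an explicit error also gives the configuration block an easy unit-test target, addressing the coverage gap flagged above.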

package.json (2)

4-4: LGTM! Description accurately reflects the project's enterprise focus.

The updated description better represents the project's scope as an enterprise-grade private NPM registry.


112-112: Verify PostgreSQL client library versions and security.

  1. Version consistency:

    • The type definitions (@types/pg@8.11.10) lag behind the pg package (8.13.1) and may not cover newer client APIs
    • Consider aligning these versions or updating to the latest stable releases
  2. Security verification needed for the PostgreSQL client library.

Also applies to: 137-137

app/repository/TaskRepository.ts (3)

2-2: LGTM! Good abstraction of database-specific error handling.

The new imports support a database-agnostic approach, particularly the isDuplicateKeyError utility which will help handle unique constraint violations consistently across MySQL and PostgreSQL.

Also applies to: 5-5


Line range hint 1-150: Verify PostgreSQL compatibility of model operations.

The repository uses proper abstractions through models and generic error handling, which should work well with PostgreSQL. However, let's verify that all model operations are compatible.

✅ Verification successful

Let me analyze the results and generate the final response.

PostgreSQL compatibility verified for repository operations.

The code review shows proper abstraction through models and generic error handling. The repository uses standard MongoDB/Sequelize-style query operators without any MySQL-specific syntax in the queries. Key observations:

  • No MySQL-specific operators or syntax in the queries
  • Uses standard model operations (findOne, find, update)
  • Generic error handling with isDuplicateKeyError
  • No direct SQL queries or database-specific features
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any MySQL-specific operations or syntax
rg -i "mysql|myisam|innodb" 

# Check for any database-specific operators in queries
ast-grep --pattern 'find({ $$$: { $regex: $_ } })'

Length of output: 131028


Line range hint 71-82: Good version deduplication, but consider input validation.

The use of uniq ensures efficient version handling without duplicates. However, there are some potential improvements for robustness.

Let's verify the maximum allowed length for the versions array in the database schema:

Consider adding input validation:

if (specificVersions) {
  // Validate input array
  if (!Array.isArray(specificVersions) || 
      !specificVersions.every(v => typeof v === 'string' && v.length > 0)) {
    throw new Error('Invalid specificVersions format');
  }
  
  const data = model.data;
  const combinedVersions = uniq<string>(data.specificVersions.concat(specificVersions));
  
  // Check maximum length (adjust limit as needed)
  if (combinedVersions.length > MAX_VERSIONS_LENGTH) {
    throw new Error('Combined versions exceed maximum allowed length');
  }
  
  data.specificVersions = combinedVersions;
  await model.update({ data });
}
.github/workflows/nodejs.yml (2)

41-45: LGTM! Comprehensive Node.js version matrix

The matrix configuration is well-structured with:

  • Specific LTS version (18.20.0)
  • Latest major versions (18, 20, 22)
  • Fail-fast disabled for better debugging

59-67: 🛠️ Refactor suggestion

Secure PostgreSQL credentials and verify CI script

  1. Database credentials should use GitHub secrets
  2. Need to verify the existence of the ci:postgresql npm script

Apply this diff to use secrets:

      env:
        POSTGRES_HOST: localhost
-       POSTGRES_USER: postgres
-       POSTGRES_PASSWORD: postgres
+       POSTGRES_USER: ${{ secrets.POSTGRES_USER }}
+       POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}
        POSTGRES_PORT: 5432
app/port/config.ts (3)

2-2: LGTM!

The import statement for DATABASE_TYPE is appropriately placed and necessary for the new database configuration.


98-98: LGTM!

The type formatting change follows TypeScript conventions more closely.


180-185: 🛠️ Refactor suggestion

Consider enhancing the database configuration structure.

The current database configuration appears too minimal for proper database connectivity. Consider:

  1. Adding essential database connection properties
  2. Improving type safety by removing the string union type
  3. Adding JSDoc documentation like other properties

Here's a suggested enhancement:

  /**
   * database config
+  * @property {DATABASE_TYPE} type - The type of database (e.g., postgresql, mysql)
+  * @property {string} host - Database host
+  * @property {number} port - Database port
+  * @property {string} username - Database user
+  * @property {string} password - Database password
+  * @property {string} database - Database name
+  * @property {object} [options] - Additional database-specific options
   */
   database: {
-    type: DATABASE_TYPE | string,
+    type: DATABASE_TYPE,
+    host: string,
+    port: number,
+    username: string,
+    password: string,
+    database: string,
+    options?: Record<string, unknown>,
   },
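If the fuller shape were adopted, a config object might look like the following (values and env-var fallbacks are illustrative only):

```typescript
// Illustrative config literal matching the fuller shape proposed above.
interface DatabaseConfig {
  type: string; // DATABASE_TYPE in the real codebase
  host: string;
  port: number;
  username: string;
  password: string;
  database: string;
  options?: Record<string, unknown>;
}

const env: Record<string, string | undefined> = {}; // stand-in for process.env

const database: DatabaseConfig = {
  type: 'PostgreSQL',
  host: env.POSTGRES_HOST ?? '127.0.0.1',
  port: Number(env.POSTGRES_PORT ?? 5432),
  username: env.POSTGRES_USER ?? 'postgres',
  password: env.POSTGRES_PASSWORD ?? '',
  database: env.CNPMCORE_DATABASE_NAME ?? 'cnpmcore',
};
console.log(`${database.type}://${database.host}:${database.port}/${database.database}`);
```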

Let's verify if there are any existing database configurations in use:

DEVELOPER.md (4)

5-5: LGTM: Clear documentation of database options

The documentation clearly states the database options and provides appropriate setup links for both MySQL and PostgreSQL environments.

Also applies to: 17-17


40-51: Verify referenced PostgreSQL scripts exist

The documentation references new scripts. Let's verify their existence:

  • prepare-database-postgresql.sh
  • dev:postgresql npm script
✅ Verification successful

PostgreSQL setup scripts are properly configured

Both required components are present in the codebase:

  • prepare-database-postgresql.sh script exists in the root directory
  • dev:postgresql npm script is defined in package.json with the correct database type configuration
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the PostgreSQL preparation script exists
if [ -f "./prepare-database-postgresql.sh" ]; then
    echo "PostgreSQL preparation script exists"
else
    echo "Missing prepare-database-postgresql.sh"
fi

# Check if the dev:postgresql script is defined in package.json
rg '"dev:postgresql"' package.json

Length of output: 236


55-65: Verify test script configuration

Let's verify the test scripts are properly configured in package.json.

✅ Verification successful

Test scripts are properly configured

The package.json contains all the necessary test scripts with correct configurations:

  • test and test:postgresql both include linting
  • test:local runs egg-bin test for MySQL (default)
  • test:local:postgresql properly sets the database type environment variable
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if both test scripts are defined in package.json
rg '"test":|"test:postgresql":' package.json

Length of output: 176


Script:

#!/bin/bash
# Let's verify the test:local and test:local:postgresql scripts as well
rg '"test:local":|"test:local:postgresql":' package.json

Length of output: 171


27-31: Verify environment variable consistency

The MySQL setup now uses CNPMCORE_DATABASE_NAME. Let's verify this change is consistent across the codebase.

✅ Verification successful

Let me check the actual content of these scripts to see how DATABASE_NAME is being used.


Let me check the application code to see if there are any environment variable references that need to be consistent.


Let me try a different approach to search through all potential source files.


Environment variable naming is consistent

The environment variable CNPMCORE_DATABASE_NAME is consistently used across the codebase:

  • Used in both database preparation scripts (prepare-database-mysql.sh and prepare-database-postgresql.sh)
  • Referenced in the application configuration (config/database.ts)
  • Correctly documented in DEVELOPER.md
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining occurrences of old database name variables
rg -l "DATABASE_NAME" --type-add 'script:*.{sh,bash}' -t script

# Check if the new variable is consistently used
rg -l "CNPMCORE_DATABASE_NAME" --type-add 'script:*.{sh,bash}' -t script

Length of output: 250


Script:

#!/bin/bash
# Check the content of database preparation scripts
echo "=== MySQL script ==="
cat prepare-database-mysql.sh
echo -e "\n=== PostgreSQL script ==="
cat prepare-database-postgresql.sh

Length of output: 2601


Script:

#!/bin/bash
# Search for database name environment variable references in the codebase
rg -l "DATABASE_NAME" --type typescript --type javascript

# Also check for the new variable name
rg -l "CNPMCORE_DATABASE_NAME" --type typescript --type javascript

Length of output: 194


Script:

#!/bin/bash
# Find all potential source files and search for database name variables
echo "=== Searching in all potential source files ==="
fd -e ts -e js -e tsx -e jsx | xargs rg "DATABASE_NAME"

echo -e "\n=== Searching for new variable name ==="
fd -e ts -e js -e tsx -e jsx | xargs rg "CNPMCORE_DATABASE_NAME"

Length of output: 456

config/config.default.ts (2)

8-9: LGTM! Good use of type-only imports

The change to use type-only import for CnpmcoreConfig is a TypeScript best practice that helps reduce bundle size. The addition of the database import aligns well with the modular configuration approach needed for multi-database support.


64-66: Verify database type validation

The database type configuration looks good and supports the multi-database objective. However, let's verify that the database type is properly validated.

✅ Verification successful

Database type validation is properly implemented

The database type configuration is well-validated through an enum-based type system:

  • Database types are strictly defined in DATABASE_TYPE enum in config/database.ts
  • The system supports MySQL and PostgreSQL as valid database types
  • Type validation is enforced through TypeScript's type system and enum usage
  • The configuration is properly consumed across the codebase with type checks (e.g., in PackageRepository.ts, TestUtil.ts)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if database type validation exists and supported types are documented

# Look for database type validation
rg -A 5 "database.type"

# Check for database type documentation
fd -t f "README|CONTRIBUTING" -X grep -l "database.*type"

Length of output: 2044


Script:

#!/bin/bash
# Let's check for database type definitions and validations
# First, find the DATABASE_TYPE enum or constant
ast-grep --pattern 'DATABASE_TYPE = {
  $$$
}'

# Also search for any database type validation logic
rg -B2 -A2 "DATABASE_TYPE\."

# Look for database configuration types or interfaces
ast-grep --pattern 'interface $_ {
  $$$
  type: $_
  $$$
}'

Length of output: 6862


Script:

#!/bin/bash
# Let's find the DATABASE_TYPE definition since we see it's being used
rg -B2 -A2 "export (enum|const) DATABASE_TYPE"

# Also check for any database configuration validation code
ast-grep --pattern 'function $_(database: $_) {
  $$$
}'

# Look for any database type validation in config files
rg -B2 -A2 "validateDatabaseType|validateDatabase"

Length of output: 293
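The enum-based validation described above can be sketched as follows (the member values are assumptions based on the review text, not copied from config/database.ts):

```typescript
// A string enum confines database.type to known backends at compile time;
// the guard re-checks values that arrive from the environment at runtime.
enum DATABASE_TYPE {
  MySQL = 'MySQL',
  PostgreSQL = 'PostgreSQL',
}

function parseDatabaseType(raw: string | undefined): DATABASE_TYPE {
  const values = Object.values(DATABASE_TYPE) as string[];
  if (raw !== undefined && values.includes(raw)) {
    return raw as DATABASE_TYPE;
  }
  return DATABASE_TYPE.MySQL; // keep MySQL as the backward-compatible default
}

console.log(parseDatabaseType(undefined)); // falls back to the default
```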

test/port/controller/TokenController/createToken.test.ts (1)

232-238: Consider adding database-specific test cases.

While the error handling is updated for PostgreSQL support, consider adding specific test cases to verify database-specific behaviors:

  1. Connection handling
  2. Transaction rollback behavior
  3. Database-specific query syntax differences

Let's check if there are any database-specific test files:

Consider organizing database-specific tests in separate files or using conditional tests based on the database type. This would help maintain clear separation of concerns and ensure comprehensive coverage for both MySQL and PostgreSQL.
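One way to keep such branching readable is to centralize per-database expectations in a small helper rather than scattering if/else through assertions. A sketch (the message strings are abbreviated examples, not exact driver output):

```typescript
// Map database-specific duplicate-key phrasing to a single test expectation.
type DbType = 'MySQL' | 'PostgreSQL';

const duplicateKeyPatterns: Record<DbType, RegExp> = {
  MySQL: /ER_DUP_ENTRY/,
  PostgreSQL: /duplicate key value violates unique constraint/,
};

function expectDuplicateKey(dbType: DbType, message: string): void {
  if (!duplicateKeyPatterns[dbType].test(message)) {
    throw new Error(`"${message}" does not look like a ${dbType} duplicate-key error`);
  }
}

expectDuplicateKey('MySQL', "ER_DUP_ENTRY: Duplicate entry 'npm-a' for key 'tokens_uk_token_key'");
expectDuplicateKey('PostgreSQL', 'duplicate key value violates unique constraint "tokens_uk_token_key"');
console.log('both database error shapes matched');
```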

app/core/service/PackageVersionFileService.ts (2)

24-24: LGTM: Clean import addition for database-agnostic error handling

The addition of isDuplicateKeyError utility import supports the transition to database-agnostic error handling, which is essential for PostgreSQL support.


276-278: Verify database-agnostic duplicate key detection

The error handling has been improved to use a database-agnostic approach. Let's verify the implementation of isDuplicateKeyError to ensure it correctly handles both MySQL and PostgreSQL cases.

✅ Verification successful

Based on the output, I can see the implementation of isDuplicateKeyError in app/repository/util/ErrorUtil.ts. The function properly handles both MySQL (ER_DUP_ENTRY) and PostgreSQL (duplicate key constraint violation) cases:

export function isDuplicateKeyError(err: any) {
  if (err.code === 'ER_DUP_ENTRY') {
    return true;
  }
  if (err.message.includes('duplicate key value violates unique constraint')) {
    // pg: duplicate key value violates unique constraint "tasks_uk_task_id"
    // code: '23505'
    return true;
  }
}

Let's verify if there are any test cases to ensure this functionality:


Database-agnostic duplicate key detection is properly implemented

The implementation in ErrorUtil.ts correctly handles duplicate key errors for both MySQL (ER_DUP_ENTRY) and PostgreSQL (duplicate key value violates unique constraint). This is also validated by the test case in test/port/controller/TokenController/createToken.test.ts which checks for both error patterns. The error handling in PackageVersionFileService.ts is using this utility appropriately.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the implementation of isDuplicateKeyError utility
# Expected: Function should handle both MySQL and PostgreSQL duplicate key errors

# Search for the implementation and test cases
echo "Searching for isDuplicateKeyError implementation and tests..."
rg -A 10 "isDuplicateKeyError" 

# Look for related test files
echo "Searching for related test files..."
fd -e test.ts -e spec.ts | xargs rg "isDuplicateKeyError"

Length of output: 6104


Script:

#!/bin/bash
# Search for test files specifically related to ErrorUtil
fd ErrorUtil.*test -e ts -e js

# Search for any tests that might be testing this functionality
rg -B 3 -A 3 "duplicate key" test/

Length of output: 630

app/repository/PackageRepository.ts (2)

21-21: LGTM!

The import of DATABASE_TYPE is necessary for database-specific logic and follows TypeScript conventions.


443-443: LGTM!

The changes to use getTotalCountByModel are consistent with the new implementation and maintain the existing functionality.

Also applies to: 452-452

sql/ddl_mysql.sql (3)

402-418: Well-designed user authentication system

The user authentication system demonstrates good security practices:

  • Password storage with salt and integrity fields
  • Token system with CIDR whitelist restrictions
  • Support for WebAuthn credentials
  • Proper field lengths for cryptographic values

Also applies to: 356-379, 421-434


1-17: Well-structured file storage system

The binary and distribution tables demonstrate good design:

  • Efficient path-based lookups with compound indexes
  • Support for both files and directories
  • Multiple integrity check methods (shasum and integrity fields)

Also applies to: 34-47


1-434: Overall well-designed schema for a package registry

The schema demonstrates good design practices:

  • Consistent naming and structure
  • Appropriate use of indexes and constraints
  • Good balance between normalization and performance
  • Comprehensive tracking of metadata and relationships

The minor suggestions above aside, this is a solid foundation for a package registry system.

sql/ddl_postgresql.sql (1)

1-12: Well-structured base table definitions with appropriate data types

The schema demonstrates good practices:

  • Use of BIGSERIAL for auto-incrementing primary keys ensures sufficient range for high-volume data
  • Consistent timestamp(3) precision across all temporal fields
  • Standard tracking fields (gmt_create, gmt_modified) on all tables

Also applies to: 31-39, 53-63

test/port/controller/PackageVersionFileController/listFiles.test.ts (3)

7-7: LGTM: Import statement for database configuration.

The added import is necessary to support the conditional test assertions based on database type.


334-336: Similar simplified assertions in subdirectory test.

The same pattern of simplified assertions appears in the /id directory test case.

Consider applying the same enhanced verification suggested above to maintain consistency in test coverage across all directory levels.


188-190: 🛠️ Refactor suggestion

Verify the simplified PostgreSQL assertions.

The test only verifies the number of files (20) for PostgreSQL but skips detailed file structure verification. This could miss important metadata differences or ordering issues.

Consider running this script to verify file listing behavior consistency:

Consider enhancing the test coverage for PostgreSQL by:

  1. Verifying file metadata structure remains consistent
  2. Ensuring file ordering is deterministic
  3. Adding explicit test cases for PostgreSQL-specific edge cases

Example:

if (database.type === DATABASE_TYPE.PostgreSQL) {
  assert.equal(res.body.files.length, 20);
  // Verify essential metadata properties exist
  const file = res.body.files[0];
  assert(file.path, 'File path should exist');
  assert(file.type, 'File type should exist');
  assert(file.contentType, 'Content type should exist');
  // Verify ordering is deterministic
  const paths = res.body.files.map(f => f.path);
  assert.deepEqual(paths, [...paths].sort(), 'Files should be sorted');
}
✅ Verification successful

Let me gather more information about the file listing implementation to understand if there are any database-specific differences.


Let me try a different approach to find the implementation details.


Let me check the full implementation of the controller and any related services.


Based on the code review, I can now provide a final response:

The simplified PostgreSQL assertions are appropriate and not a concern.

The different assertions between PostgreSQL and other databases in the test are justified because:

  1. The implementation in PackageVersionFileService and PackageVersionFileController shows that the file listing logic is database-agnostic. The core functionality of listing files is handled through the same service layer regardless of the database type.

  2. The test's focus on file count (20 for root, 10 for /id directory) is sufficient because:

    • File metadata structure is enforced at the entity level through PackageVersionFile
    • File ordering is handled consistently in the service layer
    • The actual file content and structure verification is covered by the non-PostgreSQL test cases
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for differences in file listing implementation between database types
# Look for database-specific handling in the file listing logic

# Search for file listing implementation
ast-grep --pattern 'async function listFiles($$$) {
  $$$
}'

# Look for database-specific queries or handling
rg -A 5 'DATABASE_TYPE.PostgreSQL'

Length of output: 3715


Script:

#!/bin/bash
# Search for file listing implementation in controllers and services
ast-grep --pattern 'class PackageVersionFileController {
  $$$
  async listFiles($$$) {
    $$$
  }
  $$$
}'

# Look for any database-specific queries or handling in file listing
rg -B 2 -A 5 'listFiles.*files.*' --type ts

Length of output: 156


Script:

#!/bin/bash
# Search for file listing implementation with broader patterns
rg -B 3 -A 10 'async listFiles' --type ts

# Look for any database-specific file operations
rg -B 2 -A 5 'files.*PostgreSQL|PostgreSQL.*files' --type ts

# Find the controller file directly
fd -t f "PackageVersionFileController.ts"

Length of output: 1609


Script:

#!/bin/bash
# Get the full controller implementation
cat app/port/controller/PackageVersionFileController.ts

# Look for any file listing service implementations
fd -t f "FileService.ts" -x cat {}

# Search for database-specific file queries
rg -B 3 -A 10 'SELECT.*files|FROM.*files' --type ts

Length of output: 22629

app/core/service/PackageManagerService.ts (1)

13-13: LGTM! Import changes improve code maintainability.

The addition of pMap and isDuplicateKeyError imports enhances the codebase by:

  • Supporting concurrent dependency version checking
  • Abstracting error handling logic into a dedicated utility

Also applies to: 27-27
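For readers unfamiliar with p-map: it maps over a collection of async tasks with a concurrency cap, which is what makes the concurrent dependency-version checks safe. The stand-alone helper below imitates that behavior without the dependency — the real service imports `pMap` from the `p-map` package, so this is illustration only.

```typescript
// A minimal sketch of what p-map provides: map items through an async
// function, running at most `concurrency` tasks at a time.
async function mapWithConcurrency<T, R>(
  items: T[],
  mapper: (item: T) => Promise<R>,
  concurrency: number,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await mapper(items[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(concurrency, items.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```

Capping concurrency matters here because each dependency check may hit the registry database, and an unbounded `Promise.all` over a large dependency list would open too many simultaneous queries.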

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (7)
Dockerfile (1)

Line range hint 7-9: Consider build process improvements

Several suggestions to improve the build process:

  1. Use multi-stage builds to reduce final image size
  2. Consider using npm ci instead of npminstall for more deterministic builds
  3. The Chinese npm mirror might cause availability issues for deployments in other regions

Consider this improved structure:

-RUN npm install -g npminstall --registry=https://registry.npmmirror.com \
-  && npminstall -c \
-  && npm run tsc
+COPY package*.json ./
+RUN npm ci
+COPY . .
+RUN npm run tsc
docs/deploy-in-docker.md (3)

149-185: Consider using Docker Compose for better maintainability

The Docker run commands for MySQL and PostgreSQL are nearly identical, with only the database type and port being different. Consider adding a docker-compose.yml example that supports both database types to:

  • Reduce documentation duplication
  • Make it easier to manage service dependencies
  • Provide a better local development experience

Example structure:

version: '3.8'
services:
  app:
    build: .
    environment:
      CNPMCORE_DATABASE_TYPE: ${DB_TYPE:-MySQL}
      CNPMCORE_DATABASE_NAME: cnpmcore
      # ... other env vars
    ports:
      - "7001:7001"
    depends_on:
      - db
      - redis

  db:
    image: ${DB_TYPE:-mysql}:latest
    # ... database config

  redis:
    image: redis:latest
    # ... redis config

205-205: Consider security implications of exposed credentials

While these appear to be demo credentials, it's recommended to:

  1. Add a clear warning that these are for demonstration purposes only
  2. Encourage users to change the admin password immediately after deployment
  3. Consider using placeholder text instead of actual credentials

Line range hint 16-185: Enhance documentation with troubleshooting and migration guides

Consider adding the following sections to improve user experience:

  1. Troubleshooting guide for common database connection issues
  2. Migration guide for users switching between MySQL and PostgreSQL
  3. Performance tuning recommendations for each database type
docs/elasticsearch-setup.md (3)

1-4: Improve documentation structure and consistency

The documentation mixes Chinese and English content, which may cause confusion. Consider:

  1. Maintaining consistent language throughout
  2. Adding clear section headers
  3. Organizing installation steps in a numbered list

Line range hint 4-24: Add configuration best practices and security warnings

The Elasticsearch configuration section should:

  1. Include warnings about security implications
  2. Explain memory limit considerations
  3. Document recommended production settings

Consider adding a separate section for production deployment with security best practices and performance tuning guidelines.


Line range hint 984-996: Enhance command examples and error handling

The data operation examples could be improved:

  1. Add error handling examples
  2. Include more search query examples
  3. Document response formats

Consider adding:

# Example with error handling
curl -X PUT https://r.cnpmjs.org/-/v1/search/sync/${pkgName} \
  -H "Content-Type: application/json" \
  --fail \
  || echo "Failed to sync package"

# Example with advanced search
npm search colors --registry=http://localhost:7001 \
  --searchlimit=10 \
  --parseable \
  --long
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ac0d525 and 7ee2846.

📒 Files selected for processing (3)
  • Dockerfile (1 hunks)
  • docs/deploy-in-docker.md (3 hunks)
  • docs/elasticsearch-setup.md (4 hunks)
🔇 Additional comments (1)
docs/deploy-in-docker.md (1)

16-32: 🛠️ Refactor suggestion

Verify removal of old MySQL-specific environment variables

The documentation has been updated with new generic database environment variables, but we should verify that old MySQL-specific variables (e.g., CNPMCORE_MYSQL_*) are not referenced elsewhere in the codebase.

Add database initialization steps

Consider adding instructions for initializing the database schema for both MySQL and PostgreSQL. This would help users set up their databases correctly.

✅ Verification successful

Based on the search results, I can now provide a final response:

Old MySQL-specific environment variables have been properly handled

The codebase has been properly updated to handle both MySQL and PostgreSQL configurations. The database configuration in config/database.ts shows proper fallback handling:

  • New generic CNPMCORE_DATABASE_* variables are used first
  • Falls back to old CNPMCORE_MYSQL_* variables
  • Finally falls back to plain MYSQL_* variables

The documentation accurately reflects these changes by showing configuration examples for both MySQL and PostgreSQL. No cleanup of old variables is needed as they are properly handled in the fallback chain.
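The fallback chain described above can be sketched as follows. The variable prefixes match those named in this thread, but the helper name `resolveDatabaseEnv` and the exact resolution logic in config/database.ts are assumptions for illustration.

```typescript
// Resolve a database config value, preferring the new generic variables,
// then the legacy CNPMCORE_MYSQL_* names, then plain MYSQL_* names.
function resolveDatabaseEnv(
  env: Record<string, string | undefined>,
  key: string,
): string | undefined {
  return env[`CNPMCORE_DATABASE_${key}`]
    ?? env[`CNPMCORE_MYSQL_${key}`]
    ?? env[`MYSQL_${key}`];
}
```

This ordering is why no cleanup of the old variables is needed: existing MySQL deployments keep working, while new deployments can use the database-agnostic names for either backend.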

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any remaining references to old MySQL-specific environment variables
rg "CNPMCORE_MYSQL_" --type-add 'config:*.{js,json,yaml,yml}' -t config

Length of output: 71


Script:

#!/bin/bash
# Let's do a broader search for MySQL-related configuration in the codebase
rg -i "mysql" --type=js --type=json --type=yaml --type=yml -l

# Also search for any environment variable prefixes related to database
rg "CNPMCORE.*DATABASE|CNPMCORE.*MYSQL|CNPMCORE.*POSTGRESQL" --type=js --type=json --type=yaml --type=yml

# Check config files specifically
fd -e js -e json -e yaml -e yml -e md "config|conf" --exec cat {}

Length of output: 918


Script:

#!/bin/bash
# Let's try a more specific search with ripgrep, fixing the file type flags
rg -i "mysql" -t js -t json -t yaml

# Search for database-related environment variables in the codebase
rg "CNPMCORE_DATABASE_" -t js -t json -t yaml

# Look specifically in the config directory and any configuration files
fd "config" -t d -d 2
fd "config" -t f -e js -e ts -e json -e yaml --exec cat {}

Length of output: 132340


Script:

#!/bin/bash
# Let's search for any database-related configuration or environment variables in the codebase
rg -i "CNPMCORE.*DATABASE|CNPMCORE.*MYSQL|CNPMCORE.*POSTGRESQL" -t js -t ts -t md

# Also search for database initialization or schema setup files
fd -e sql -e prisma -e schema

# Look for any database configuration in config files
fd "config" -t f -e js -e ts | xargs cat

Length of output: 17254

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 7ee2846 and 6871ad0.

📒 Files selected for processing (5)
  • .github/workflows/nodejs.yml (3 hunks)
  • Dockerfile (1 hunks)
  • prepare-database-postgresql.sh (1 hunks)
  • sql/ddl_postgresql.sql (1 hunks)
  • sql/postgresql/3.68.0.sql (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • Dockerfile
  • prepare-database-postgresql.sh
  • sql/ddl_postgresql.sql
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/nodejs.yml

70-70: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🔇 Additional comments (9)
.github/workflows/nodejs.yml (4)

41-68: LGTM! Well-configured CI environment

The environment configuration is comprehensive with:

  • Good Node.js version matrix coverage
  • Properly configured PostgreSQL connection parameters

69-72: 🛠️ Refactor suggestion

Update Codecov action to latest version

The Codecov action version is outdated.

Apply this diff:

-      uses: codecov/codecov-action@v3
+      uses: codecov/codecov-action@v4

Likely invalid or redundant comment.

🧰 Tools
🪛 actionlint (1.7.4)

70-70: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)


34-39: 🛠️ Refactor suggestion

Pin Redis version for consistent CI runs

For consistency across environments, specify an exact Redis version.

Apply this diff:

-        image: redis
+        image: redis:7.2

Likely invalid or redundant comment.


19-33: ⚠️ Potential issue

Pin PostgreSQL version and use secrets for credentials

Two important improvements needed:

  1. Use a specific PostgreSQL version for consistent CI runs
  2. Use GitHub secrets for database credentials

Apply this diff:

      postgres:
-        image: postgres
+        image: postgres:15.4
        env:
-          POSTGRES_PASSWORD: postgres
+          POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}

Likely invalid or redundant comment.

sql/postgresql/3.68.0.sql (5)

3-3: LGTM! Consistent timestamp precision

Good practice: All tables consistently use timestamp(3) for temporal columns.

Also applies to: 33-33, 55-56, 81-82


651-663: 🛠️ Refactor suggestion

Add index on email field for user lookups

The users table should have an index on the email field to optimize authentication and password reset flows.

Add this index:

+CREATE INDEX users_idx_email ON users (email);

Likely invalid or redundant comment.


474-474: ⚠️ Potential issue

Enhance security for registry auth_token

The auth_token is stored as plain text. Consider encrypting sensitive credentials.

Apply this diff:

-  auth_token varchar(256) DEFAULT NULL
+  auth_token_encrypted varchar(512) DEFAULT NULL

Likely invalid or redundant comment.


441-449: 🛠️ Refactor suggestion

Add TTL mechanism for proxy cache

The proxy_caches table lacks expiration/TTL functionality, which could lead to serving stale data.

Add TTL columns:

   file_type varchar(30) NOT NULL DEFAULT '',
-  file_path varchar(512) NOT NULL DEFAULT ''
+  file_path varchar(512) NOT NULL DEFAULT '',
+  expires_at timestamp(3) NOT NULL,
+  last_accessed_at timestamp(3) NOT NULL

Likely invalid or redundant comment.


682-691: ⚠️ Potential issue

Add unique constraint on credential_id

The credential_id in webauthn_credentials table should be unique to prevent credential reuse across users.

Add this unique index:

+CREATE UNIQUE INDEX webauthn_credentials_uk_credential_id ON webauthn_credentials (credential_id);

Likely invalid or redundant comment.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (8)
docs/deploy-in-docker.md (1)

16-21: Enhance security recommendations for database configuration

While the documentation clearly shows how to configure the database connection, it should emphasize security best practices:

  1. Use environment variables or secrets management for sensitive data
  2. Avoid using default ports in production
  3. Consider adding SSL/TLS configuration options

Add the following security-related environment variables to both MySQL and PostgreSQL sections:

 CNPMCORE_DATABASE_PASSWORD=your-db-user-password
+CNPMCORE_DATABASE_SSL=true
+CNPMCORE_DATABASE_SSL_CA=/path/to/ca.pem

Also applies to: 27-32

sql/ddl_postgresql.sql (4)

1-12: Consider adding ON DELETE CASCADE constraints for better referential integrity.

The binaries table looks well-structured, but consider adding foreign key constraints to maintain data integrity across related tables.


237-275: Consider adding partial indexes for optimizing download statistics queries.

The package_version_downloads table stores daily download counts. Consider adding partial indexes to optimize common queries:

  1. Queries for specific time ranges
  2. Queries for top downloaded packages
 CREATE INDEX package_version_downloads_idx_year_month ON package_version_downloads (year_month);
 CREATE INDEX package_version_downloads_idx_packageid_yearmonth ON package_version_downloads (package_id, year_month);
+CREATE INDEX package_version_downloads_idx_recent ON package_version_downloads (year_month, package_id)
+  WHERE year_month >= date_part('year', CURRENT_DATE) * 100 + date_part('month', CURRENT_DATE) - 1;

410-422: Add indexes for common query patterns on packages table.

The packages table would benefit from additional indexes to optimize common queries:

  1. Queries filtering private packages
  2. Queries for packages by registry
 CREATE UNIQUE INDEX packages_uk_package_id ON packages (package_id);
 CREATE UNIQUE INDEX packages_uk_scope_name ON packages (scope, name);
+CREATE INDEX packages_idx_registry ON packages (registry_id) WHERE registry_id IS NOT NULL;
+CREATE INDEX packages_idx_private ON packages (is_private) WHERE is_private = true;

574-591: Consider using PostgreSQL-specific features for token management.

The tokens table could benefit from PostgreSQL-specific features:

  1. Use CIDR type for IP whitelist instead of JSON
  2. Use TIMESTAMPTZ for timezone-aware timestamps
-  cidr_whitelist json NOT NULL,
+  cidr_whitelist cidr[] NOT NULL,
-  expired_at timestamp(3) DEFAULT NULL,
-  last_used_at timestamp(3) DEFAULT NULL
+  expired_at timestamptz DEFAULT NULL,
+  last_used_at timestamptz DEFAULT NULL
test/port/controller/PackageVersionFileController/listFiles.test.ts (2)

188-324: Consider extracting expected test data to fixtures.

The test contains a large amount of inline test data. Consider moving this to separate fixture files to improve maintainability.

+// test/fixtures/package_files/root_directory.json
+{
+  "mysql": {
+    "path": "/",
+    "type": "directory",
+    "files": [
+      // ... existing test data ...
+    ]
+  },
+  "postgresql": {
+    "fileCount": 20
+  }
+}

 // In test file
-if (database.type === DATABASE_TYPE.PostgreSQL) {
-  assert.equal(res.body.files.length, 20);
-} else {
-  assert.deepEqual(res.body, {
-    // ... large test data ...
-  });
-}
+const expectedData = require('../../fixtures/package_files/root_directory.json');
+if (database.type === DATABASE_TYPE.PostgreSQL) {
+  assert.equal(res.body.files.length, expectedData.postgresql.fileCount);
+} else {
+  assert.deepEqual(res.body, expectedData.mysql);
+}

334-414: Consider using test data builders for consistent test data.

Similar to the previous comment, consider using test data builders to generate consistent test data across database types.

// test/helpers/builders/PackageFileBuilder.ts
class PackageFileBuilder {
  static buildDirectoryListing(type: DatabaseType, directory: string) {
    if (type === DatabaseType.PostgreSQL) {
      return {
        fileCount: directory === '/id' ? 10 : 20
      };
    }
    return {
      // ... existing test data ...
    };
  }
}
app/core/service/PackageManagerService.ts (1)

Based on the search results, I can now provide a final response about the transaction handling verification:

Transaction handling needs improvement for PostgreSQL compatibility

The codebase shows some transaction handling patterns that need attention:

  • In PackageRepository.ts, there's a comment "FIXME: transaction is not the options" suggesting incorrect transaction handling
  • Multiple Promise.all() operations in PackageManagerService.ts lack transaction wrapping:
    • Lines 267-271: Multiple dist saves
    • Lines 534-539: Multiple dist destroys
    • Lines 708-711: Multiple manifest operations

Recommendations:

  • Wrap Promise.all() operations in PackageManagerService.ts with transactions
  • Fix transaction parameter handling in PackageRepository.ts
  • Ensure all multi-table operations are wrapped in transactions
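The recommended pattern — wrapping multi-table writes so they commit or roll back together — can be sketched generically. The `Transaction` interface and `withTransaction` helper below are hypothetical, not the actual leoric/ORM API used by cnpmcore; they only show the shape the wrapped `Promise.all` operations would take.

```typescript
interface Transaction {
  commit(): Promise<void>;
  rollback(): Promise<void>;
}

// Run `work` inside a transaction: commit on success, roll back on any
// failure, and re-throw so callers still see the original error.
async function withTransaction<T>(
  begin: () => Promise<Transaction>,
  work: (tx: Transaction) => Promise<T>,
): Promise<T> {
  const tx = await begin();
  try {
    const result = await work(tx);
    await tx.commit();
    return result;
  } catch (err) {
    await tx.rollback();
    throw err;
  }
}
```

With this shape, a publish that saves several dists could pass the transaction into each repository call, so a mid-sequence failure leaves no partial rows in either MySQL or PostgreSQL.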
🔗 Analysis chain

Line range hint 1-1000: Verify transaction handling across database operations

The service uses repositories for database operations, which is good for database abstraction. However, we should verify that transaction handling works correctly with PostgreSQL, especially for operations that modify multiple tables.

Key areas to verify:

  1. Package publishing (involves multiple tables)
  2. Package unpublishing
  3. Version removal operations
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for transaction handling in repository implementations
echo "Searching for transaction handling in repositories..."
rg -A 5 "transaction|beginTransaction|commit|rollback" --type ts

# Look for potential places where transactions might be needed
echo "Checking for potential transaction boundaries..."
ast-grep --pattern 'await Promise.all([
  $$$
])'

Length of output: 20903

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6871ad0 and dae1814.

📒 Files selected for processing (16)
  • .github/workflows/nodejs.yml (3 hunks)
  • DEVELOPER.md (5 hunks)
  • Dockerfile (1 hunks)
  • app/core/service/PackageManagerService.ts (3 hunks)
  • app/core/service/PackageVersionFileService.ts (2 hunks)
  • app/repository/TaskRepository.ts (2 hunks)
  • app/repository/util/ErrorUtil.ts (1 hunks)
  • docs/deploy-in-docker.md (3 hunks)
  • docs/elasticsearch-setup.md (4 hunks)
  • package.json (5 hunks)
  • prepare-database-postgresql.sh (1 hunks)
  • sql/ddl_postgresql.sql (1 hunks)
  • sql/postgresql/3.68.0.sql (1 hunks)
  • test/port/controller/PackageVersionFileController/listFiles.test.ts (4 hunks)
  • test/port/controller/TokenController/createToken.test.ts (1 hunks)
  • test/repository/ChangeRepository.test.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • Dockerfile
  • test/port/controller/TokenController/createToken.test.ts
  • test/repository/ChangeRepository.test.ts
  • app/core/service/PackageVersionFileService.ts
  • app/repository/util/ErrorUtil.ts
  • prepare-database-postgresql.sh
  • app/repository/TaskRepository.ts
  • docs/elasticsearch-setup.md
  • sql/postgresql/3.68.0.sql
🧰 Additional context used
🪛 actionlint (1.7.4)
.github/workflows/nodejs.yml

70-70: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🔇 Additional comments (12)
package.json (2)

116-116: 🛠️ Refactor suggestion

Update pg package to latest version

The pg package version 8.13.1 is described here as outdated, with 8.11.3 suggested as the latest release; however, 8.11.3 actually predates 8.13.1, so the proposed diff would be a downgrade rather than an update.

-    "pg": "^8.13.1",
+    "pg": "^8.11.3",

Likely invalid or redundant comment.


96-96: Verify compatibility with egg upgrade

The egg dependency has been upgraded from ^3.9.2 to ^3.29.0, which is a significant jump in versions.

✅ Verification successful

Let me analyze the changes between egg v3.9.2 and v3.29.0 by checking more details about breaking changes.


Let me check the configuration files and test utilities to understand how egg is being used.


Based on the analysis of the changes between egg v3.9.2 and v3.29.0, and the codebase usage, I can now provide the final response:

Egg upgrade appears safe to proceed

The major changes in egg between v3.9.2 and v3.29.0 are related to:

  • HTTP client improvements (HTTP/2 support, safeCurl)
  • Body parser enhancements
  • Server timeout configurations

Your codebase already has compatible configurations:

  • Using useHttpClientNext: true and allowH2: true in httpclient config
  • Custom body parser configuration with specific limits and ignore paths
  • No direct dependency on deprecated features
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for breaking changes in egg upgrade
gh api graphql -f query='
{
  repository(owner: "eggjs", name: "egg") {
    releases(first: 10, orderBy: {field: CREATED_AT, direction: DESC}) {
      nodes {
        tagName
        description
      }
    }
  }
}'

Length of output: 4053


Script:

#!/bin/bash
# Check for egg usage in the codebase
rg -l "require\(['\"](egg|@eggjs)" -g '!package*.json'

# Check for egg configuration files
fd "config\.(default|local|prod|unittest)\..*" --type f

# Check for egg plugins in package.json
cat package.json | jq '.dependencies | keys[] | select(contains("egg-"))'

Length of output: 612


Script:

#!/bin/bash
# Check egg configuration and usage
cat config/config.default.ts
cat test/TestUtil.ts

# Check for specific egg features that had changes
rg "httpclient|httpClient|bodyParser|serverTimeout" -g '!node_modules'

Length of output: 22064

.github/workflows/nodejs.yml (3)

19-33: Pin PostgreSQL version and secure credentials

Two important improvements needed for the PostgreSQL service configuration:

  1. Specify an exact PostgreSQL version to ensure CI consistency
  2. Use GitHub secrets for the database password

34-39: Pin Redis version

For consistency across CI runs, specify an exact Redis version.


69-72: 🛠️ Refactor suggestion

Update codecov-action to v4

The codecov-action should be updated to the latest version for improved stability and features.

-      uses: codecov/codecov-action@v3
+      uses: codecov/codecov-action@v4

Likely invalid or redundant comment.

🧰 Tools
🪛 actionlint (1.7.4)

70-70: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

DEVELOPER.md (4)

5-5: LGTM! Clear documentation of database support.

The documentation clearly states that either MySQL or PostgreSQL can be used as the database backend.


17-17: LGTM! Clear reference to setup documentation.

The documentation properly links to both MySQL and PostgreSQL setup guides.


Line range hint 27-66: LGTM! Well-structured development setup instructions.

The documentation provides clear and separate sections for MySQL and PostgreSQL development environments, including initialization and service startup commands.


292-294: LGTM! Clear model modification requirements.

The documentation correctly outlines the files that need to be modified when making model changes, ensuring consistency across both MySQL and PostgreSQL schemas.

test/port/controller/PackageVersionFileController/listFiles.test.ts (1)

7-7: LGTM! Proper import of database configuration.

The import of database configuration is correctly added to support database-specific test assertions.

app/core/service/PackageManagerService.ts (2)

13-13: LGTM: Good abstraction for database-agnostic error handling

The addition of isDuplicateKeyError utility and pMap improves the code by:

  • Making error handling database-agnostic (supporting both MySQL and PostgreSQL)
  • Enabling concurrent processing of dependency version checks

Also applies to: 27-27


Line range hint 275-279: LGTM: Robust database-agnostic error handling

The error handling implementation:

  • Correctly uses the database-agnostic isDuplicateKeyError utility
  • Maintains clear error messages for duplicate versions
  • Properly propagates other errors

@fengmk2 fengmk2 merged commit f240799 into master Nov 30, 2024
16 of 18 checks passed
@fengmk2 fengmk2 deleted the pg branch November 30, 2024 13:56
fengmk2 pushed a commit that referenced this pull request Nov 30, 2024
[skip ci]

## [3.70.0](v3.69.0...v3.70.0) (2024-11-30)

### Features

* support PostgreSQL ([#733](#733)) ([f240799](f240799))
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PostgreSQL and SQLite support
1 participant