feat: proxy mode [sql changed] #571

Merged
merged 53 commits into from
Oct 13, 2024

Conversation

hezhengxu2018
Collaborator

@hezhengxu2018 hezhengxu2018 commented Aug 16, 2023

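#366 When proxy mode is enabled and a dependency cannot be found, the upstream registry's manifest is returned directly and cached on NFS. When a requested tgz file does not exist, it is fetched from the upstream registry and returned, and a sync task is created for that version. Cached manifest files are checked for updates every hour, so that a new version published upstream does not 404 because the cache is stale.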
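
The periodic refresh described above can be pictured as a staleness filter over the cache index. This is only an illustrative sketch: the `CachedManifestEntry` shape, its `updatedAt` field, and `selectStaleEntries` are hypothetical names, not the worker code from this PR.

```typescript
// Hypothetical index row for one cached manifest file (not the PR's real entity).
interface CachedManifestEntry {
  fullname: string;
  fileType: string;
  updatedAt: number; // epoch milliseconds of the last successful refresh
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Return the entries whose cached copy is at least `maxAgeMs` old and
// should therefore be fetched again from the upstream registry.
function selectStaleEntries(
  entries: CachedManifestEntry[],
  now: number,
  maxAgeMs: number = ONE_HOUR_MS,
): CachedManifestEntry[] {
  return entries.filter(entry => now - entry.updatedAt >= maxAgeMs);
}

const checkTime = 10 * ONE_HOUR_MS;
const staleEntries = selectStaleEntries(
  [
    { fullname: 'foo', fileType: 'full', updatedAt: checkTime - 2 * ONE_HOUR_MS },
    { fullname: 'bar', fileType: 'full', updatedAt: checkTime - 5 * 60 * 1000 },
  ],
  checkTime,
);
```

Passing the maximum age as a parameter keeps the sketch neutral about the actual interval, which changes over the course of this thread.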

Summary by CodeRabbit

  • New Features

    • Introduced proxy cache management for package manifests and versions.
    • Added new HTTP methods for managing proxy caches.
    • Implemented scheduled workers for updating and synchronizing proxy cache.
  • Updates

    • Expanded SyncMode enum to include a new value proxy.
    • Updated constants with PROXY_CACHE_DIR_NAME and ABBREVIATED_META_TYPE.
  • Tests

    • Added comprehensive test cases for ProxyCacheService, ProxyCacheRepository, and related controllers.
    • Verified functionality of scheduled workers for proxy cache updates and synchronization.
    • Enhanced testing coverage for handling package downloads in proxy mode.

@hezhengxu2018 hezhengxu2018 marked this pull request as draft August 16, 2023 06:31
@codecov

codecov bot commented Aug 16, 2023

Codecov Report

Attention: Patch coverage is 97.43276% with 21 lines in your changes missing coverage. Please review.

Project coverage is 96.83%. Comparing base (bd49917) to head (b07a17a).
Report is 9 commits behind head on master.

Files with missing lines Patch % Lines
app/core/service/ProxyCacheService.ts 96.44% 9 Missing ⚠️
app/port/schedule/SyncProxyCacheWorker.ts 91.07% 5 Missing ⚠️
app/repository/ProxyCacheRepository.ts 95.16% 3 Missing ⚠️
app/port/controller/ProxyCacheController.ts 98.71% 2 Missing ⚠️
app/port/schedule/CheckProxyCacheUpdateWorker.ts 96.22% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #571      +/-   ##
==========================================
+ Coverage   96.81%   96.83%   +0.02%     
==========================================
  Files         181      188       +7     
  Lines       18003    18799     +796     
  Branches     2336     2466     +130     
==========================================
+ Hits        17429    18204     +775     
- Misses        574      595      +21     

@hezhengxu2018 hezhengxu2018 changed the title [WIP] feat: proxy mode feat: proxy mode Aug 18, 2023
@hezhengxu2018 hezhengxu2018 marked this pull request as ready for review August 18, 2023 02:44
@hezhengxu2018
Collaborator Author

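Proxy mode currently has no way to configure authentication headers for the upstream registry. The auth header should be stored in the database as a property of the registry and be editable through an API.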

@elrrrrrrr
Member

@hezhengxu2018 🤩 Is it finished?

@hezhengxu2018
Collaborator Author

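Yes. Cached manifests are now refreshed on a schedule, but there is no API yet for managing the cache manually. Neither verdaccio nor nexus offers one, so let's confirm the basic functionality is solid first and add it afterwards.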

@hezhengxu2018 hezhengxu2018 marked this pull request as draft December 12, 2023 14:37
@hezhengxu2018 hezhengxu2018 marked this pull request as ready for review December 24, 2023 15:43
@hezhengxu2018
Collaborator Author

The cache management APIs have been added too, so the feature is now fairly complete. Please take a look @elrrrrrrr @fengmk2

@elrrrrrrr
Member

🤩 There are a lot of changes. I'll go through them in detail tomorrow 🙏🏻

@elrrrrrrr
Member

@hezhengxu2018

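Is proxyMode meant to act as a proxy, with the local registry serving as a cache for acceleration? We need to confirm whether the manifest should follow the local registry or the upstream one.

The current implementation returns the proxied result directly for manifest and tgz requests, and creates a sync task.

  1. In proxyMode, for
  2. Prefer returning the local registry's information
  3. If there is no version data locally, proxy the response from the upstream registry
  4. Accessing a tgz triggers the download and syncs only the single version of the package being accessed

If a tgz download is hit directly in proxyMode, does access have to wait until the asynchronous scheduled task catches up? Otherwise, would the manifest return only a single version? It would already be stale by the time the client queries the package information.

Could we change proxyMode to always return the upstream registry's information, to get better freshness?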

@hezhengxu2018
Collaborator Author

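Once proxyMode is enabled, every manifest returned is the upstream registry's manifest as of its last refresh. The manifest information in the database is not used. Only if the upstream registry becomes unusable and you switch back to none mode will the manifest of the versions already cached by the proxy registry be returned. While the async task has not finished syncing, the upstream tgz is returned through the reverse proxy, so users never have to wait for the async task. Only after the async task completes is the tgz read from object storage first.

The main purpose of a proxy registry is to cache results and speed up dependency installation for intranet users when access to the external network or the official npm registry is very slow or impossible. It also lets intranet users keep using already cached dependencies even when the external network is unreachable. For that reason, always returning the upstream registry probably won't work. If intranet users find that the proxy registry's cache needs updating or contains dirty data, they can use the /-/proxy-cache API to refresh or delete cache entries.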
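
The tgz behaviour described in this reply can be summarised as a small decision function. This is a sketch under stated assumptions: `decideTarballSource`, `TarballDecision`, and the `storedLocally` flag are hypothetical names used for illustration, not code from the PR.

```typescript
type TarballSource = 'object-storage' | 'upstream-proxy';

interface TarballDecision {
  source: TarballSource;    // where the bytes for this request come from
  createSyncTask: boolean;  // whether to schedule a sync of this version
}

// While the version has not been synced yet, stream it from the upstream
// registry and schedule a sync; once it is stored locally, serve it from
// object storage and do nothing else.
function decideTarballSource(storedLocally: boolean): TarballDecision {
  return storedLocally
    ? { source: 'object-storage', createSyncTask: false }
    : { source: 'upstream-proxy', createSyncTask: true };
}

const beforeSync = decideTarballSource(false);
const afterSync = decideTarballSource(true);
```

The point of the split is that the client is never blocked on the sync task: the first request is answered from upstream, and later requests benefit from the local copy.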

@hezhengxu2018 hezhengxu2018 marked this pull request as draft December 25, 2023 16:08
@hezhengxu2018
Collaborator Author

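The manifest follows the local copy, because that is how nexus does it. Proxy mode as a whole is a cache. If the network is poor and we keep using the upstream registry's index, the cache serves no purpose. That said, nexus refreshes manifests quite frequently by default, once every 30 minutes, so refreshing once a day as I have set it feels a little conservative.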

@hezhengxu2018 hezhengxu2018 marked this pull request as ready for review December 29, 2023 16:04
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Outside diff range, codebase verification and nitpick comments (6)
app/core/service/ProxyCacheService.ts (6)

54-59: Consider adding logging for block list check.

Adding a logging statement before throwing the ForbiddenError can help in diagnosing issues related to blocked packages.

+    if (this.config.cnpmcore.syncPackageBlockList.includes(fullname)) {
+      this.logger.info(`Package ${fullname} is blocked by the block list`);
       throw new ForbiddenError(`stop proxy by block list: ${JSON.stringify(this.config.cnpmcore.syncPackageBlockList)}`);
    }

104-110: Consider adding logging for cache removal.

Adding logging statements can help in tracking successful cache removals and diagnosing issues.

    await this.nfsAdapter.remove(storeKey);
    await this.proxyCacheRepository.removeProxyCache(fullname, fileType, version);
+    this.logger.info(`Successfully removed cache for ${fullname} with fileType ${fileType}`);

120-152: Consider adding logging for task execution.

Adding logging statements for the start and success of the task can help in tracking task execution and diagnosing issues.

    logs.push(`[${isoNow()}] 🚧🚧🚧🚧🚧 Start update "${fullname}-${fileType}" 🚧🚧🚧🚧🚧`);
+    this.logger.info(`Starting task execution for ${fullname} with fileType ${fileType}`);
    try {
      const cachedFiles = await this.proxyCacheRepository.findProxyCache(fullname, fileType);
      if (!cachedFiles) throw new Error('task params error, can not found record in repo.');
      cachedManifest = await this.getRewrittenManifest<typeof fileType>(fullname, fileType);
      await this.storeRewrittenManifest(cachedManifest, fullname, fileType);
      ProxyCache.update(cachedFiles);
      await this.proxyCacheRepository.saveProxyCache(cachedFiles);
    } catch (error) {
      task.error = error;
      logs.push(`[${isoNow()}] ❌ ${task.error}`);
      logs.push(`[${isoNow()}] ❌❌❌❌❌ ${fullname}-${fileType} ${version ?? ''} ❌❌❌❌❌`);
      await this.taskService.finishTask(task, TaskState.Fail, logs.join('\n'));
      this.logger.info('[ProxyCacheService.executeTask:fail] taskId: %s, targetName: %s, %s',
        task.taskId, task.targetName, task.error);
      return;
    }
    logs.push(`[${isoNow()}] 🟢 Update Success.`);
+    this.logger.info(`Successfully executed task for ${fullname} with fileType ${fileType}`);
    const isFullManifests = fileType === DIST_NAMES.FULL_MANIFESTS;
    const cachedKey = await this.cacheService.getPackageEtag(fullname, isFullManifests);
    if (cachedKey) {
      const cacheBytes = Buffer.from(JSON.stringify(cachedManifest));
      const { shasum: etag } = await calculateIntegrity(cacheBytes);
      await this.cacheService.savePackageEtagAndManifests(fullname, isFullManifests, etag, cacheBytes);
      logs.push(`[${isoNow()}] 🟢 Update Cache Success.`);
    }
    await this.taskService.finishTask(task, TaskState.Success, logs.join('\n'));

154-193: Consider adding logging for manifest retrieval.

Adding logging statements for the start and success of the manifest retrieval can help in tracking the process and diagnosing issues.

    switch (fileType) {
      case DIST_NAMES.FULL_MANIFESTS:
+        this.logger.info(`Retrieving full manifests for ${fullname}`);
        responseResult = await this.getUpstreamFullManifests(fullname);
        break;
      case DIST_NAMES.ABBREVIATED_MANIFESTS:
+        this.logger.info(`Retrieving abbreviated manifests for ${fullname}`);
        responseResult = await this.getUpstreamAbbreviatedManifests(fullname);
        break;
      case DIST_NAMES.MANIFEST:
+        this.logger.info(`Retrieving package version manifest for ${fullname} with version ${versionOrTag}`);
        responseResult = await this.getUpstreamPackageVersionManifest(fullname, versionOrTag!);
        break;
      case DIST_NAMES.ABBREVIATED:
+        this.logger.info(`Retrieving abbreviated package version manifest for ${fullname} with version ${versionOrTag}`);
        responseResult = await this.getUpstreamAbbreviatedPackageVersionManifest(fullname, versionOrTag!);
        break;
      default:
        break;
    }
    // replace tarball url
    const manifest = responseResult.data;
    const { sourceRegistry, registry } = this.config.cnpmcore;
    if (isPkgManifest(fileType)) {
      // pkg manifest
      const versionMap = manifest.versions || {};
      for (const key in versionMap) {
        const versionItem = versionMap[key];
        if (versionItem?.dist?.tarball) {
          versionItem.dist.tarball = versionItem.dist.tarball.replace(sourceRegistry, registry);
        }
      }
    } else {
      // pkg version manifest
      const distItem = manifest.dist || {};
      if (distItem.tarball) {
        distItem.tarball = distItem.tarball.replace(sourceRegistry, registry);
      }
    }
+    this.logger.info(`Successfully retrieved and rewrote manifest for ${fullname}`);
    return manifest;
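
The tarball rewriting step in the excerpt above can be isolated into a small, self-contained sketch. The interfaces below are trimmed-down stand-ins for real registry manifests, and the registry URLs are placeholder values chosen for illustration.

```typescript
// Minimal manifest shapes for illustration; real manifests carry many more fields.
interface Dist { tarball?: string }
interface VersionManifest { dist?: Dist }
interface PackageManifest { versions?: Record<string, VersionManifest> }

// Point every tarball URL at the local registry instead of the upstream one,
// using the same String.replace approach as the excerpt.
function rewriteTarballs(
  manifest: PackageManifest,
  sourceRegistry: string,
  registry: string,
): PackageManifest {
  const versionMap = manifest.versions || {};
  for (const key of Object.keys(versionMap)) {
    const dist = versionMap[key].dist;
    if (dist && dist.tarball) {
      dist.tarball = dist.tarball.replace(sourceRegistry, registry);
    }
  }
  return manifest;
}

const rewritten = rewriteTarballs(
  {
    versions: {
      '1.0.0': { dist: { tarball: 'https://registry.npmjs.org/foo/-/foo-1.0.0.tgz' } },
      '1.0.1': {}, // a version without dist is left untouched
    },
  },
  'https://registry.npmjs.org',
  'http://localhost:7001',
);
```

Note that `String.replace` with a string pattern replaces only the first occurrence, which is sufficient when the upstream origin appears once at the start of the URL.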

195-205: Consider adding logging for manifest storage.

Adding logging statements for the start and success of the storage process can help in tracking the process and diagnosing issues.

    let storeKey: string;
    if (isPkgManifest(fileType)) {
      storeKey = `/${PROXY_CACHE_DIR_NAME}/${fullname}/${fileType}`;
    } else {
      const version = manifest.version;
      storeKey = `/${PROXY_CACHE_DIR_NAME}/${fullname}/${version}/${fileType}`;
    }
+    this.logger.info(`Storing rewritten manifest for ${fullname} with fileType ${fileType}`);
    const nfsBytes = Buffer.from(JSON.stringify(manifest));
    await this.nfsAdapter.uploadBytes(storeKey, nfsBytes);
+    this.logger.info(`Successfully stored rewritten manifest for ${fullname} with fileType ${fileType}`);
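
The store key layout in the excerpt above can be tried out in isolation. In this sketch the value of `PROXY_CACHE_DIR_NAME` and the file type strings are assumptions made up for illustration; only the path structure is taken from the excerpt.

```typescript
// Assumed value; the PR defines this constant but its value is not shown here.
const PROXY_CACHE_DIR_NAME = 'proxy-cache-packages';

// Assumed file type names standing in for the DIST_NAMES enum members.
type FileType =
  | 'full_manifests.json'
  | 'abbreviated_manifests.json'
  | 'package.json'
  | 'abbreviated.json';

// Package-level manifests cover all versions; the other two are per version.
function isPkgManifest(fileType: FileType): boolean {
  return fileType === 'full_manifests.json' || fileType === 'abbreviated_manifests.json';
}

// Package-level files live directly under the package directory,
// version-level files get an extra directory named after the version.
function buildStoreKey(fullname: string, fileType: FileType, version?: string): string {
  if (isPkgManifest(fileType)) {
    return `/${PROXY_CACHE_DIR_NAME}/${fullname}/${fileType}`;
  }
  if (!version) {
    throw new Error('version is required for a version-level manifest');
  }
  return `/${PROXY_CACHE_DIR_NAME}/${fullname}/${version}/${fileType}`;
}

const pkgKey = buildStoreKey('@scope/foo', 'full_manifests.json');
const versionKey = buildStoreKey('foo', 'package.json', '1.2.3');
```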

207-231: Consider adding logging for proxy response retrieval.

Adding logging statements for the start and success of the proxy response retrieval can help in tracking the process and diagnosing issues.

    const registry = this.npmRegistry.registry;
    const remoteAuthToken = await this.registryManagerService.getAuthTokenByRegistryHost(registry);
    const authorization = this.npmRegistry.genAuthorizationHeader(remoteAuthToken);

    const url = `${this.npmRegistry.registry}${ctx.url}`;

+    this.logger.info(`Retrieving proxy response for ${url}`);
    const res = await this.httpclient.request(url, {
      timing: true,
      followRedirect: true,
      retry: 3,
      dataType: 'stream',
      timeout: 10000,
      compressed: true,
      ...options,
      headers: {
        accept: ctx.header?.accept,
        'user-agent': ctx.header?.['user-agent'],
        authorization,
        'x-forwarded-for': ctx?.ip,
        via: `1.1, ${this.config.cnpmcore.registry}`,
      },
    }) as HttpClientResponse;
    this.logger.info('[ProxyCacheService:getProxyStreamResponse] %s, status: %s', url, res.status);
+    this.logger.info(`Successfully retrieved proxy response for ${url}`);
    return res;
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between f2c84ed and 732be72.

Files selected for processing (2)
  • app/core/service/ProxyCacheService.ts (1 hunks)
  • test/core/service/ProxyCacheService.test.ts (1 hunks)
Files skipped from review as they are similar to previous changes (1)
  • test/core/service/ProxyCacheService.test.ts
Additional context used
Learnings (1)
app/core/service/ProxyCacheService.ts (1)
Learnt from: hezhengxu2018
PR: cnpm/cnpmcore#571
File: app/core/service/ProxyCacheService.ts:161-209
Timestamp: 2024-07-19T06:26:05.533Z
Learning: In the `ProxyCacheService` class, requests to the upstream repository already throw an error if they fail, making additional checks for the response status unnecessary.
Additional comments not posted (4)
app/core/service/ProxyCacheService.ts (4)

1-27: Imports and utility functions look good!

The imports and utility functions (isoNow, isPkgManifest) are appropriate and correctly implemented.


112-114: Method looks good!

The createTask method is correctly implemented.


116-118: Method looks good!

The findExecuteTask method is correctly implemented.


234-250: Methods look good!

The private methods for retrieving upstream manifests are correctly implemented.

Comment on lines +79 to +102
// used by GET /:fullname/:versionOrTag
async getPackageVersionManifest(fullname: string, fileType: DIST_NAMES.ABBREVIATED | DIST_NAMES.MANIFEST, versionOrTag: string): Promise<AbbreviatedPackageJSONType|PackageJSONType> {
  let version;
  if (semverValid(versionOrTag)) {
    version = versionOrTag;
  } else {
    const pkgManifest = await this.getPackageManifest(fullname, DIST_NAMES.ABBREVIATED_MANIFESTS);
    const distTags = pkgManifest['dist-tags'] || {};
    version = distTags[versionOrTag] ? distTags[versionOrTag] : versionOrTag;
  }
  const cachedStoreKey = (await this.proxyCacheRepository.findProxyCache(fullname, fileType, version))?.filePath;
  if (cachedStoreKey) {
    const nfsBytes = await this.nfsAdapter.getBytes(cachedStoreKey);
    const nfsString = Buffer.from(nfsBytes!).toString();
    return JSON.parse(nfsString) as PackageJSONType | AbbreviatedPackageJSONType;
  }
  const manifest = await this.getRewrittenManifest(fullname, fileType, versionOrTag);
  this.backgroundTaskHelper.run(async () => {
    await this.storeRewrittenManifest(manifest, fullname, fileType);
    const cachedFiles = ProxyCache.create({ fullname, fileType, version });
    await this.proxyCacheRepository.saveProxyCache(cachedFiles);
  });
  return manifest;
}
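
The version-or-tag resolution at the top of this method can be sketched on its own. To stay self-contained, the example below replaces the `semver` package's validity check with a simplified regular expression; that regex is an assumption for illustration and is less strict than real semver validation.

```typescript
// Simplified stand-in for a real semver validity check: accepts x.y.z with an
// optional prerelease and build suffix. It is NOT a full semver validator.
const SEMVER_LIKE = /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/;

// If the input already looks like a version, use it as-is. Otherwise treat it
// as a dist-tag and look it up, falling back to the raw input when the tag is
// unknown (mirroring the fallback in the excerpt).
function resolveVersion(versionOrTag: string, distTags: Record<string, string>): string {
  if (SEMVER_LIKE.test(versionOrTag)) {
    return versionOrTag;
  }
  return distTags[versionOrTag] ? distTags[versionOrTag] : versionOrTag;
}

const tags: Record<string, string> = { latest: '2.0.0', next: '3.0.0-beta.1' };
const fromVersion = resolveVersion('1.2.3', tags);
const fromTag = resolveVersion('latest', tags);
const fromUnknownTag = resolveVersion('canary', tags);
```

Resolving the tag before the cache lookup means the cache is keyed by concrete version, so `latest` and the version it points to share one cached file.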
Contributor

Improve error handling and logging.

Consider adding error handling for the background task and logging the start and success of the task.

    const manifest = await this.getRewrittenManifest(fullname, fileType, versionOrTag);
    this.backgroundTaskHelper.run(async () => {
+      try {
+        this.logger.info(`Starting background task to store manifest for ${fullname}`);
        await this.storeRewrittenManifest(manifest, fullname, fileType);
        const cachedFiles = ProxyCache.create({ fullname, fileType, version });
        await this.proxyCacheRepository.saveProxyCache(cachedFiles);
+        this.logger.info(`Successfully stored manifest for ${fullname}`);
+      } catch (error) {
+        this.logger.error(`Failed to store manifest for ${fullname}: ${error.message}`);
+      }
    });
    return manifest;

Comment on lines +61 to +77
async getPackageManifest(fullname: string, fileType: DIST_NAMES.FULL_MANIFESTS| DIST_NAMES.ABBREVIATED_MANIFESTS): Promise<AbbreviatedPackageManifestType|PackageManifestType> {
  const cachedStoreKey = (await this.proxyCacheRepository.findProxyCache(fullname, fileType))?.filePath;
  if (cachedStoreKey) {
    const nfsBytes = await this.nfsAdapter.getBytes(cachedStoreKey);
    const nfsString = Buffer.from(nfsBytes!).toString();
    const nfsPkgManifgest = JSON.parse(nfsString);
    return nfsPkgManifgest;
  }

  const manifest = await this.getRewrittenManifest<typeof fileType>(fullname, fileType);
  this.backgroundTaskHelper.run(async () => {
    await this.storeRewrittenManifest(manifest, fullname, fileType);
    const cachedFiles = ProxyCache.create({ fullname, fileType });
    await this.proxyCacheRepository.saveProxyCache(cachedFiles);
  });
  return manifest;
}
Contributor

Improve error handling and logging.

Consider adding error handling for the background task and logging the start and success of the task.

    const manifest = await this.getRewrittenManifest<typeof fileType>(fullname, fileType);
    this.backgroundTaskHelper.run(async () => {
+      try {
+        this.logger.info(`Starting background task to store manifest for ${fullname}`);
        await this.storeRewrittenManifest(manifest, fullname, fileType);
        const cachedFiles = ProxyCache.create({ fullname, fileType });
        await this.proxyCacheRepository.saveProxyCache(cachedFiles);
+        this.logger.info(`Successfully stored manifest for ${fullname}`);
+      } catch (error) {
+        this.logger.error(`Failed to store manifest for ${fullname}: ${error.message}`);
+      }
    });
    return manifest;

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 732be72 and 63d610b.

Files selected for processing (1)
  • app/port/controller/package/DownloadPackageVersionTar.ts (4 hunks)
Additional context used
Learnings (1)
app/port/controller/package/DownloadPackageVersionTar.ts (1)
Learnt from: hezhengxu2018
PR: cnpm/cnpmcore#571
File: test/port/controller/package/DownloadPackageVersionTarController.test.ts:304-315
Timestamp: 2024-07-19T06:21:32.040Z
Learning: In proxy mode, the response status should be 200 when the package version is not found.
Additional comments not posted (2)
app/port/controller/package/DownloadPackageVersionTar.ts (2)

68-87: LGTM! The changes enhance the package download handling in proxy mode.

The key aspects of the changes are:

  • If the package version is not found and the sync mode is proxy, it tries to get the package version from the proxy stream using the new getTgzProxyStream method.
  • It uses the PassThrough stream and pipeline function to stream the package data, improving performance and responsiveness.
  • The error handling logic is correct and aligns with the proxy mode requirements.

123-139: LGTM!

The new getTgzProxyStream method is well-structured and follows the single responsibility principle. It handles the proxy response and initiates the sync task correctly. The logging statement provides useful information for debugging.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 63d610b and ce71b9f.

Files selected for processing (1)
  • app/port/controller/package/DownloadPackageVersionTar.ts (4 hunks)
Additional context used
Learnings (1)
app/port/controller/package/DownloadPackageVersionTar.ts (1)
Learnt from: hezhengxu2018
PR: cnpm/cnpmcore#571
File: test/port/controller/package/DownloadPackageVersionTarController.test.ts:304-315
Timestamp: 2024-07-19T06:21:32.040Z
Learning: In proxy mode, the response status should be 200 when the package version is not found.
Additional comments not posted (3)
app/port/controller/package/DownloadPackageVersionTar.ts (3)

67-83: Review the error handling and proxy mode logic in the download method.

The method has a complex flow that handles different scenarios based on the sync mode. Here are a few observations and suggestions:

  1. Error Handling: The method catches errors from getPackageEntityByFullname and getPackageVersionEntity. It's crucial to ensure that these errors are logged for better traceability.
  2. Proxy Mode Logic: In proxy mode, if a package version is not found, the method attempts to fetch it using getTgzProxyStream. This is a critical part of the functionality and should be thoroughly tested to ensure it handles all edge cases correctly.
  3. Stream Handling: The use of PassThrough stream is appropriate here as it allows for modifying the stream data if necessary before piping it to the response. However, ensure that the stream is properly managed to avoid memory leaks or hanging requests.

Consider adding more detailed logging at each step of the process to help with debugging and maintaining the system. Also, ensure that unit tests cover all branches of this method, particularly the new proxy mode logic.


119-135: Review the implementation of getTgzProxyStream method.

This method is key to handling proxy mode effectively. Here are some points to consider:

  1. Response Handling: The method sets the response status and headers based on the upstream response, which is crucial for client transparency. Ensure that all relevant headers are correctly forwarded to maintain session integrity and other necessary information.
  2. Background Task Creation: The method creates a synchronization task in the background. It's important to ensure that this task is created only when necessary and that it logs sufficient information for monitoring and debugging purposes.
  3. Stream Return: The method returns the response stream from the upstream service. Ensure that this stream is handled correctly in the calling method to avoid issues with stream corruption or leaks.

Verify that the headers manipulation and status setting are covered by unit tests to ensure they behave as expected under various scenarios. Also, check that the background task creation is robust and handles failures gracefully.


33-33: Approve the CORS settings in the downloadForOptions method.

The method correctly sets the necessary CORS headers to allow cross-origin requests, which is essential for web applications that might access this API from different domains.

@fangzhengjin
Contributor

🤩 Looking forward to seeing this merged soon

@fengmk2
Member

fengmk2 commented Oct 12, 2024

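@elrrrrrrr Could you finish the review when you have time? As I understand it, this is not enabled by default, so the risk should be fairly low.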

@fengmk2 fengmk2 added the enhancement New feature or request label Oct 12, 2024
@hezhengxu2018
Collaborator Author

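After running it on our intranet for a while, it has essentially replaced what we used nexus for. I removed the config validation that was previously written in ts. When enabling proxy, remember to set redirectNotFound to false; that kind of check does not belong in ts static validation. There should be an API for modifying the configuration, with validation done through that API. Since no such API exists yet, administrators have to make sure the configuration is correct themselves.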
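
Until such a configuration API exists, the constraint mentioned here can be checked by hand or with a small helper. The following is a hypothetical sketch: `validateProxyConfig` and the `ProxyRelatedConfig` shape are illustrative names, not part of cnpmcore; only the `syncMode` and `redirectNotFound` keys and their required combination come from this thread.

```typescript
// Only the two sync modes mentioned in this thread are modelled here.
type SyncMode = 'none' | 'proxy';

interface ProxyRelatedConfig {
  syncMode: SyncMode;
  redirectNotFound: boolean;
}

// Return a list of human-readable problems; an empty list means the
// combination of settings is consistent with what the thread describes.
function validateProxyConfig(config: ProxyRelatedConfig): string[] {
  const problems: string[] = [];
  if (config.syncMode === 'proxy' && config.redirectNotFound) {
    problems.push('redirectNotFound must be false when syncMode is "proxy"');
  }
  return problems;
}

const okProblems = validateProxyConfig({ syncMode: 'proxy', redirectNotFound: false });
const badProblems = validateProxyConfig({ syncMode: 'proxy', redirectNotFound: true });
```

Returning a list rather than throwing makes it easy to report every problem at once if more rules are added later.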

@fengmk2 fengmk2 added the sql include sql change label Oct 13, 2024
@fengmk2 fengmk2 changed the title feat: proxy mode feat: proxy mode [sql changed] Oct 13, 2024
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
sql/3.67.0.sql (1)

1-12: Overall structure looks good, with minor suggestions for improvement.

The table structure is well-designed for its purpose. Here are a few suggestions for improvement:

  1. The varchar(214) length for fullname and version columns seems unusually large. Consider reducing these if possible to optimize storage:
- `fullname` varchar(214) NOT NULL DEFAULT '' COMMENT '@scope/package name',
- `version` varchar(214) COMMENT 'package version',
+ `fullname` varchar(150) NOT NULL DEFAULT '' COMMENT '@scope/package name',
+ `version` varchar(50) COMMENT 'package version',
  2. Consider using utf8mb4 instead of utf8mb3 for better Unicode support:
- ) ENGINE=InnoDB DEFAULT COLLATE utf8mb3_unicode_ci CHARSET=utf8mb3 COMMENT 'proxy mode cached files index';
+ ) ENGINE=InnoDB DEFAULT COLLATE utf8mb4_unicode_ci CHARSET=utf8mb4 COMMENT 'proxy mode cached files index';
  3. The use of gmt_create and gmt_modified columns is good for tracking creation and modification times.

  4. The unique key constraints are well-defined to prevent duplicates.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1ce994c and b07a17a.

📒 Files selected for processing (1)
  • sql/3.67.0.sql (1 hunks)
🧰 Additional context used

@fengmk2 fengmk2 merged commit 91aea0f into master Oct 13, 2024
12 of 13 checks passed
@fengmk2 fengmk2 deleted the feat/proxy-mode branch October 13, 2024 02:21
@fengmk2
Member

fengmk2 commented Oct 13, 2024

Thanks to @hezhengxu2018 for the patient contribution! It gives us a commercial-grade core feature!

fengmk2 pushed a commit that referenced this pull request Oct 13, 2024
[skip ci]

## [3.63.0](v3.62.2...v3.63.0) (2024-10-13)

### Features

* proxy mode [sql changed] ([#571](#571)) ([91aea0f](91aea0f))
@fangzhengjin
Contributor

Is there a complete config example for this proxy mode? The quality of the official docs is a bit worrying; none of the config properties are explained 😂

@hezhengxu2018
Collaborator Author

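After setting syncMode to proxy, you also need to set redirectNotFound to false. The rest of the configuration is the same as in normal mode.

There is still one bug that has not been fixed, so I recommend waiting until the pr is merged before using this feature.

The global cache refresh is scheduled for 3 a.m. every day. There is no config option for the refresh frequency yet, so if that does not meet your needs you have to change the schedule frequency by hand. Refreshing the cache of a single dependency does have an API: if you find that a dependency is stale, you can refresh it manually with PATCH /-/proxy-cache/:fullname.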
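
As a rough illustration of the manual refresh call, the sketch below only builds the request description for `PATCH /-/proxy-cache/:fullname`. The helper name, the `RefreshRequest` shape, and the local registry URL are assumptions; any authentication the endpoint requires, and whether scoped names need URL encoding, are not covered by this thread and are left out.

```typescript
interface RefreshRequest {
  method: 'PATCH';
  url: string;
}

// Build the request for refreshing one package's cached manifests.
// `registry` is the base URL of the proxy registry; trailing slashes are trimmed.
function buildRefreshRequest(registry: string, fullname: string): RefreshRequest {
  const base = registry.replace(/\/+$/, '');
  return { method: 'PATCH', url: `${base}/-/proxy-cache/${fullname}` };
}

const refreshRequest = buildRefreshRequest('http://localhost:7001/', 'lodash');
```

The resulting object can be handed to whatever HTTP client is in use, for example as the URL and `method` option of a `fetch` call.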

@fangzhengjin
Contributor

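😂 I spent a whole afternoon on it. I finally got around the tsc errors, but every request returns 404. I'm worn out; the barrier to entry is too high...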

@fangzhengjin
Contributor

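@hezhengxu2018 I ran into a problem when using proxy: publishing a private package succeeds, but querying its information or downloading it returns not found. Also, if I modify a project that already exists on npm and republish it with a suffixed version number, the version uploaded to the private registry does not show up when querying the package's version list.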

@hezhengxu2018
Collaborator Author

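After republishing, you need to call the API manually to refresh the dependency's files, or wait for the 3 a.m. refresh.

It is true that locally available dependencies are not returned first at the moment. The main consideration was that in proxy mode, when the two manifests disagree, the upstream registry should win, and the local one should win only when proxy is turned off. This does differ from how verdaccio and nexus behave. As I recall, recovering inside the not found error path felt a bit hacky when I implemented it. I can take another look.

I don't hit this problem on my company intranet because I deployed two registries there: one is the company's private dependency registry, and the other is a proxy registry that is allowed to access the external network. When the private registry returns not found, it requests the proxy registry. This keeps the two sets of dependencies separate, so public npm dependencies are not mixed with our own in the same registry, which is also easier to manage.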

@fangzhengjin
Contributor

fangzhengjin commented Oct 31, 2024

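Thanks for the answer. So the proxy registry runs with SyncMode.proxy and the private registry runs with SyncMode.none, is that right?
The private registry has the proxy registry configured as its upstream with redirectNotFound=true, so packages it cannot find are handed straight to the proxy registry.
There is too little documentation on SyncMode, so it is hard to figure out.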

@hezhengxu2018
Collaborator Author

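Yes. The private registry does no syncing and only redirects, while the proxy registry handles caching and acceleration. Something like verdaccio would also work as the proxy registry; I chose cnpmcore mainly for the transparent MySQL database and the speed that Redis brings.

I took a look and it seems this can be changed to work the same way verdaccio does. I'll try that once the bug is fixed.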
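
The two-registry setup confirmed here can be written down as a pair of config fragments. This is a sketch under assumptions: the URLs are placeholders, `behaviorOnMiss` is a hypothetical helper that only restates the behaviour described in this thread, and treating `sourceRegistry` as the key that names each registry's upstream is an assumption based on its use in the code excerpts above.

```typescript
type SyncMode = 'none' | 'proxy';

interface RegistryConfig {
  syncMode: SyncMode;
  redirectNotFound: boolean;
  sourceRegistry: string; // assumed to name the upstream registry
}

// Proxy registry: has outbound network access and caches what it fetches.
const proxyRegistryConfig: RegistryConfig = {
  syncMode: 'proxy',
  redirectNotFound: false,
  sourceRegistry: 'https://registry.npmjs.org',
};

// Private registry: holds in-house packages only and redirects misses
// to the proxy registry (placeholder intranet URL).
const privateRegistryConfig: RegistryConfig = {
  syncMode: 'none',
  redirectNotFound: true,
  sourceRegistry: 'http://proxy-registry.internal:7001',
};

type MissBehavior = 'fetch-upstream-and-cache' | 'redirect-to-upstream' | 'not-found';

// What each registry does when a requested package is not available locally.
function behaviorOnMiss(config: RegistryConfig): MissBehavior {
  if (config.syncMode === 'proxy') {
    return 'fetch-upstream-and-cache';
  }
  return config.redirectNotFound ? 'redirect-to-upstream' : 'not-found';
}
```

With this split, private packages never leave the private registry, and public packages are cached only in the proxy registry.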

@fangzhengjin
Contributor

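OK, I'll find time to try it. verdaccio doesn't feel very stable: downloads often fail under high concurrency, especially with bun, which downloads many packages at the same time, and those downloads mostly fail. The public service that cnpmcore provides performs well, which is why I want to migrate.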

@coderabbitai coderabbitai bot mentioned this pull request Nov 1, 2024