@surpercodehang (Contributor) commented Nov 5, 2025

🔗 Related Issue

Issue Link:

  • I have created and discussed the related issue
  • This is a trivial change (like a typo fix) that doesn't need an issue

📋 Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔧 Refactoring (no functional changes)
  • ⚡ Performance improvement
  • 📦 Dependency upgrade (update dependencies to newer versions)
  • 🚀 Feature enhancement (improve existing functionality without breaking changes)
  • 🧹 Code cleanup

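📝 Purpose of the Change

Currently, after the framework starts, uploading and deploying a Python plugin (one that would normally upload and deploy without problems) fails, and no error is logged.
The cause lies in the heartbeat logic. To prevent deployment failures after all plugins go offline abnormally, logic was added to re-register all plugins each time _try_heart_beat_once is called. Because _try_heart_beat_once is called frequently, this causes registration of hot-loaded plugins to fail, which in turn causes plugin deployment to fail.
This PR therefore adds the following:
1. After a heartbeat reconnect succeeds, all services must be re-registered so that none are lost. To avoid overwriting services that hot loading has just registered, a time-window guard (3× the heartbeat interval) is added.
2. Once the time since the last registration exceeds the protection window, registration is allowed (a fallback that prevents service loss).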
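
The two rules above can be read as a single elapsed-time check. The following is a minimal sketch of that reading, not the code from heart_beat_agent.py: the class name `RegistrationGuard`, its methods, and the use of millisecond timestamps are all hypothetical, and treating "exceeds" as a strict comparison is an assumption.

```python
from typing import Optional


class RegistrationGuard:
    """Hypothetical helper illustrating the time-window guard; not the PR's code."""

    def __init__(self, heart_beat_interval_ms: int = 3000, multiplier: int = 3) -> None:
        # Protection window = 3 x heartbeat interval, as described in this PR.
        self.window_ms = heart_beat_interval_ms * multiplier
        self.last_registered_ms: Optional[int] = None

    def record_registration(self, now_ms: int) -> None:
        """Call after any registration, including one made by hot loading."""
        self.last_registered_ms = now_ms

    def may_register(self, now_ms: int) -> bool:
        """Return True if a heartbeat-driven re-registration is allowed now."""
        if self.last_registered_ms is None:
            # Assumption: nothing has been registered yet, so nothing can be overwritten.
            return True
        # Assumption: "exceeds the protection window" means strictly greater than.
        return now_ms - self.last_registered_ms > self.window_ms


guard = RegistrationGuard()
guard.record_registration(now_ms=0)       # hot loading has just registered a plugin
print(guard.may_register(now_ms=3000))    # False: still inside the 9000 ms window
print(guard.may_register(now_ms=10000))   # True: the window has elapsed
```

Under this reading, a re-registration after a reconnect goes through the same check, so it cannot overwrite a registration made within the last 9 seconds.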

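📋 Brief Changelog

  • Reduced how often _try_heart_beat_once calls the _registry_fitable_addresses method in heart_beat_agent.py.
  • Key improvements
    Reconnect scenario: register immediately after a heartbeat reconnect succeeds, which fixes the service-loss problem
    Hot-load protection: a time window avoids overwriting services that hot loading has just registered
    Balance: services are not lost, and frequent registration no longer interferes with hot loading
  • How it works
    The heartbeat interval defaults to 3000 ms (3 seconds)
    Protection window = 3 × 3 seconds = 9 seconds
    The hot-load scan period is typically 3 seconds, so a 9-second protection window is enough to cover hot-load registration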
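
To make the arithmetic concrete, the sketch below lists which heartbeat ticks would skip re-registration after a hot-load registration. It is illustrative only: the function `suppressed_ticks` is hypothetical, and it assumes that ticks fall on exact multiples of the interval and that the window comparison is strict.

```python
from typing import List, Tuple


def suppressed_ticks(registered_at_ms: int,
                     interval_ms: int = 3000,
                     multiplier: int = 3) -> Tuple[List[int], int]:
    """Return (ticks that skip re-registration, first tick that may re-register)."""
    window_ms = interval_ms * multiplier  # 3 x 3000 ms = 9000 ms by default
    # Assumption: heartbeat ticks fall on exact multiples of the interval.
    tick = (registered_at_ms // interval_ms + 1) * interval_ms
    skipped: List[int] = []
    # Assumption: re-registration is allowed only once the elapsed time
    # strictly exceeds the protection window.
    while tick - registered_at_ms <= window_ms:
        skipped.append(tick)
        tick += interval_ms
    return skipped, tick


# Hot loading registers a plugin 1000 ms after start-up.
print(suppressed_ticks(registered_at_ms=1000))  # ([3000, 6000, 9000], 12000)
```

With the default values, three consecutive heartbeat ticks leave the fresh registration untouched before the fallback re-registration becomes possible.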

🧪 Verifying this Change

Test Steps

  1. Deployed locally; plugins upload and deploy successfully

Test Coverage

  • I have added unit tests
  • All existing tests pass
  • I have performed manual testing

📸 Screenshots

✅ Contributor Checklist

Please ensure your Pull Request meets the following requirements:

Basic Requirements:

  • Make sure there is a GitHub issue filed for the change (trivial changes like typos excluded)
  • Your PR addresses just this issue, without pulling in other changes - one PR resolves one issue
  • Each commit in the PR has a meaningful subject line and body

Code Quality:

  • My code follows the project's coding standards
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas

Testing Requirements:

  • I have written the necessary unit tests to verify that the logic is correct
  • I have used mocks when cross-module dependencies exist
  • Basic checks pass: mvn -B clean package -Dmaven.test.skip=true
  • Unit tests pass: mvn clean install

Documentation and Compatibility:

  • I have made corresponding changes to the documentation
  • If there are breaking changes, I have documented them in detail in the PR description
  • I have considered backward compatibility

📋 Additional Notes


Reviewer Notes:

@surpercodehang added this to the 3.5.x milestone Nov 5, 2025
@surpercodehang self-assigned this Nov 5, 2025
@surpercodehang added the "type: bug" and "in: fit" labels Nov 5, 2025
@CodeCasterX modified the milestones: 3.5.x, 3.5.5 Nov 5, 2025
@CodeCasterX merged commit 8186b7d into ModelEngine-Group:3.5.x Nov 5, 2025
1 check passed
@github-project-automation (bot) moved this to Done in Nova Nov 5, 2025
Labels

in: fit (Issues in FIT modules), type: bug (A general bug)

Projects

Status: Done

2 participants