Fix Docker build failure and add release package validation testing #257
…ldFlinkRunner=false Co-authored-by: devstress <30769729+devstress@users.noreply.github.com>
@copilot look at unit-tests.yaml to see how we configure the GitHub workflow: build JobGateway first (this automatically builds FlinkIRRunner), then copy the entire package into the Docker image. The Docker image should have both dotnet and Java installed. Please test locally first to make sure the Docker image works as expected.
Co-authored-by: devstress <30769729+devstress@users.noreply.github.com>
Fixed in commit ab89b57. The Dockerfile now:
Tested locally: the Docker build succeeds, the JAR is built and included, and the application starts correctly. Image size is 860MB (includes .NET runtime + Java 17 + Maven + JAR).
@copilot we need a step in the publish workflows that validates the release packages work. Create a ReleasePackagesTesting folder that copies LocalTesting but references the produced NuGet package file and Docker image instead of project references. It should use the same tests as LocalTesting. This will verify that the release packages work before publishing to NuGet and Docker.
I'll create the ReleasePackagesTesting folder structure. This will:
The testing workflow will:
This validates that the release artifacts work correctly before publishing to NuGet.org and Docker Hub. I'll implement this now.
@copilot we should have a step at the end in which ReleasePackagesTesting uses the latest Docker image and NuGet release package, to prove our latest packages work together.
- Created ReleasePackagesTesting solution with AppHost and IntegrationTests
- AppHost uses Docker image (flinkdotnet/jobgateway) instead of project reference
- IntegrationTests use FlinkDotnet NuGet package instead of project references
- Copied test infrastructure from LocalTesting (connectors, Maven projects, configs)
- Added test-release-packages.ps1 script for workflow integration
- Tests validate Docker image and NuGet packages work before publishing

Co-authored-by: devstress <30769729+devstress@users.noreply.github.com>
- Created validate-latest-release.ps1 script to test published packages
- Tests FlinkDotnet from NuGet.org + flinkdotnet/jobgateway from Docker Hub
- Added workflow validate-latest-release.yml for automated daily validation
- Workflow creates issue if validation fails (catches compatibility issues)
- Updated README with pre-release and post-release validation usage

Co-authored-by: devstress <30769729+devstress@users.noreply.github.com>
Added in commit fbc5100. Created post-release validation:
validate-latest-release.ps1:
validate-latest-release.yml workflow:
Usage:
# Test latest published packages
./ReleasePackagesTesting/validate-latest-release.ps1
# Test specific version
./ReleasePackagesTesting/validate-latest-release.ps1 -DockerTag "1.0.0"
@copilot remove validate-latest-release.ps1 and validate-latest-release.yml. This should be the last step of the release workflow, not a separate task. Maybe create another clone of the ReleasePackagesTesting folder called ReleasePackagesTesting.Published which uses the public NuGet and Docker repos.
…lidation
- Removed validate-latest-release.ps1 and validate-latest-release.yml (separate scheduled task)
- Created ReleasePackagesTesting.Published folder for final release workflow step
- Uses published packages from NuGet.org and Docker Hub (not local artifacts)
- Created test-published-packages.ps1 to validate published packages work together
- Updated READMEs to clarify pre-release vs post-release validation
- ReleasePackagesTesting: Pre-release validation (local artifacts before publishing)
- ReleasePackagesTesting.Published: Post-release validation (published packages after publishing)

Co-authored-by: devstress <30769729+devstress@users.noreply.github.com>
Done in commit ece2ee9. Changes made:
Removed:
Created:
Structure:
Usage in release workflow:
# After publishing to NuGet.org and Docker Hub
cd ReleasePackagesTesting.Published
./test-published-packages.ps1 -DockerTag "1.0.0"
This validates that the published packages work together as the final step of the release workflow.
@copilot remove all ps1 scripts in this PR. You must use Microsoft Aspire and Aspire's integration tests, like LocalTesting, to test both pre-release packages and published packages.
- Removed test-release-packages.ps1 and test-published-packages.ps1
- Updated READMEs to use Microsoft Aspire integration testing framework
- Pre-release: Run `dotnet test` in ReleasePackagesTesting folder
- Post-release: Run `dotnet test` in ReleasePackagesTesting.Published folder
- Same Aspire-based testing approach as LocalTesting (no custom scripts)
- Workflow uses native Aspire testing infrastructure

Co-authored-by: devstress <30769729+devstress@users.noreply.github.com>
Done in commit c1b3c3c. Removed all PowerShell scripts:
Removed:
Now using Microsoft Aspire integration tests (same as LocalTesting):
Pre-release validation:
cd ReleasePackagesTesting
dotnet test --configuration Release
Post-release validation:
cd ReleasePackagesTesting.Published
dotnet test --configuration Release
Both folders use Aspire's native testing infrastructure with the same integration tests as LocalTesting. No custom scripts needed.
Problem
All release workflows (major, minor, and patch) were failing during Docker image creation with the following error:
This prevented the creation and publication of the JobGateway Docker image, blocking all releases.
Root Cause
The `FlinkDotNet/FlinkDotNet.JobGateway/Dockerfile` contained a reference to a non-existent project, `Flink.JobRunner`. This project doesn't exist in the repository; the actual dependency is `Flink.JobBuilder`.

Additionally, the Docker build environment lacked Java and Maven, which are required to build the FlinkIRRunner JAR component that JobGateway needs for job submission to Flink clusters.
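A stale project reference like this can be caught before `docker build` runs. The following is a minimal sketch, not part of this PR: the function name `check_dockerfile_projects` is hypothetical, and it assumes the Dockerfile uses JSON-array `COPY ["src", "dest"]` lines whose first element is a `.csproj` path relative to the build context.

```shell
# Sketch (hypothetical helper): list .csproj sources named in COPY lines
# that do not exist under the build context.
# $1: path to the Dockerfile   $2: build context directory
check_dockerfile_projects() {
  cdp_missing=$(
    # Extract the first quoted element of each JSON-array COPY line,
    # but only when that element ends in .csproj.
    sed -nE 's/^[[:space:]]*COPY[[:space:]]*\["([^"]+\.csproj)".*/\1/p' "$1" |
      while IFS= read -r cdp_path; do
        # Print the path only if the file is absent from the context.
        [ -f "$2/$cdp_path" ] || printf '%s\n' "$cdp_path"
      done
  )
  [ -z "$cdp_missing" ] && return 0
  printf '%s\n' "$cdp_missing"
  return 1
}
```

Assuming the repository root is the build context, it could be called as `check_dockerfile_projects FlinkDotNet/FlinkDotNet.JobGateway/Dockerfile .`; it prints each missing project path and returns a non-zero status if any are missing. `COPY` lines with flags such as `--from` or the shell form are not matched by this sketch.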
Solution
Part 1: Docker Build Fix
1. Removed non-existent project reference:
   - Removed the `COPY ["FlinkDotNet/Flink.JobRunner/Flink.JobRunner.csproj", ...]` line that referenced the non-existent project
   - `Flink.JobBuilder` is the actual dependency
2. Installed Java 17 and Maven in Docker build stage:
3. Installed Java 17 and Maven in Docker runtime stage:
4. Pre-built JAR inclusion:
Part 2: Pre-Release Package Validation
Created ReleasePackagesTesting folder to validate release artifacts before publishing to NuGet.org and Docker Hub:
ReleasePackagesTesting.FlinkSqlAppHost:
ReleasePackagesTesting.IntegrationTests:
Uses the FlinkDotnet NuGet package from build artifacts instead of project references
Testing approach:
`dotnet test` command with Aspire integration tests
Part 3: Post-Release Package Validation
Created ReleasePackagesTesting.Published folder to validate published packages after publishing as the final step of the release workflow:
ReleasePackagesTesting.Published.FlinkSqlAppHost:
Uses flinkdotnet/jobgateway:VERSION from Docker Hub (not local image)
ReleasePackagesTesting.Published.IntegrationTests:
Uses FlinkDotnet from NuGet.org (not local packages)
Testing approach:
`dotnet test` command with Aspire integration tests
Changes Made
Dockerfile Changes
Pre-Release Testing Integration
Suggested workflow integration using Microsoft Aspire integration tests:
Post-Release Validation Integration
Add as the final step after publishing using Microsoft Aspire integration tests:
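The ordering described in this PR (pre-release validation, then publishing, then post-release validation) can be sketched as a shell step. This is an illustration under assumptions, not the workflow from the PR: the function names are hypothetical, the publish command is a placeholder passed in by the caller, and the `DOTNET` variable exists only so the sketch can be dry-run with a stub.

```shell
# Sketch (hypothetical helpers): gate publishing on the Aspire integration tests.

# Run the integration tests in the given folder.
# $1: folder containing the test solution
validate_packages() {
  ( cd "$1" && "${DOTNET:-dotnet}" test --configuration Release )
}

# $@: the publish command (placeholder; supplied by the release workflow)
release_with_validation() {
  # Pre-release: test local artifacts; stop before publishing on failure.
  validate_packages ReleasePackagesTesting || return 1
  # Publish to NuGet.org and Docker Hub.
  "$@" || return 1
  # Post-release: test the published packages as the final step.
  validate_packages ReleasePackagesTesting.Published
}
```

Run from the repository root, a call such as `release_with_validation ./publish.sh` (where `publish.sh` is a stand-in name) would skip publishing whenever the pre-release tests fail.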
Testing
Docker Build Testing
✅ Docker build completes successfully (860MB image)
✅ Maven builds flink-ir-runner-java17.jar (32MB) during build
✅ Java 17 verified in runtime: openjdk version "17.0.16"
✅ Maven 3.8.7 available in runtime
✅ JAR present in image: /app/flink-ir-runner-java17.jar
✅ FlinkIRRunner source available for on-demand builds
✅ Application starts successfully
✅ .NET solution builds correctly with no new errors
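Several of the checks above can be repeated against a built image with a short smoke test. The sketch below is an assumption-laden illustration rather than the procedure used in this PR: the function names are hypothetical, it assumes the image contains `sh`, and the `DOCKER` variable is only there to allow another container CLI or a stub.

```shell
# Sketch (hypothetical helpers): smoke-test a built JobGateway image.

# Commands run inside the container; each one must succeed.
image_smoke_script() {
  cat <<'EOF'
dotnet --list-runtimes
java -version
mvn -version
test -f /app/flink-ir-runner-java17.jar
EOF
}

# $1: image reference to check
smoke_test_image() {
  # Override the entrypoint so the checks run instead of the application;
  # -e makes the inner shell stop at the first failing command.
  "${DOCKER:-docker}" run --rm --entrypoint sh "$1" -ec "$(image_smoke_script)"
}
```

For example, `smoke_test_image flinkdotnet/jobgateway:1.0.0` would return a non-zero status if the .NET runtime, Java, Maven, or the JAR at `/app/flink-ir-runner-java17.jar` is missing.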
Pre-Release Package Testing
✅ ReleasePackagesTesting solution structure created
✅ Uses local Docker image from build artifacts
✅ Uses local NuGet packages from build artifacts
✅ Uses Microsoft Aspire integration testing framework (same as LocalTesting)
✅ Native `dotnet test` command for validation
Post-Release Package Testing
✅ ReleasePackagesTesting.Published solution structure created
✅ Uses Docker image from Docker Hub (published)
✅ Uses NuGet packages from NuGet.org (published)
✅ Uses Microsoft Aspire integration testing framework (same as LocalTesting)
✅ Native `dotnet test` command for validation
Impact
Image Contents