Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: agent panic when node is terminated during step execution #3331

Merged
merged 5 commits into from
Feb 5, 2024

Conversation

fernandrone
Copy link
Contributor

@fernandrone fernandrone commented Feb 5, 2024

Fixes #3330

This adds error handling on the agent's WaitStep function, on two sections where it could encounter a panic: runtime error: invalid memory address or nil pointer dereference in case it could no longer access complete information about a specific pod.

This error was found to happen if the node in which the pod was running was terminated during the step's execution.
spite active pipelines being executed on the node.

Now instead of a panic on the agent's logs and undefined behavior on the UI it will display a more helpful error message on the UI.

Additional context

We observed the bug first on v2.1.1, but tested the fix internally on top of 2.3.0.

image

@6543 6543 enabled auto-merge (squash) February 5, 2024 21:45
@6543 6543 added bug Something isn't working backend/kubernetes labels Feb 5, 2024
@6543 6543 merged commit c7467b9 into woodpecker-ci:main Feb 5, 2024
6 checks passed
@woodpecker-bot woodpecker-bot mentioned this pull request Feb 5, 2024
1 task
anbraten added a commit that referenced this pull request Mar 19, 2024
## [2.4.0](https://github.com/woodpecker-ci/woodpecker/releases/tag/2.4.0) - 2024-03-19

### 🔒 Security

- Improve security context handling
[[#3482](#3482)]
- fix(deps): update module github.com/moby/moby to v24.0.9+incompatible
[[#3323](#3323)]

### ✨ Features

- Cli setup command
[[#3384](#3384)]
- Add bitbucket datacenter (server) support
[[#2503](#2503)]
- Cli updater
[[#3382](#3382)]

### 📚 Documentation

- Delete docs for v0.15.x
[[#3508](#3508)]
- Add deployment plugin
[[#3495](#3495)]
- Bump follow-redirects and fix broken anchors
[[#3488](#3488)]
- fix: plugin doc page not found
[[#3480](#3480)]
- Documentation improvements
[[#3376](#3376)]
- fix(deps): update docs npm deps non-major
[[#3455](#3455)]
- Add "Sonatype Nexus" plugin
[[#3446](#3446)]
- Add blog post
[[#3439](#3439)]
- Add "Gradle Wrapper Validation" plugin
[[#3435](#3435)]
- Add blog post
[[#3410](#3410)]
- Extend core ideas documentation
[[#3405](#3405)]
- docs: fix contributions link
[[#3363](#3363)]
- Update/fix some docs
[[#3359](#3359)]
- chore(deps): update dependency marked to v12
[[#3325](#3325)]

### 🐛 Bug Fixes

- Fix skip setup for some general cli commands
[[#3498](#3498)]
- Move generic agent flags to cmd/agent/core
[[#3484](#3484)]
- Fix usage of WOODPECKER_DATABASE_DATASOURCE_FILE
[[#3404](#3404)]
- Set pull-request id and labels on pr-closed event
[[#3442](#3442)]
- Update org name on login
[[#3409](#3409)]
- Do not alter secret key upper-/lowercase
[[#3375](#3375)]
- fix: can't run multiple services on k8s
[[#3395](#3395)]
- Fix agent polling
[[#3378](#3378)]
- Remove empty strings from slice before parsing agent config
[[#3387](#3387)]
- Set correct link for commit
[[#3368](#3368)]
- Fix schema links
[[#3369](#3369)]
- Fix correctly handle gitlab pr closed events
[[#3362](#3362)]
- fix: update schema event_enum to remove error warning when.event
[[#3357](#3357)]
- Fix version check on next
[[#3340](#3340)]
- Ignore gitlab merge request events without code changes
[[#3338](#3338)]
- Ignore gitlab push events without commits
[[#3339](#3339)]
- Consider gitlab inherited permissions
[[#3308](#3308)]
- fix: agent panic when node is terminated during step execution
[[#3331](#3331)]

### 📈 Enhancement

- Enable golangci linter gomnd
[[#3171](#3171)]
- Apply "grpcnotrace" go build tag
[[#3448](#3448)]
- Simplify store interfaces
[[#3437](#3437)]
- Deprecate alternative names on secrets
[[#3406](#3406)]
- Store workflows/steps for blocked pipeline
[[#2757](#2757)]
- Parse email from Gitea webhook
[[#3420](#3420)]
- Replace http types on forge interface
[[#3374](#3374)]
- Prevent agent deletion when it's still running tasks
[[#3377](#3377)]
- Refactor internal services
[[#915](#915)]
- Lint for event filter and deprecate `exclude`
[[#3222](#3222)]
- Allow editing all environment variables in pipeline popups
[[#3314](#3314)]
- Parse backend options in backend
[[#3227](#3227)]
- Make agent usable for external backends
[[#3270](#3270)]
- Add no branches text
[[#3312](#3312)]
- Add loading spinner to repo list
[[#3310](#3310)]

### Misc

- Post on mastodon when releasing a new version
[[#3509](#3509)]
- chore(deps): update dependency alpine_3_18/ca-certificates to
v20240226
[[#3501](#3501)]
- fix(deps): update module github.com/google/go-github/v59 to v60
[[#3493](#3493)]
- fix(deps): update dependency @intlify/unplugin-vue-i18n to v3
[[#3492](#3492)]
- chore(deps): update dependency vue-tsc to v2
[[#3491](#3491)]
- chore(deps): update dependency eslint-config-airbnb-typescript to v18
[[#3490](#3490)]
- chore(deps): update web npm deps non-major
[[#3489](#3489)]
- fix(deps): update golang (packages)
[[#3486](#3486)]
- fix(deps): update module google.golang.org/protobuf to v1.33.0
[security]
[[#3487](#3487)]
- chore(deps): update docker.io/techknowlogick/xgo docker tag to
go-1.22.1
[[#3476](#3476)]
- chore(deps): update docker.io/golang docker tag to v1.22.1
[[#3475](#3475)]
- Update prettier version
[[#3471](#3471)]
- chore(deps): update woodpeckerci/plugin-ready-release-go docker tag to
v1.1.0 [[#3464](#3464)]
- chore(deps): lock file maintenance
[[#3465](#3465)]
- chore(deps): update postgres docker tag to v16.2
[[#3461](#3461)]
- chore(deps): update lycheeverse/lychee docker tag to v0.14.3
[[#3429](#3429)]
- fix(deps): update golang (packages)
[[#3430](#3430)]
- More `when` filters
[[#3407](#3407)]
- Apply `documentation`/`ui` label to corresponding renovate updates
[[#3400](#3400)]
- chore(deps): update dependency eslint-plugin-simple-import-sort to v12
[[#3396](#3396)]
- chore(deps): update typescript-eslint monorepo to v7 (major)
[[#3397](#3397)]
- fix(deps): update module github.com/google/go-github/v58 to v59
[[#3398](#3398)]
- chore(deps): update docker.io/techknowlogick/xgo docker tag to
go-1.22.0
[[#3392](#3392)]
- chore(deps): update docker.io/golang docker tag
[[#3391](#3391)]
- fix(deps): update golang (packages)
[[#3393](#3393)]
- chore(deps): update docker.io/woodpeckerci/plugin-docker-buildx docker
tag to v3.1.0
[[#3394](#3394)]
- Add link checking
[[#3371](#3371)]
- Apply `dependencies` label to all PRs
[[#3358](#3358)]
- chore(deps): update docker.io/woodpeckerci/plugin-docker-buildx docker
tag to v3.0.1
[[#3324](#3324)]

---------

Co-authored-by: 6543 <m.huber@kithara.com>
Co-authored-by: Anbraten <6918444+anbraten@users.noreply.github.com>
@fernandrone fernandrone deleted the fix-agent-panic branch April 16, 2024 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend/kubernetes bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

(Kubernetes backend) terminated node causes runtime error when handling step exit code
4 participants