Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Windows e2e logging #1351

Merged

Conversation

jsturtevant
Copy link
Contributor

@jsturtevant jsturtevant commented Apr 29, 2021

What type of PR is this?
/kind feature

What this PR does / why we need it:
We can't diagnose windows CI failures with out the logs from the workload clusters.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1286

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

Enable Windows workload log collection in e2e tests

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 29, 2021
@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Apr 29, 2021
@jsturtevant
Copy link
Contributor Author

jsturtevant commented Apr 29, 2021

This currently has the patches from #1350. I will re-base and remove once that is merged.

Most of the logs are collected now. There are three logs that are failing for some reason but wanted to run and see if we can capture the error that is occurring during Windows e2e.

Edit: ssh has default connection of 10 so it was dropping connections. I've addressed that issue

@jsturtevant jsturtevant force-pushed the windows-e2e-logging branch 2 times, most recently from 43949db to a7a961e Compare April 30, 2021 00:45
@jsturtevant
Copy link
Contributor Author

looks like the node didn't come online at all
/test pull-cluster-api-provider-azure-e2e-windows

@jsturtevant
Copy link
Contributor Author

It looks like node didn't come online properly. Kubeproxy and flannel were both not running and kubelet looks to have crashed as well. Some of the powershell output was truncated, Will update this to fix that and add a few more logs to collect

@jsturtevant
Copy link
Contributor Author

I also found that logs are not collected for machinepools: #1352

@jsturtevant jsturtevant force-pushed the windows-e2e-logging branch from acca60f to 274add5 Compare April 30, 2021 19:34
@CecileRobertMichon
Copy link
Contributor

Looks like this still has patches from #1350 ?

@jsturtevant
Copy link
Contributor Author

jsturtevant commented Apr 30, 2021

Looks like this still has patches from #1350 ?

yes, added them back for testing.

@jsturtevant
Copy link
Contributor Author

Running Windows job again to see if restart of kubelet works

/test pull-cluster-api-provider-azure-e2e-windows

@jsturtevant
Copy link
Contributor Author

jsturtevant commented May 3, 2021

try 2 passed, try 3:

/test pull-cluster-api-provider-azure-e2e-windows

@jsturtevant
Copy link
Contributor Author

try 3 passed, try 4:

/test pull-cluster-api-provider-azure-e2e-windows

@jsturtevant
Copy link
Contributor Author

try 4 failed with nodes failing to come online, logged: #1359

try 5:

/test pull-cluster-api-provider-azure-e2e-windows

@jsturtevant
Copy link
Contributor Author

try 5 passed, try 6:

/test pull-cluster-api-provider-azure-e2e-windows

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 4, 2021
@jsturtevant jsturtevant force-pushed the windows-e2e-logging branch from ab17555 to c9a8a75 Compare May 4, 2021 19:56
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 4, 2021
@jsturtevant
Copy link
Contributor Author

try 6 passed, I've re-based to remove the patches from #1350

@CecileRobertMichon
Copy link
Contributor

If this is actually fixing stuff (besides logging), let's add that in the pr description / title

@jsturtevant jsturtevant mentioned this pull request May 4, 2021
3 tasks
@jsturtevant jsturtevant force-pushed the windows-e2e-logging branch from c9a8a75 to aae71ff Compare May 5, 2021 02:24
@jsturtevant
Copy link
Contributor Author

If this is actually fixing stuff (besides logging), let's add that in the pr description / title

Split out the fix to #1366

@jsturtevant jsturtevant force-pushed the windows-e2e-logging branch from aae71ff to a4d8526 Compare May 5, 2021 21:45
@jsturtevant
Copy link
Contributor Author

/assign @marosset

@jsturtevant
Copy link
Contributor Author

Copy link
Member

@chewong chewong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nits, otherwise LGTM

test/e2e/azure_logcollector.go Outdated Show resolved Hide resolved
scripts/ci-e2e.sh Show resolved Hide resolved
test/e2e/azure_logcollector.go Outdated Show resolved Hide resolved
test/e2e/azure_logcollector.go Outdated Show resolved Hide resolved
test/e2e/azure_logcollector.go Outdated Show resolved Hide resolved
Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little fix up for logging.

Outside of that, it would be super cool to see the log collection functionality end up in https://github.com/Azure/azure-capi-cli-extension. It would be so useful to be able to ask a user to gather logs from their cluster for debugging.

test/e2e/azure_logcollector.go Outdated Show resolved Hide resolved
@jsturtevant jsturtevant force-pushed the windows-e2e-logging branch from a4d8526 to 0e732ef Compare May 13, 2021 23:58
@nader-ziada
Copy link
Contributor

/lgtm

over to @devigned

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 14, 2021
Copy link
Contributor

@devigned devigned left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: devigned

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 14, 2021
@k8s-ci-robot k8s-ci-robot merged commit 45c79db into kubernetes-sigs:master May 14, 2021
@k8s-ci-robot k8s-ci-robot added this to the v0.5.0 milestone May 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/azure Issues or PRs related to azure provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Log dumping script for Windows clusters
7 participants