Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(support-bundle): add runHostCollectorsInPod in spec #1608

Merged
merged 20 commits into from
Sep 20, 2024

Conversation

DexterYan
Copy link
Member

@DexterYan DexterYan commented Sep 9, 2024

Description, Motivation and Context

  • Add runHostCollectorsInPod for SupportBundle yaml
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sb
  runHostCollectorsInPod: true  # default is false 
spec:
  hostCollectors:
  - hostOS: {}
  • Re-use current CollectRemote function to host collect
type HostCollector interface {
	Title() string
	IsExcluded() (bool, error)
	Collect(progressChan chan<- interface{}) (map[string][]byte, error)
	RemoteCollect(progressChan chan<- interface{}) (map[string][]byte, error)
	HasRemoted() bool
}
  • Copy CollectRemote into common go for avoiding import cycle
  • Support multiple node HostOS collector result but save only one for now
  • Enable HostOS collector for running a privileged pod

sc-111270

Test Yaml

apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: sb
  runHostCollectorsInPod: true  # default is false 
spec:
  hostCollectors:
  - hostOS: {}
  hostAnalyzers:
  - hostOS: {}

Result
support-bundle-2024-09-12T16_02_26.tar.gz

Checklist

  • New and existing tests pass locally with introduced changes.
  • Tests for the changes have been added (for bug fixes / features)
  • The commit message(s) are informative and highlight any breaking changes
  • Any documentation required has been added/updated. For changes to https://troubleshoot.sh/ create a PR here

Does this PR introduce a breaking change?

  • Yes
  • No

This PR is backward compatible.

@DexterYan DexterYan requested a review from a team as a code owner September 9, 2024 06:12
@DexterYan DexterYan marked this pull request as draft September 9, 2024 06:13
@DexterYan DexterYan added the type::feature New feature or request label Sep 9, 2024
@DexterYan DexterYan changed the title WIP feat(support-bundle): add runHostCollectorsInPod in spec feat(support-bundle): add runHostCollectorsInPod in spec Sep 12, 2024
@DexterYan DexterYan marked this pull request as ready for review September 12, 2024 06:13
@@ -8,6 +8,8 @@ type HostCollector interface {
Title() string
IsExcluded() (bool, error)
Collect(progressChan chan<- interface{}) (map[string][]byte, error)
RemoteCollect(progressChan chan<- interface{}) (map[string][]byte, error)
HasRemoted() bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for? The naming has me a bit confused.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think i see now. I wonder if this could be renamed to something that would make it clearer what this is for.

@@ -183,6 +183,7 @@ func runTroubleshoot(v *viper.Viper, args []string) error {
OutputPath: v.GetString("output"),
Redact: v.GetBool("redact"),
FromCLI: true,
RunHostCollectorsInPod: false || mainBundle.Metadata.RunHostCollectorsInPod,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be set to mainBundle.Metadata.RunHostCollectorsInPod?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, it has been set to mainBundle.Metadata.RunHostCollectorsInPod

for _, sb := range kinds.SupportBundlesV1Beta2 {
sb := sb
mainBundle = supportbundle.ConcatSpec(mainBundle, &sb)
enableRunHostCollectorsInPod = enableRunHostCollectorsInPod || sb.Metadata.RunHostCollectorsInPod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this cause a panic if sb.Metadata was nil for some reason? Should we check that it isn't nil here and then just set enableRunHostCollecorsInPod := sb.Metadata.RunHostCollectorsInPod?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we should check that. I have added the checking part

Copy link
Member

@diamonwiggins diamonwiggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments. Also a couple questions:

  1. Should we refactor things in this PR so that we move the remote collector functions to internal and then have both Collectors and RemoteCollectors use those? That way we're in a better place to deprecate RemoteCollectors soon? Or would you rather just do that in a follow up?

  2. I don't see any of the code for actually making the pod that's used as the runner privileged. Is that going to be in a follow up as well?

@DexterYan DexterYan force-pushed the dx/sc-111270/host-os-collector branch from dfd63fa to 55e6760 Compare September 17, 2024 19:23
@DexterYan
Copy link
Member Author

The runner privileged has been added into the current code. For moving things to internal will be another PR

@@ -8,6 +8,8 @@ type HostCollector interface {
Title() string
IsExcluded() (bool, error)
Collect(progressChan chan<- interface{}) (map[string][]byte, error)
RemoteCollect(progressChan chan<- interface{}) (map[string][]byte, error) // RemoteCollect is used to priviledge pods to collect data from different nodes
HasRemoted() bool // HasRemoted returns true if the collector has a remote collector implementation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if HasRemoteCollector would be more descriptive here. Just nitpicking mostly, will leave it up to you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this function is not easy to read. I have removed it. And use

if err == collect.ErrRemoteCollectorNotImplemented {

to check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to use

if errors.Is(err, collect.ErrRemoteCollectorNotImplemented) {

diamonwiggins
diamonwiggins previously approved these changes Sep 18, 2024
Copy link
Member

@diamonwiggins diamonwiggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment, but otherwise lgtm

@@ -18,6 +19,9 @@ type HostOSInfo struct {

const HostOSInfoPath = `host-collectors/system/hostos_info.json`

type NodeInfo struct {
HostOSInfo HostOSInfo `json:"host-collectors/system/hostos_info.json"`
Copy link
Member

@banjoh banjoh Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is json:"host-collectors/system/hostos_info.json" used? Its an interesting use of json tags where the tag is a path. I've not seen before

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, it has been removed.

Copy link
Member

@banjoh banjoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks good. I requested a few changes

In a future PR we want to start using "replicated/troubleshoot:<version>" instead of "replicated/troubleshoot:latest" so as to ensure the same version of the image is used as by the binary calling it. There will be an issue with tests passing since the image will most likely not be available.

return nil, errors.Wrap(err, "failed to convert kube flags to rest config")
}

progressCh := make(chan interface{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using progressChan channel for sending progress feedback, instead of creating a new one? The new one is also not doing anything.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, that has been fixed

return nil, errors.New("not implemented")
}

func (c *CollectHostCertificate) HasRemoted() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the only purpose of HasRemote() is to check that a collector has implemented the RemoteCollect() function, I'd leave the function out in favour or checking the not implemented error. It would be good to keep public functions at a minimum

You could do something like

var ErrNotImplemented = errors.New("not implemented")
...
result, err := collector.RemoteCollect(opts.ProgressChan)
if errors.Is(err, collect.ErrNotImplemented) {
	result, err = collector.Collect(opts.ProgressChan)
	...
}

if err != nil {
...
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, I have changed to

if errors.Is(err, collect.ErrRemoteCollectorNotImplemented) {

@@ -20,7 +20,7 @@ type HostOSInfo struct {
const HostOSInfoPath = `host-collectors/system/hostos_info.json`

type NodeInfo struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see where this is used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

Copy link
Member

@banjoh banjoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add an e2e test for this use case?

@DexterYan
Copy link
Member Author

e2e test added

banjoh
banjoh previously approved these changes Sep 20, 2024
supportbundleName := "host-os-remote-collector"
tarPath := fmt.Sprintf("%s.tar.gz", supportbundleName)
targetFolder := fmt.Sprintf("%s/host-collectors/system/", supportbundleName)
cmd := exec.CommandContext(context.Background(), sbBinary(), "spec/hostOSRemoteCollector.yaml", "--interactive=false", fmt.Sprintf("-o=%s", supportbundleName))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relace context.Background() with ctx parameter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

banjoh
banjoh previously approved these changes Sep 20, 2024
@DexterYan DexterYan merged commit e97b961 into main Sep 20, 2024
27 checks passed
@DexterYan DexterYan deleted the dx/sc-111270/host-os-collector branch September 20, 2024 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type::feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants