Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Periscope to run on Windows nodes #167

Merged
merged 5 commits into from
Apr 14, 2022

Conversation

peterbom
Copy link
Contributor

@peterbom peterbom commented Apr 11, 2022

(Finally) addresses #10. Some notes:

  • The publishing pipelines now create two images, 'aks/periscope' and 'aks/periscope-win'.
  • This will require another MCR webhook (https://github.com/microsoft/mcr/pull/1833), which will need to be merged either:
    • before this PR; or
    • before the Build and Publish to MCR pipeline is next run.
  • A new DaemonSet is required, because the nodeSelector and volumes property are at the DaemonSet level, not container level.
  • There is more potential work to be done for Windows-specific diagnostics, see Add Windows Support #10, but this at least allows most collectors to run.
  • The main Periscope code change is to add a CheckSupported method to all collectors. The initial purpose of this was to check the OS type, but is intentionally generic to allow checking any contextual information (e.g. environment variables).
  • Anything that uses RunCommandOnHost cannot be run on Windows (Windows does not support privileged mode).
  • Code that used cat to get file contents also could not run on Windows.
  • Getting the host name (which is essential for Periscope functionality) involved both RunCommandOnHost and cat, and so required a completely different approach.
  • Anything else that used cat was replaced by host volume mapping and standard OS file reading (the volume mappings were already set up, so the cats were unnecessary anyway).
  • For all collectors that are not supported on Windows, the reasons why are commented in the CheckSupported method.
  • A lot of work went into making the code coverage checker happy and generally trying to make the code more testable.
    • Since ideally tests should run on both Linux and Windows, it is hard to test code that behaves conditionally based on runtime.GOOS.
    • This led to injecting a RuntimeInfo type into collectors, which would contain the OS identifier.
    • This led to putting all environment variables into RuntimeInfo, making it easier to see what variables are being used, and removing the need to set env vars in test classes.
    • It was hard work figuring out what file paths and volume mappings were used, and adding extra Windows paths made this problem worse, so a KnownFilePaths type was added for carrying these around. As well as making file system usage more obvious and explicit, this allows a single type to be injected into collectors for more flexible testing.
    • Also due to differences between OS, tests that relied on hard-coded filesystem paths (e.g. the node logs collector) became inconsistent (there was already a TODO to that effect).
    • This led to injecting a FileContentReader interface into collectors, which could be substituted for a fake in-memory file content map.
    • The FileContentReader implementation is now moved out of helper.go into its own file, with unit tests. Unfortunately this made the coverage checker include the utils package, dramatically lowering overall test coverage, so I had to add an override to exclude it again.
    • Some of the logic in the main aks-periscope.go file involved conditionally adding collectors depending on environment variables. This kind of checking was a good fit for the CheckSupported method of the collectors themselves, and was moved there instead. This means (a) the main file is simplified, and (b) this logic is now completely covered by unit tests.

IMG_20220413_162456

@peterbom peterbom requested a review from Tatsinnit April 11, 2022 23:57
@codecov-commenter
Copy link

codecov-commenter commented Apr 12, 2022

Codecov Report

Merging #167 (a34088c) into master (ce9162c) will increase coverage by 5.25%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #167      +/-   ##
==========================================
+ Coverage   69.25%   74.50%   +5.25%     
==========================================
  Files          11       11              
  Lines         348      404      +56     
==========================================
+ Hits          241      301      +60     
+ Misses         85       82       -3     
+ Partials       22       21       -1     
Impacted Files Coverage Δ
pkg/collector/dns_collector.go 100.00% <100.00%> (+11.76%) ⬆️
pkg/collector/helm_collector.go 82.35% <100.00%> (+1.91%) ⬆️
pkg/collector/iptables_collector.go 80.00% <100.00%> (+10.76%) ⬆️
pkg/collector/kubeletcmd_collector.go 80.00% <100.00%> (+10.76%) ⬆️
pkg/collector/kubeobjects_collector.go 81.25% <100.00%> (+1.25%) ⬆️
pkg/collector/networkoutbound_collector.go 96.00% <100.00%> (+0.16%) ⬆️
pkg/collector/nodelogs_collector.go 91.30% <100.00%> (+2.41%) ⬆️
pkg/collector/pdb_collector.go 60.52% <100.00%> (+5.98%) ⬆️
pkg/collector/pods_containerlogs_collector.go 82.35% <100.00%> (+1.10%) ⬆️
pkg/collector/systemlogs_collector.go 81.81% <100.00%> (+8.48%) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ce9162c...a34088c. Read the comment docs.

@Tatsinnit Tatsinnit added the enhancement 🏎 New feature or request label Apr 12, 2022
@peterbom peterbom linked an issue Apr 12, 2022 that may be closed by this pull request
@Tatsinnit
Copy link
Member

👍✅❤️🎉 Thank for this!! given an amount of thinking you have done around this Pete, I feel like you know more then what I can nit-pick and since size is decent, I can see that you have used client-go as baseline mechanism to implement underlying logic.

thank you so much for the diagram as well. I think you have plus one from my end for this sizeable PR. Just for sanity I might give another quick eyeballing tomorrow.

Thank you!

@Tatsinnit Tatsinnit merged commit 99a1944 into Azure:master Apr 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🏎 New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Windows Support
3 participants