feat: add support for directory walk #29

Dentrax · 2022-06-03T20:19:10Z

Fixes #28

Signed-off-by: Furkan furkan.turkal@trendyol.com
Co-authored-by: Batuhan batuhan.apaydin@trendyol.com

Dentrax · 2022-06-03T20:20:45Z

Example output:

$ go run . check ./apko/.github/workflows

[PASS] ../apko/.github/workflows/add-issues.yml
[PASS] ../apko/.github/workflows/build.yaml
[PASS] ../apko/.github/workflows/codeql.yaml
[PASS] ../apko/.github/workflows/go-tests.yaml
[FAIL] ../apko/.github/workflows/mink-e2e.yaml : check failed: found 5 unpinned refs: ["docker/setup-qemu-action@v2.0.0" "chainguard-dev/actions/kind-diag@main" "chainguard-dev/actions/setup-kind@main" "chainguard-dev/actions/setup-mink@main" "imjasonh/setup-crane@v0.1"]
check failed: found 5 unpinned refs: ["docker/setup-qemu-action@v2.0.0" "chainguard-dev/actions/kind-diag@main" "chainguard-dev/actions/setup-kind@main" "chainguard-dev/actions/setup-mink@main" "imjasonh/setup-crane@v0.1"]
exit status 1

.gitignore

command/check.go

Dentrax · 2023-02-13T19:52:52Z

Hey @sethvargo, really sorry, I dropped the ball here. Just made some changes according to your reviews. New commit includes:

restrict to only y(a|ml) files
introduce a new func called doer to make it available in all the commands
summarize the total findings for calculating the last exit code
add test coverage

PTAL when possible. Comments have been addressed. 🤞

* ability to traverse dirs * restrict to .yml and .yaml files only * plumb in through lower in the stack * summarize the total findings for the exit code * add test case Signed-off-by: Furkan <furkan.turkal@trendyol.com> Co-authored-by: Batuhan <batuhan.apaydin@trendyol.com>

sethvargo

Hi @Dentrax - the walker implementation looks correct, but what I was trying to explain in my previous comment is that this is still incorrect. By invoking Run for each file, we could generate hundreds or thousands of upstream requests. Think if I ran this on a directory with 100 identical YAML files, each with 10 GitHub Actions references. That would generate 1000 unique calls to the GitHub API.

Furthermore, what if something changes between when we resolve file_a to file_z? It's possible that a reference could resolve differently if an upstream developer pushes a new version. This is pretty horrible, since now you have different checksums for the same reference on the same ratchet run.

Instead, I think we need to walk as you've done here, but instead of calling "Run", we need to parse all the files and de-duplicate references. Then resolve all references. Then Walk again and rewrite each time. Does that make sense?

Dentrax · 2023-02-14T06:26:35Z

we could generate hundreds or thousands of upstream requests.

Oh, good point. I obviously missed that part, ratchet actually does HTTP requests to upstream.

what if something changes between when we resolve file_a to file_z?

I think you mean there would be a slight time-window that would be ending up with reference drift between local vs upstream.

we need to parse all the files and de-duplicate references. Then resolve all references.

Could you please clarify, what does de-duplicate references mean in this context? What is my reference key? Is it refs := refsList.All() but for aggregated by traversing all files? By doing so, I would not send duplicated HTTP requests for already resolved files. Is that correct?

sethvargo · 2023-02-14T19:36:51Z

I think you mean there would be a slight time-window that would be ending up with reference drift between local vs upstream.

I mean, if we're iterating over files and making upstream HTTP calls, it's possible that the same reference (actions/checkout@v3) could resolve to different checksums as each file is parsed, if GitHub pushes a new package version while Ratchet is running. It's a real-life race condition.

Could you please clarify, what does de-duplicate references mean in this context? What is my reference key? Is it refs := refsList.All() but for aggregated by traversing all files? By doing so, I would not send duplicated HTTP requests for already resolved files. Is that correct?

I think so, yea. We would need all the references, then we would need to drop them in a map/dedup. Then resolve the references. Then write the resolutions back to each time. This might mean we need to walk the files twice, but that's probably fine.

Signed-off-by: Furkan <furkan.turkal@trendyol.com>

Dentrax · 2023-02-16T19:54:33Z

Hey @sethvargo, I have just made some changes according to your reviews and ideas. PTAL when possible. Thanks!

In walk mode, I traverse each file and cache the all references. pin and update commands resolves the all corresponding references from the cache.

I had to introduce a new FetchAndCacheReferences func to traverse and feed to cache, it seems duplicate to Pin function in the first place but couldn't find a much cleaner way.

Added necessary unit test cases to cover walking logic. Used bitwise-like cache hit counter to ensure the correctness.

Dentrax · 2023-03-25T02:34:41Z

Kindly ping @sethvargo

This is a refactor to enable Ratchet to support multiple files in a single run. All files are parsed once (to limit upstream API calls) and then updated in serial. Closes #29

Dentrax mentioned this pull request Jun 3, 2022

check: support directory scan #28

Closed

sethvargo requested changes Jun 4, 2022

View reviewed changes

.gitignore Outdated Show resolved Hide resolved

command/check.go Outdated Show resolved Hide resolved

Dentrax force-pushed the check-dir branch 2 times, most recently from 9683495 to dda8ce6 Compare February 13, 2023 19:48

Dentrax force-pushed the check-dir branch from dda8ce6 to 4b7ad14 Compare February 13, 2023 19:55

Dentrax force-pushed the check-dir branch from 4b7ad14 to f974c16 Compare February 13, 2023 19:59

sethvargo reviewed Feb 14, 2023

View reviewed changes

cacheable references for walk

0010134

Signed-off-by: Furkan <furkan.turkal@trendyol.com>

Dentrax changed the title ~~feat: check: support directory~~ feat: add support for directory walk Feb 16, 2023

sethvargo mentioned this pull request Feb 3, 2024

Support multiple files as inputs #64

Merged

sethvargo closed this in #64 Feb 3, 2024

sethvargo added a commit that referenced this pull request Feb 3, 2024

Support multiple files as inputs (#64)

135ea47

This is a refactor to enable Ratchet to support multiple files in a single run. All files are parsed once (to limit upstream API calls) and then updated in serial. Closes #29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add support for directory walk #29

feat: add support for directory walk #29

Dentrax commented Jun 3, 2022

Dentrax commented Jun 3, 2022 •

edited

Loading

Dentrax commented Feb 13, 2023

sethvargo left a comment

Dentrax commented Feb 14, 2023

sethvargo commented Feb 14, 2023

Dentrax commented Feb 16, 2023 •

edited

Loading

Dentrax commented Mar 25, 2023

feat: add support for directory walk #29

feat: add support for directory walk #29

Conversation

Dentrax commented Jun 3, 2022

Dentrax commented Jun 3, 2022 • edited Loading

Dentrax commented Feb 13, 2023

sethvargo left a comment

Choose a reason for hiding this comment

Dentrax commented Feb 14, 2023

sethvargo commented Feb 14, 2023

Dentrax commented Feb 16, 2023 • edited Loading

Dentrax commented Mar 25, 2023

Dentrax commented Jun 3, 2022 •

edited

Loading

Dentrax commented Feb 16, 2023 •

edited

Loading