feat: add support for directory walk #29
Conversation
Hey @sethvargo, really sorry, I dropped the ball here. Just made some changes according to your reviews. New commit includes:
PTAL when possible. Comments have been addressed. 🤞
* ability to traverse directories
* restrict to .yml and .yaml files only
* plumb it through lower in the stack
* summarize the total findings for the exit code
* add a test case

Signed-off-by: Furkan <furkan.turkal@trendyol.com>
Co-authored-by: Batuhan <batuhan.apaydin@trendyol.com>
Hi @Dentrax - the walker implementation looks correct, but what I was trying to explain in my previous comment is that this approach is still incorrect. By invoking Run for each file, we could generate hundreds or thousands of upstream requests. Imagine running this on a directory with 100 identical YAML files, each with 10 GitHub Actions references: that would generate 1,000 calls to the GitHub API for what are really only 10 unique references.
Furthermore, what if something changes between the time we resolve file_a and the time we resolve file_z? A reference could resolve differently if an upstream developer pushes a new version mid-run. This is pretty horrible, since now you have different checksums for the same reference in the same ratchet run.
Instead, I think we need to walk as you've done here, but instead of calling Run on each file, we need to parse all the files and de-duplicate the references. Then resolve all the unique references. Then walk again and rewrite each file. Does that make sense?
Oh, good point. I obviously missed that part: ratchet actually makes HTTP requests to the upstream.
I think you mean there would be a slight time window that could end up causing reference drift between local and upstream.
Could you please clarify, what does
I mean, if we're iterating over files and making upstream HTTP calls, it's possible that the same reference could resolve to a different result from one file to the next.
I think so, yeah. We would need to collect all the references, drop them into a map to de-duplicate them, then resolve the references, then write the resolutions back to each file. This might mean we need to walk the files twice, but that's probably fine.
Signed-off-by: Furkan <furkan.turkal@trendyol.com>
Hey @sethvargo, I have just made some changes according to your reviews and ideas. PTAL when possible. Thanks! In walk mode, I traverse each file and cache all the references. I had to introduce a new
Added the necessary unit test cases to cover the walking logic. Used a bitwise-like cache-hit counter to ensure correctness.
Kindly ping @sethvargo
This is a refactor to enable Ratchet to support multiple files in a single run. All files are parsed once (to limit upstream API calls) and then updated serially. Closes #29
Fixes #28
Signed-off-by: Furkan furkan.turkal@trendyol.com
Co-authored-by: Batuhan batuhan.apaydin@trendyol.com