6 scheduling observer extraction #38
Conversation
…logic for observer in a ResourceWatcher class; added a method to stop a thread gracefully
…source watcher instance to enable_joblogs to subscribe to the event watcher if the log feature is configured; delete the event watcher logic from main; pass the container to the list-objects function instead of the container name; remove the start method from the log handler class; modify the joblogs init to subscribe to the event watcher
…nnect loop for the event watcher; make the number of reconnect attempts, the backoff time and a coefficient for exponential growth configurable via config; add backoff_time, reconnection_attempts and backoff_coefficient as attributes to the resource watcher init; add resource_version as a param to w.stream so a failed stream can resume from the last resource it was able to catch; handle urllib3.exceptions.ProtocolError and reconnect after an exponential backoff to avoid flooding the API; add config as a param to the resource watcher init; modify the config in kubernetes.yaml and the k8s config to add backoff_time, reconnection_attempts and backoff_coefficient
… connection to the k8s API was achieved, so only sequential failures are detected; add exception handling to watch_pods for failures in urllib3, for a resource version that is too old and no longer available, and for an ended stream; remove the k8s resource watcher initialization from the run function in api.py and move it to the k8s.py launcher as _init_resource_watcher; refactor the existing logic from joblogs/__init__.py into _init_resource_watcher and enable_joblogs in the k8s launcher
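Taken together, these commits describe a reconnect loop along the lines of the sketch below. Only `ResourceWatcher`, `watch_pods`, the subscribe hook and the `backoff_time`/`reconnection_attempts`/`backoff_coefficient` settings are taken from the commit messages; the exact signatures, defaults and error handling are illustrative, based on the standard `kubernetes` Python client, and may differ from the code in this PR.

```python
import logging
import time

import urllib3
from kubernetes import client, watch

logger = logging.getLogger(__name__)


class ResourceWatcher:
    """Watches pod events and notifies subscribers, reconnecting with exponential backoff."""

    def __init__(self, namespace, reconnection_attempts=5, backoff_time=5, backoff_coefficient=2):
        self.namespace = namespace
        self.reconnection_attempts = reconnection_attempts
        self.backoff_time = backoff_time
        self.backoff_coefficient = backoff_coefficient
        self.subscribers = []
        self._stopped = False

    def subscribe(self, callback):
        # e.g. the job-log handler registers itself here when log storage is configured
        self.subscribers.append(callback)

    def stop(self):
        # lets the owning thread leave the watch loop gracefully
        self._stopped = True

    def watch_pods(self):
        v1 = client.CoreV1Api()
        w = watch.Watch()
        resource_version = None
        attempts = self.reconnection_attempts
        backoff = self.backoff_time
        while not self._stopped and attempts > 0:
            try:
                kwargs = {'namespace': self.namespace}
                if resource_version:
                    # resume from the last event seen instead of replaying history
                    kwargs['resource_version'] = resource_version
                for event in w.stream(v1.list_namespaced_pod, **kwargs):
                    resource_version = event['object'].metadata.resource_version
                    for callback in self.subscribers:
                        callback(event)
                    # a delivered event means the connection recovered, so reset the
                    # failure budget; only sequential failures are counted
                    attempts = self.reconnection_attempts
                    backoff = self.backoff_time
                # stream ended normally; loop around and re-subscribe from resource_version
            except client.ApiException as e:
                if e.status == 410:
                    # our resource_version is too old; start again from a fresh list
                    resource_version = None
                    continue
                attempts -= 1
                logger.warning("Watch failed (%s), retrying in %ss", e, backoff)
                time.sleep(backoff)
                backoff *= self.backoff_coefficient
            except urllib3.exceptions.ProtocolError:
                attempts -= 1
                logger.warning("Watch connection lost, retrying in %ss", backoff)
                time.sleep(backoff)
                backoff *= self.backoff_coefficient
```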
Great! Did a first review pass. Some small (I think) questions.
Very nice, seems well handled.
Would you be able to explain a bit more how the resource_version check works? I'm not familiar with it. If this is "common k8s knowledge", then I need to RTFM, otherwise a comment block in the code would be useful.
kubernetes.yaml (outdated)

```
max_proc = 2
reconnection_attempts = 5
backoff_time = 5
backoff_coefficient = 2
```
Can we have sensible defaults, and not have to think about this yet?
Then we can add more detailed configuration documentation for this. I think ideally no one would need to touch these (unless you have very different network requirements); I'd like the defaults to work for most cases.
I think a section in the README would make sense here. Though it might be time to create a new file explaining the configuration options; that would give more room to cover them in detail.
There are defaults in the code, but they still need to be tested against our prod cluster to see whether they are enough; I suspect we may need to increase the number of attempts. I tried to make it resilient in the sense that every time the connection is successfully re-established, the attempt counter resets to the configured default, so we really only catch cases with several connection breaks in a row.
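The "defaults in the code" could be as simple as configparser fallbacks, so the sample files never need to mention these keys. A minimal sketch, assuming the settings live in the `[scrapyd]` section of `scrapyd_k8s.conf`; the actual config plumbing in scrapyd-k8s may differ:

```python
from configparser import ConfigParser

config = ConfigParser()
config.read('scrapyd_k8s.conf')

# fallback= supplies the in-code default whenever the key is absent from the config file
reconnection_attempts = config.getint('scrapyd', 'reconnection_attempts', fallback=5)
backoff_time = config.getfloat('scrapyd', 'backoff_time', fallback=5)
backoff_coefficient = config.getfloat('scrapyd', 'backoff_coefficient', fallback=2)
```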
Yes, I do agree that a section about the config file is needed. How do you see this new file, just CONFIG.md or something else as part of the repo?
Yes, CONFIG.md sounds fine 👍
Super, thanks! Pending:
- Remove the variables from the sample config files (defaults are meant to be ok, and are documented)
- Move config info from README.md to CONFIG.md
- Make sure CONFIG.md is linked to from README.md at a sensible place.
Thank you for the feedback, done!
scrapyd_k8s.sample-k8s.conf (outdated)

```
# Number of attempts to reconnect with the k8s API to watch events, default is 5
reconnection_attempts = 5

# Minimum time in seconds to wait before reconnecting to the k8s API to watch events, default is 5
backoff_time = 5

# Coefficient that is multiplied by backoff_time to provide exponential backoff and prevent the k8s API from being overwhelmed;
# default is 2, every reconnection attempt will take backoff_time*backoff_coefficient
backoff_coefficient = 2
```
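To make the comment above concrete: under one plausible reading of the exponential backoff (the wait is multiplied by backoff_coefficient after every failed attempt; the exact formula in the code may differ), the defaults give the following wait sequence:

```python
backoff_time, backoff_coefficient, reconnection_attempts = 5, 2, 5
waits = [backoff_time * backoff_coefficient ** i for i in range(reconnection_attempts)]
print(waits)  # [5, 10, 20, 40, 80] -> roughly 2.5 minutes of retrying before giving up
```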
(see previous comment)
p.s. thank you so much for separating this PR off #36 🙏
About resource versions: if you want to just look at the docs, it is described in the Kubernetes API documentation. Otherwise, I will explain a bit if that's ok. I hope my explanation is not too lame :D
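For readers who do not want to dig through the docs, a minimal illustration of the general pattern (standard `kubernetes` client usage, not code from this PR): every list and watch response carries a `metadata.resource_version` marking the cluster state it reflects, and starting a watch from that value delivers only events that happened after it, which is what lets a dropped stream resume without missing or replaying events.

```python
from kubernetes import client, config, watch

config.load_kube_config()  # in-cluster code would use config.load_incluster_config()
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(namespace='default')
start_from = pods.metadata.resource_version  # the cluster state this list reflects

w = watch.Watch()
for event in w.stream(v1.list_namespaced_pod, namespace='default',
                      resource_version=start_from, timeout_seconds=30):
    # only changes that happened after `start_from` arrive here; a 410 Gone from the
    # API means the version has been compacted away and the watcher must re-list
    print(event['type'], event['object'].metadata.name,
          event['object'].metadata.resource_version)
```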
…ed to re-establish the connection to the Kubernetes watcher
…k to the CONFIG.md in the README.md; remove the variables for reconnection_attempts, backoff_time and backoff_coefficient from the sample config since default values are provided in the code.
```
@@ -87,6 +87,8 @@ data:
  launcher = scrapyd_k8s.launcher.K8s

  namespace = default

  max_proc = 2
```
Why is max_proc here 2, and in scrapyd_k8s.sample-k8s.conf 10?
Thanks! I'll merge and start testing, then we can tie up any loose doc ends as the project progresses.
This PR was extracted from the 6-scheduling branch to encapsulate the changes to the resource watcher and log handler. A couple of changes were added to make the watcher more resilient, and the comment about removing the k8s resource watcher initialization from api.py was taken into account.
What's new?