-
Notifications
You must be signed in to change notification settings - Fork 14.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only allow webserver to request from the worker log server #16754
Conversation
Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
All passed bar upload coverage -- merging. |
Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9)
) Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9) (cherry picked from commit d4a09f25a186223c104f7480d78c599b14481746)
) Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9) (cherry picked from commit d4a09f25a186223c104f7480d78c599b14481746)
) Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9) (cherry picked from commit d4a09f25a186223c104f7480d78c599b14481746)
) Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9) (cherry picked from commit d4a09f25a186223c104f7480d78c599b14481746)
) Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9) (cherry picked from commit d4a09f25a186223c104f7480d78c599b14481746)
) Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9) (cherry picked from commit d4a09f25a186223c104f7480d78c599b14481746)
Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9)
) Logs _shouldn't_ contain any sensitive info, but they often do by mistake. As an extra level of protection we shouldn't allow anything other than the webserver to access the logs. (We can't change the bind IP form 0.0.0.0 as for it to be useful it needs to be accessed from different hosts -- i.e. the webserver will almost always be on a different node) (cherry picked from commit 2285ee9)
@ashb this could have broken worker logs access from the web ui. A pod running webui cannot access logs from the worker in another pod. This can be reproduced if deployed in Kubernetes. Getting the following:
|
This push has introduced another bug of 403 client error when DAG logs are fetched from Airflow WebUI. This bug appears to impact when Airflow is setup in cluster mode. We have setup Airflow in cluster mode with Celery - One master and multiple worker nodes. [Not Celery kubernetes] After increasing the default fallback value to 120 seconds in below code snippet, this has fixed the 403 client error issue. def create_app(): Code: https://github.com/apache/airflow/blob/main/airflow/utils/serve_logs.py |
This is not the real propblem but your deployment issue. Your webserver and workers simply do not have time properly synchronized. Please make sure they do (for example with ntp). |
Initially I did think the same and made sure they are synced which did not help. They are having only 2 second delay between the nodes. |
I actually think it's not the change you applied mto the code but likely redeploying the code or restarting the machine fixed the issue. Almost 100% sure that you had time drift. The fact that you have 2 seconds now, means that something is wrong and probably the difference will grow. When you have NTP synchronised, both machines should have exactly the same time down to millisecond. You can check it be changing the value back to 30 - as long as the time drift is smaller it should still work. |
Airflow library is installed on a separate python virtual environment with required Airflow python libraries installed there.. we updated the serve_logs.py in site-packages directory of that virtual environment which is picked up by Airflow process to resolve the issue..Machine reboot was not performed. All the machines are synced with same RHEL NTP servers. This impacted multiple deployments of Airflow..The other deployment did not have any time lag/drift.. What is the significance of log_request_clock_grace parameter in the code? Can this be configured in airflow.cfg? |
So likely you had another issue - possibly not restarted airlfow that used different configuration. Yes. This parameter should change grace period and yes it should work if you restart airflow. so you should look in your process/deployment not in airflow. |
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Some users seem to be using different configs for different components. This doc makes it clear on what we recommend. Context: apache/airflow#16754 (comment) >I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs. GitOrigin-RevId: 3a263ab5b0b8b8b796d96f060e6ccd4a0e379bf3
After apache/airflow#16754 -- it is important that both Webserver and Worker have the same config value for `[webserver] secret_key` or else you will see the following error: ``` *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log For more information check: https://httpstatuses.com/403 ``` This happens because Airflow generates a random value for them if value isn't provided, which causes a random string generated on webserver and worker. Hence they don't match, resulting in the error. This PR creates a K8s Secret object and creates a key for that setting and pass it as Env Var similar to what we do with Fernet Key. GitOrigin-RevId: 7842de0ff124dd6a2696ec82bf6423455164df5b
Logs shouldn't contain any sensitive info, but they often do by
mistake. As an extra level of protection we shouldn't allow anything
other than the webserver to access the logs.
(We can't change the bind IP form 0.0.0.0 as for it to be useful it
needs to be accessed from different hosts -- i.e. the webserver will
almost always be on a different node)
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.