Skip to content

Commit

Permalink
Make permissions for FileTaskHandler group-writeable and configurable (
Browse files Browse the repository at this point in the history
…#29506)

File Task Handler should apply different permissions to log files
generated by Airflow in order to handle impersonation. This change
exposes mechanism to bettet control the extend of permissions granted
depending on individual preferences of the users. Default permissions
are set to "group-writeable" allowing for impersonation use case, but
it can be more relaxed or more limited by configuration.
  • Loading branch information
potiuk authored Feb 13, 2023
1 parent 41fade2 commit 18347d3
Show file tree
Hide file tree
Showing 5 changed files with 80 additions and 16 deletions.
2 changes: 1 addition & 1 deletion Dockerfile.ci
Original file line number Diff line number Diff line change
Expand Up @@ -1135,7 +1135,7 @@ ARG AIRFLOW_CONSTRAINTS_LOCATION=""
ARG DEFAULT_CONSTRAINTS_BRANCH="constraints-main"
# By changing the epoch we can force reinstalling Airflow and pip all dependencies
# It can also be overwritten manually by setting the AIRFLOW_CI_BUILD_EPOCH environment variable.
ARG AIRFLOW_CI_BUILD_EPOCH="4"
ARG AIRFLOW_CI_BUILD_EPOCH="4"0
ARG AIRFLOW_PRE_CACHED_PIP_PACKAGES="true"
# By default in the image, we are installing all providers when installing from sources
ARG INSTALL_PROVIDERS_FROM_SOURCES="true"
Expand Down
30 changes: 30 additions & 0 deletions airflow/config_templates/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -806,6 +806,36 @@ logging:
type: string
example: path.to.my_func
default: ~
file_task_handler_new_folder_permissions:
description: |
Permissions in the form or of octal string as understood by chmod. The permissions are important
when you use impersonation, when logs are written by a different user than airflow. The most secure
way of configuring it in this case is to add both users to the same group and make it the default
group of both users. Group-writeable logs are default in airflow, but you might decide that you are
OK with having the logs other-writeable, in which case you should set it to `0o777`. You might
decide to add more security if you do not use impersonation and change it to `0o755` to make it
only owner-writeable. You can also make it just readable only for owner by changing it to `0o700` if
all the access (read/write) for your logs happens from the same user.
version_added: 2.6.0
type: string
example: "0o775"
default: "0o775"
file_task_handler_new_file_permissions:
description: |
Permissions in the form or of octal string as understood by chmod. The permissions are important
when you use impersonation, when logs are written by a different user than airflow. The most secure
way of configuring it in this case is to add both users to the same group and make it the default
group of both users. Group-writeable logs are default in airflow, but you might decide that you are
OK with having the logs other-writeable, in which case you should set it to `0o666`. You might
decide to add more security if you do not use impersonation and change it to `0o644` to make it
only owner-writeable. You can also make it just readable only for owner by changing it to `0o600` if
all the access (read/write) for your logs happens from the same user.
version_added: 2.6.0
type: string
example: "0o664"
default: "0o664"


metrics:
description: |
StatsD (https://github.com/etsy/statsd) integration settings.
Expand Down
22 changes: 22 additions & 0 deletions airflow/config_templates/default_airflow.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -445,6 +445,28 @@ trigger_log_server_port = 8794
# Example: interleave_timestamp_parser = path.to.my_func
# interleave_timestamp_parser =

# Permissions in the form or of octal string as understood by chmod. The permissions are important
# when you use impersonation, when logs are written by a different user than airflow. The most secure
# way of configuring it in this case is to add both users to the same group and make it the default
# group of both users. Group-writeable logs are default in airflow, but you might decide that you are
# OK with having the logs other-writeable, in which case you should set it to `0o777`. You might
# decide to add more security if you do not use impersonation and change it to `0o755` to make it
# only owner-writeable. You can also make it just readable only for owner by changing it to `0o700` if
# all the access (read/write) for your logs happens from the same user.
# Example: file_task_handler_new_folder_permissions = 0o775
file_task_handler_new_folder_permissions = 0o775

# Permissions in the form or of octal string as understood by chmod. The permissions are important
# when you use impersonation, when logs are written by a different user than airflow. The most secure
# way of configuring it in this case is to add both users to the same group and make it the default
# group of both users. Group-writeable logs are default in airflow, but you might decide that you are
# OK with having the logs other-writeable, in which case you should set it to `0o666`. You might
# decide to add more security if you do not use impersonation and change it to `0o644` to make it
# only owner-writeable. You can also make it just readable only for owner by changing it to `0o600` if
# all the access (read/write) for your logs happens from the same user.
# Example: file_task_handler_new_file_permissions = 0o664
file_task_handler_new_file_permissions = 0o664

[metrics]

# StatsD (https://github.com/etsy/statsd) integration settings.
Expand Down
36 changes: 21 additions & 15 deletions airflow/utils/log/file_task_handler.py
Original file line number Diff line number Diff line change
Expand Up @@ -413,7 +413,7 @@ def read(self, task_instance, try_number=None, metadata=None):

def _prepare_log_folder(self, directory: Path):
"""
Prepare the log folder and ensure its mode is 777.
Prepare the log folder and ensure its mode is as configured.
To handle log writing when tasks are impersonated, the log files need to
be writable by the user that runs the Airflow command and the user
Expand All @@ -429,24 +429,31 @@ def _prepare_log_folder(self, directory: Path):
via the UI (or vice versa) results in a permission error as the task
tries to write to a log file created by the other user.
Create the log file and give it group writable permissions
TODO(aoen): Make log dirs and logs globally readable for now since the SubDag
operator is not compatible with impersonation (e.g. if a Celery executor is used
for a SubDag operator and the SubDag operator has a different owner than the
parent DAG)
We leave it up to the user to manage their permissions by exposing configuration for both
new folders and new log files. Default is to make new log folders and files group-writeable
to handle most common impersonation use cases. The requirement in this case will be to make
sure that the same group is set as default group for both - impersonated user and main airflow
user.
"""
mode = 0o777
directory.mkdir(mode=mode, parents=True, exist_ok=True)
if directory.stat().st_mode != mode:
directory.chmod(mode)
new_folder_permissions = int(
conf.get("logging", "file_task_handler_new_folder_permissions", fallback="0o775"), 8
)
directory.mkdir(mode=new_folder_permissions, parents=True, exist_ok=True)
if directory.stat().st_mode != new_folder_permissions:
print(f"Changing dir permission to {new_folder_permissions}")
directory.chmod(new_folder_permissions)

def _init_file(self, ti):
"""
Create log directory and give it correct permissions.
Create log directory and give it permissions that are configured. See above _prepare_log_folder
method for more detailed explanation.
:param ti: task instance object
:return: relative log path of the given task instance
"""
new_file_permissions = int(
conf.get("logging", "file_task_handler_new_file_permissions", fallback="0o664"), 8
)
local_relative_path = self._render_filename(ti, ti.try_number)
full_path = os.path.join(self.local_base, local_relative_path)
if ti.is_trigger_log_context is True:
Expand All @@ -457,11 +464,10 @@ def _init_file(self, ti):

if not os.path.exists(full_path):
open(full_path, "a").close()
# TODO: Investigate using 444 instead of 666.
try:
os.chmod(full_path, 0o666)
except OSError:
logging.warning("OSError while change ownership of the log file")
os.chmod(full_path, new_file_permissions)
except OSError as e:
logging.warning("OSError while changing ownership of the log file. ", e)

return full_path

Expand Down
6 changes: 6 additions & 0 deletions newsfragments/29506.significant.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
Default permissions of file task handler log directories and files has been changed to "owner + group" writeable.

Default setting handles case where impersonation is needed and both users (airflow and the impersonated user)
have the same group set as main group. Previously the default was also other-writeable and the user might choose
to use the other-writeable setting if they wish by configuring ``file_task_handler_new_folder_permissions``
and ``file_task_handler_new_file_permissions`` in ``logging`` section.

0 comments on commit 18347d3

Please sign in to comment.