-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Permission issue with Cosmos cache in some restricted environments #1008
Comments
To resolve the Here is the modified code: def _update_partial_parse_cache(latest_partial_parse_filepath: Path, cache_dir: Path) -> None:
"""
Update the cache to have the latest partial parse file contents.
:param latest_partial_parse_filepath: Path to the most up-to-date partial parse file
:param cache_dir: Path to the Cosmos project cache directory
"""
cache_path = get_partial_parse_path(cache_dir)
manifest_path = get_partial_parse_path(cache_dir).parent / DBT_MANIFEST_FILE_NAME
latest_manifest_filepath = latest_partial_parse_filepath.parent / DBT_MANIFEST_FILE_NAME
shutil.copyfile(str(latest_partial_parse_filepath), str(cache_path))
shutil.copyfile(str(latest_manifest_filepath), str(manifest_path))
def _copy_partial_parse_to_project(partial_parse_filepath: Path, project_path: Path) -> None:
"""
Update target dbt project directory to have the latest partial parse file contents.
:param partial_parse_filepath: Path to the most up-to-date partial parse file
:param project_path: Path to the target dbt project directory
"""
target_partial_parse_file = get_partial_parse_path(project_path)
tmp_target_dir = project_path / DBT_TARGET_DIR_NAME
tmp_target_dir.mkdir(exist_ok=True)
source_manifest_filepath = partial_parse_filepath.parent / DBT_MANIFEST_FILE_NAME
target_manifest_filepath = target_partial_parse_file.parent / DBT_MANIFEST_FILE_NAME
shutil.copyfile(str(partial_parse_filepath), str(target_partial_parse_file))
patch_partial_parse_content(target_partial_parse_file, project_path)
if source_manifest_filepath.exists():
shutil.copyfile(str(source_manifest_filepath), str(target_manifest_filepath)) By using
|
I believe we are using |
Sorry for the late reply @pankajastro! If that is the intention, then
Looking at the source code of I am using the |
While that might be true, I can observe that we are comparing time in this context astronomer-cosmos/cosmos/cache.py Line 87 in 8e69a36
shutil.copystat appears more promising for preserving this metadata, but I'm curious whether it resolves your permission issue, given that it also copies permission bits? |
I ran some tests today, and you're correct that I run into a similar permission issue. Actually, even modifying the the file modification timestamp is not permitted.
I don't see how that could happen unless there is a clock mismatch between the process generating the manifest and Cosmos' caching process. That seems highly unlikely! In addition, I also tested on a "normal filesystem" and can confirm that >>> from pathlib import Path
>>> import shutil
>>>
>>> source = Path('source.txt')
>>> source.write_text("Hello, World!")
>>> dest_dir = Path("/dest")
>>> source.stat()
os.stat_result(st_mode=33188, st_ino=5667023, st_dev=2096, st_nlink=1, st_uid=1000, st_gid=1000, st_size=13, st_atime=1718698252, st_mtime=1718698198, st_ctime=1718698198)
>>> # Test 1: shutil.copy
>>> dest1 = dest_dir / "test1.txt"
>>> shutil.copy(source, dest1)
>>> dest1.stat().st_mtime
os.stat_result(st_mode=33188, st_ino=5667073, st_dev=2096, st_nlink=1, st_uid=1000, st_gid=1000, st_size=13, st_atime=1718698252, st_mtime=1718698252, st_ctime=1718698252)
>>> # Test 2: shutil.copy2
>>> dest2 = dest_dir / "test2.txt"
>>> shutil.copy2(source, dest2)
>>> dest2.stat().st_mtime
os.stat_result(st_mode=33188, st_ino=5667040, st_dev=2096, st_nlink=1, st_uid=1000, st_gid=1000, st_size=13, st_atime=1718698252, st_mtime=1718698198, st_ctime=1718698278)
>>> # Test 3: shutil.copyfile
>>> dest3 = dest_dir / "test3.txt"
>>> shutil.copyfile(source, dest3)
>>> dest3.stat()
os.stat_result(st_mode=33188, st_ino=5667062, st_dev=2096, st_nlink=1, st_uid=1000, st_gid=1000, st_size=13, st_atime=1718699621, st_mtime=1718699625, st_ctime=1718699625) The only effect of using |
## Description ~shutil.copy includes permission copying via chmod. If the user lacks permission to run chmod, a PermissionError occurs. To avoid this, we split the operation into two steps: first, copy the file contents; then, copy metadata if feasible without raising exceptions. Step 1: Copy file contents (no metadata) Step 2: Copy file metadata (permission bits and other metadata) without raising exception~ use shutil.copyfile(...) instead of shutil.copy(...) to avoid running chmod ## Related Issue(s) closes: astronomer#1008 ## Breaking Change? No ## Checklist - [ ] I have made corresponding changes to the documentation (if required) - [ ] I have added tests that prove my fix is effective or that my feature works
Cosmos caching mecanism can trigger permission errors when the user does not have permission to chmod files in the cache location.
This can happen, for example, when using an Azure Files volume with AKS/Azure container instances/Azure container apps. A known (unfortunate) limitation of this system is that you can't change a file permissions, permissions are defined at mount time: https://stackoverflow.com/questions/58301985/permissions-on-azure-file.
I get the following error when trying to use Cosmos cache with such a volume:
The problem is a consequence of using
shutil.copy
which, according to the documentation "copies the file data and the file’s permission mode", and therefore fails in this context:astronomer-cosmos/cosmos/cache.py
Lines 110 to 111 in bfca374
Using
shutil.copyfile
should solve this issue.The text was updated successfully, but these errors were encountered: