Closed
Description
What happened:
Using filecache with check_files=True
Discovered using the abfs filesystem and replicated with the local filesystem.
- First call: file downloads correctly to cache.
- File is updated in Azure
- Second call:
  - `CachingFileSystem._check_file` correctly identifies there is a new version with `detail["uid"] != self.fs.ukey(path)`
  - File is downloaded to cache folder
  - cache file is rewritten by `CachingFileSystem.save_cache`, however the uid is not updated
  - `WholeFileCacheFileSystem._open` calls `return self._open(path, mode)`, kicking off the loop again and repeatedly failing the check in step 3
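The loop described in the steps above can be illustrated with a small toy model. This is only a sketch and not fsspec's actual code: `ToyCache`, `update_uid_on_save`, `remote_uid`, and the `max_depth` guard are all hypothetical names made up for illustration. It shows why a retry that re-checks a uid which is never refreshed cannot terminate.

```python
# Toy model of the retry loop; NOT fsspec's implementation.
# All names here (ToyCache, update_uid_on_save, ...) are hypothetical.
class ToyCache:
    def __init__(self, update_uid_on_save):
        self.update_uid_on_save = update_uid_on_save
        self.remote_uid = "v1"   # stands in for self.fs.ukey(path)
        self.saved = {}          # persisted metadata: path -> {"uid": ...}
        self.downloads = 0

    def _check_file(self, path):
        # Cached copy is valid only if the stored uid matches the remote one.
        detail = self.saved.get(path)
        return detail is not None and detail["uid"] == self.remote_uid

    def _download_and_save(self, path):
        self.downloads += 1
        if path not in self.saved:
            self.saved[path] = {"uid": self.remote_uid}
        elif self.update_uid_on_save:
            self.saved[path]["uid"] = self.remote_uid
        # Otherwise the stale uid stays in the saved metadata.

    def open(self, path, depth=0, max_depth=5):
        # max_depth only keeps the demonstration bounded.
        if depth > max_depth:
            raise RecursionError("cache check never succeeds")
        if self._check_file(path):
            return f"cached copy of {path} at {self.remote_uid}"
        self._download_and_save(path)
        return self.open(path, depth + 1, max_depth)
```

With `update_uid_on_save=False`, the first `open` succeeds, but after `remote_uid` changes every retry downloads the file again and fails the same check. With `update_uid_on_save=True`, the second `open` succeeds after a single re-download.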
What you expected to happen:
The cache file update should also save the new uid.
Minimal Complete Verifiable Example:
The following prints only Hello, then blows the stack
import fsspec
tf = open("testfile.txt", "w")
tf.write('Hello\n')
tf.flush()
fs = fsspec.filesystem("filecache", target_protocol="file", check_files=True)
with fs.open("file://testfile.txt") as fsfile:
print(fsfile.read())
tf.write('World\n')
tf.flush()
with fs.open("file://testfile.txt") as fsfile:
print(fsfile.read())
tf.close()
Anything else we need to know?:
Adding
c["uid"] = cache[k]["uid"]
below
c["time"] = max(c["time"], cache[k]["time"])
in `CachingFileSystem.save_cache` appears to solve the problem. However, I do not know enough about fsspec to know whether it introduces other problems.
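To make the proposed change concrete, here is a minimal sketch of the merge step in isolation. It assumes that `c` is the entry read back from the cache file on disk and `cache[k]` is the in-memory entry for the same key; that reading is inferred from the two lines quoted above, not confirmed against the fsspec source. The helper `merge_entry` and its `copy_uid` flag are hypothetical.

```python
# Sketch of the merge step only; merge_entry is a hypothetical helper,
# not a function that exists in fsspec.
def merge_entry(on_disk, in_memory, copy_uid=True):
    """Merge one cache metadata entry, optionally carrying over the new uid."""
    merged = dict(on_disk)
    # Existing behaviour quoted in the issue: keep the most recent timestamp.
    merged["time"] = max(on_disk["time"], in_memory["time"])
    if copy_uid:
        # Proposed addition: take the uid recorded for the fresh download.
        merged["uid"] = in_memory["uid"]
    return merged


on_disk = {"time": 1.0, "uid": "old"}
in_memory = {"time": 2.0, "uid": "new"}

without_fix = merge_entry(on_disk, in_memory, copy_uid=False)
with_fix = merge_entry(on_disk, in_memory, copy_uid=True)
```

Without the extra assignment the merged entry keeps the stale uid even though its timestamp advances, which matches the repeated check failure described above. Whether always preferring the in-memory uid is safe when several processes share one cache directory is an open question this sketch does not answer.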
Environment:
- fsspec version: 2021.7.0
- Python version: 3.8.8
- Operating System: Windows 10
- Install method (conda, pip, source): conda