CDK: allow repeated cache file removals & cleanup cache files in test #19533
@@ -1,4 +1,6 @@
# Changelog
## 0.9.5
0.9.5 since there is another PR claiming 0.9.4 already
@@ -68,11 +68,9 @@ def clear_cache(self):
        """
        remove cache file only once
        """
        STREAM_CACHE_FILES = globals().setdefault("STREAM_CACHE_FILES", set())
@grubberr tagging you since you show up in the git blame: why did this block only allow a single removal of the cache file? The current change passes unit tests. Is removing the restriction an issue?
@sherifnada oops, I answered below
        STREAM_CACHE_FILES.add(self.cache_filename)
        with suppress(FileNotFoundError):
            os.remove(self.cache_filename)
            print(f"Removed {self.cache_filename}")
This can be dangerous, for example:
- We create the Teams stream; on start it removes the cache file and creates it again (file-inode-1).
- We create TeamMembers(parent=Teams); on start it again removes the cache file and creates it again (file-inode-2).
Teams assumes it still has access to file-inode-1, but in reality the file is already file-inode-2.
I definitely remember there were some side effects when we removed the file more than once.
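The inode scenario above can be reproduced without any stream classes. The following is a minimal sketch, assuming a POSIX filesystem (removing an open file fails on Windows); `demo_stale_handle` and the plain file handles standing in for the two streams are hypothetical and are not part of the CDK.

```python
import os
import tempfile


def demo_stale_handle():
    """Show that a handle opened before a remove/recreate points at the old inode."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache.sqlite")

        # First owner (think: Teams) creates the file and keeps it open.
        first = open(path, "w")
        inode_1 = os.fstat(first.fileno()).st_ino

        # Second owner (think: TeamMembers) removes the file and recreates it.
        os.remove(path)
        second = open(path, "w")
        inode_2 = os.fstat(second.fileno()).st_ino

        # The first owner's write goes to the unlinked inode, not to the new file.
        first.write("written by first owner")
        first.flush()

        second.close()
        first.close()

        # The file now at `path` is the second one and never saw the write.
        size_on_disk = os.path.getsize(path)
        return inode_1, inode_2, size_on_disk
```

Because the first handle stays open, the old inode cannot be reused, so the two inode numbers differ and the data written through the first handle never reaches the file that is visible at the path.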
We started to get sqlite3 runtime exceptions
@sherifnada let me cover those runtime side effects with unit tests today
@sherifnada an example of the problem from source-github:
  File "/home/user/airbyte/airbyte-integrations/connectors/source-github/.venv/lib/python3.9/site-packages/requests_cache/backends/base.py", line 100, in save_response
    self.responses[cache_key] = cached_response
  File "/home/user/airbyte/airbyte-integrations/connectors/source-github/.venv/lib/python3.9/site-packages/requests_cache/backends/sqlite.py", line 268, in __setitem__
    super().__setitem__(key, serialized_value)
  File "/home/user/airbyte/airbyte-integrations/connectors/source-github/.venv/lib/python3.9/site-packages/requests_cache/backends/sqlite.py", line 220, in __setitem__
    con.execute(
sqlite3.OperationalError: attempt to write a readonly database
I dug into the requests_cache internals, and it seems something like this happens under the hood:
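#!/home/user/airbyte/airbyte-integrations/connectors/source-github/.venv/bin/python3
import os
import sqlite3

# Start from a clean state: remove any leftover database file.
try:
    os.unlink("cache.sqlite")
except FileNotFoundError:
    pass

# Open a connection; SQLite creates cache.sqlite on disk.
con = sqlite3.connect("cache.sqlite")
con.execute("CREATE TABLE t (col VARCHAR)")
con.commit()

# ATTENTION HERE: the file is removed while the connection is still open.
os.unlink("cache.sqlite")

# This write fails because the connection's file no longer exists at its path.
con.execute("INSERT INTO t (col) VALUES ('value')")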
Traceback (most recent call last):
  File "/home/user/airbyte/airbyte-integrations/connectors/source-github/./t.py", line 28, in <module>
    con.execute("INSERT INTO t (col) VALUES ('value')");
sqlite3.OperationalError: attempt to write a readonly database
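For contrast, here is a minimal sketch of an ordering that avoids the error in the reproduction above: close the open connection before removing the file, then reconnect. `reset_cache` and `demo_reset` are hypothetical helpers written for illustration; this is not necessarily how requests_cache or the referenced fix handles it.

```python
import os
import sqlite3
import tempfile
from contextlib import suppress


def reset_cache(path, con=None):
    """Close the open connection (if any), remove the file, return a fresh connection."""
    if con is not None:
        con.close()
    with suppress(FileNotFoundError):
        os.remove(path)
    # Connecting after the removal creates a brand-new database file.
    return sqlite3.connect(path)


def demo_reset():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache.sqlite")

        con = reset_cache(path)
        con.execute("CREATE TABLE t (col VARCHAR)")
        con.commit()

        # Second reset: the old connection is closed before the file goes away,
        # so no connection is left pointing at an unlinked file.
        con = reset_cache(path, con)
        con.execute("CREATE TABLE t (col VARCHAR)")
        con.execute("INSERT INTO t (col) VALUES ('value')")
        con.commit()

        rows = con.execute("SELECT col FROM t").fetchall()
        con.close()
        return rows
```

The sketch only works when a single owner holds the connection. If two streams share the same file, as in the Teams and TeamMembers example, every holder would have to reconnect after the file is removed.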
Fixed with #30719
What
Unit tests generated a number of cache files that polluted the git workspace.
This PR cleans them up.
In the process, it removes a restriction that allowed a stream's cache file to be removed only once during the lifetime of the Python runtime. I'm not sure why that restriction existed.
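For readers unfamiliar with the restriction, a "remove only once per process" guard can be sketched as follows. This is an illustration only: `_REMOVED_CACHE_FILES` and `clear_cache_once` are hypothetical names, and the CDK code in the diff uses `globals().setdefault("STREAM_CACHE_FILES", set())` rather than a plain module-level set.

```python
import os
from contextlib import suppress

# Hypothetical registry of cache files already removed in this process.
_REMOVED_CACHE_FILES = set()


def clear_cache_once(cache_filename):
    """Remove cache_filename at most once per process.

    Returns True if this call attempted the removal, False if it was skipped.
    """
    if cache_filename in _REMOVED_CACHE_FILES:
        return False
    _REMOVED_CACHE_FILES.add(cache_filename)
    with suppress(FileNotFoundError):
        os.remove(cache_filename)
    return True
```

With a guard like this, a second stream that shares the same cache file skips the removal, so a file that another stream already has open is not deleted a second time.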