-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parameter to /vcf endpoint to force async writes to complete before returning response #75
Conversation
src/anyvar/storage/snowflake.py
Outdated
with self.batch_thread.cond: | ||
if ( | ||
len( | ||
list(filter(lambda x: x is batch, self.batch_thread.pending_batch_list)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double checking that is
is appropriate here -- the object doesn't get copied or moved around while being held on the queue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this should be a strict identity check. The empty batch is essentially a sentinel object and batch objects are not modified as they move through the queue.
src/anyvar/storage/postgres.py
Outdated
@@ -127,6 +127,10 @@ def __delitem__(self, name: str) -> None: | |||
cur.execute("DELETE FROM vrs_objects WHERE vrs_id = %s;", [name]) | |||
self.conn.commit() | |||
|
|||
def wait_for_writes(self): | |||
"""Return true once any currently pending database modifications have been completed.""" | |||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the pass is necessary. Is there an existing issue to do this for postgres backend?
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we shouldn't worry about implementing it PG-side until there's a use case/demand for doing something like queueing. I think there's a world in which it could be preferable to retain the PG implementation as a more streamlined read/write option.
That said, it might be good to note in the docstring that this method is just a stub to retain compatibility with the base class, but shouldn't do anything unless we decide to implement queueing logic for the postgres writer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I don't think it should be an abstractmethod in the _Storage
class and should only exist in the snowflake backend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the wait_for_writes()
method should remain on _Storage
because it is called from anyvar.restapi.main
which does not understand which backend it is using. The REST API method implementations only calls methods on the _Storage
base class. Without it on the base class, calling /vcf
with allow_async_write=no
results in a 500 error if the backend store in Postgres.
I cleaned up the comments and removed the pass
line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ehclark that makes sense. Thanks
src/anyvar/storage/snowflake.py
Outdated
@@ -202,6 +202,30 @@ def __delitem__(self, name: str) -> None: | |||
cur.execute(f"DELETE FROM {self.table_name} WHERE vrs_id = ?;", [name]) # nosec B608 | |||
self.conn.commit() | |||
|
|||
def wait_for_writes(self): | |||
"""Return true once any currently pending database modifications have been completed.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something, but I don't think this method returns true
Co-authored-by: Kori Kuzma <korikuzma@gmail.com>
@jsstevenson Could you do another review when you get a chance. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 lgtm
The new Snowflake object store writes VRS objects to the database asynchronously to allow better response times. However, since searches and fetches only read committed database state, this creates the opportunity for reads to not reflect the most recent VRS objects.
The current default behavior, and default behavior moving forward, is for all writes to complete before the response is returned to the client. To enable writes to complete asynchronously after the response has been returned, clients can add the
allow_async_writes=yes
parameter to the request.See also #65