To make coordination easier, the SQS message sent to the TODO queue should simply include an `id` and any `options` for controlling whisper, and not the list of files.
If there are no files in the S3 bucket at s3://speech-to-text/media/abc123/ then an error message will be included when the job is put into the DONE queue.
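A minimal sketch of what this could look like, assuming JSON message bodies. The `id` and `options` fields and the `s3://speech-to-text/media/abc123/` path come from this issue; the `model` option, the `error` and `files` field names, and the `media_prefix` and `build_done_message` helpers are hypothetical. The bucket listing is passed in as a plain list of keys so the sketch runs without AWS access; in practice it might come from something like boto3's `list_objects_v2` with the job's prefix.

```python
import json

BUCKET = "speech-to-text"  # bucket name taken from the example path in this issue


def media_prefix(job_id):
    """Return the S3 key prefix where a job's media files are expected."""
    return f"media/{job_id}/"


def build_done_message(todo_body, keys):
    """Build a DONE-queue message body from a TODO message and a bucket listing.

    `keys` is the list of object keys found in the bucket (hypothetical input;
    this sketch does not talk to S3 itself).
    """
    job = json.loads(todo_body)
    prefix = media_prefix(job["id"])

    # Keep only real objects under the job's prefix, ignoring the prefix
    # placeholder key itself if one exists.
    media = [k for k in keys if k.startswith(prefix) and k != prefix]

    done = {"id": job["id"], "options": job.get("options", {})}
    if not media:
        # Assumed field name: the issue only says "an error message will be included".
        done["error"] = f"no files found at s3://{BUCKET}/{prefix}"
    else:
        done["files"] = media
    return json.dumps(done)


# The TODO message carries only an id and whisper options, no file list.
todo = json.dumps({"id": "abc123", "options": {"model": "large"}})

print(build_done_message(todo, []))
print(build_done_message(todo, ["media/abc123/interview.mp4"]))
```

The first call shows the empty-prefix case, where the job still reaches the DONE queue but carries an error; the second shows files being discovered from the prefix rather than from the message.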
after some mostly inconclusive discussion between @edsu, @peetucket, and i the last couple days on whether to go forward with this or not, we think we've decided to close it for now?
in favor of closing:

- no need to rework the file list logic that's already implemented
- if file list logic changes, it'll be in common-accessioning, which more of the team is familiar with, and so it should be easier for more folks to deal with bugs or feature requests in common-accessioning than in the speech-to-text python code.
- more stuff explicitly stated in job messages that we can look at, which might make debugging easier if it looks like the wrong files are getting processed
- related, didn't come up in discussion, but i just realized: if something doesn't get written to the bucket for processing, but should've and is included in the file list, we'll get an error instead of a silent skip. but i suspect that we'd get a loud error if we encountered any sort of typical upload failure to S3, since that's what we've seen in e.g. preservation, so this point may actually be moot 🤷

in favor of keeping open:

- simpler job messages (but we don't expect to have huge file lists for STTing for any one object, so not sure this was a practical advantage)
but no one seemed to feel strongly on any of the above reasons, and this isn't a huge change, so it's easy to reopen and run with it if we later think of something compelling that hasn't occurred to us yet.
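for illustration, a rough sketch of the "error instead of a silent skip" point above, assuming the job message keeps an explicit file list. the `missing_files` helper and the example keys are hypothetical, not what the speech-to-text code actually does:

```python
def missing_files(expected, found):
    """Return the keys named in the job message that are absent from the bucket listing."""
    present = set(found)
    return [key for key in expected if key not in present]


# keys listed in the job message vs. keys actually present in the bucket
expected = ["media/abc123/a.mp4", "media/abc123/b.mp4"]
found = ["media/abc123/a.mp4"]

print(missing_files(expected, found))  # ['media/abc123/b.mp4']
```

with an explicit list, a non-empty result can be reported as an error; with prefix-only discovery, the missing file would simply never be seen.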
Blocked by #3