Tasks that fail due to file not found error stay active and block transfers of the same file #1056
Comments
Hi John,

Taking a look at the logs for your task […]. If in the future you don't want tasks to continue retrying files that don't exist, the […].

Best,
Thanks for the quick response. That was a list of steps to reproduce, not a log, hence the adding of the file was deliberate.

Regarding retrying, I disagree that this is what one would expect. That's not usual behaviour for file transfer protocols, and it does not appear to me to be documented anywhere in the […]. Does the […]?

Since this is apparently intended behavior, the error message in the case of a second transfer with identical paths could be improved to clarify that the user needs to cancel the existing task, preferably giving the ID of the interfering task, if possible.

Lastly, I don't think that the behavior of the second transfer erroring is desirable, since as I say this is counterintuitive compared to standard file systems/transfer protocols (e.g. in Windows or Linux, if I try to copy a file that doesn't exist, create that file, then try to copy it again, it goes through just fine) and also causes scripts to fail in unexpected ways (i.e. if they rely on the […]).
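To illustrate the scripting concern, a wrapper that relies on the exit status of the submission might look like the sketch below. The endpoint IDs, paths, and label are placeholders, and the assumption that `globus transfer` exits non-zero when the submission is rejected should be checked against the CLI documentation for the version in use.

```bash
# Submit the results transfer at the end of a longer pipeline.
# SRC_EP, DST_EP, and the paths are placeholders for real endpoint IDs and paths.
if ! globus transfer "$SRC_EP:/data/results.csv" "$DST_EP:/archive/results.csv" \
    --label "pipeline results"; then
  # If an earlier task for the same paths is still active, the new submission
  # is rejected here, and the script can surface that instead of failing silently.
  echo "globus transfer submission failed; check 'globus task list' for a stuck task" >&2
  exit 1
fi
```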
Looks like there is room for improvement on this topic in the docs for the underlying Transfer service as well, otherwise I would link you there. I'll see if we can get those improved and include a link in the CLI docs for […]. Such a task will complete as a success and no longer be active.

As for the duplicate task error behavior itself, the error was added to avoid bad behavior when users set up recurring transfers that sometimes take longer to complete than the interval between submissions. Such users generally want to allow the first task to complete and then wait for the next interval to start another transfer, rather than have a new transfer override any previous transfers, otherwise some files might never complete in the worst case. Because of this I don't think the Transfer service will be able to accommodate your desired change in behavior.

I'll forward the feature request to add the task ID to the error message, as that does seem like it would be a useful addition.
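As a quick way to confirm how a submitted task actually ended, rather than assuming it is still active, the task subcommands can be used. This is a minimal sketch; the task ID is a placeholder and the `--timeout` flag name reflects my reading of the CLI help, so treat it as an assumption:

```bash
# Wait (up to 10 minutes) for the task to leave the active state,
# then print its final status. TASK_ID is a placeholder for the ID
# returned by `globus transfer` at submission time.
globus task wait TASK_ID --timeout 600
globus task show TASK_ID
```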
Thanks for the clarifying comments and for agreeing to documentation improvements. Thinking about it, the […].

Lastly, I agree that allowing transfer tasks to preempt existing tasks with the same source & destination paths would in general not be desirable; I meant specifically the case where the existing task is in a failure state. But it would be moot anyway if I had the fail on source/destination errors option.
For a potential workaround you could look at the […].

But it sounds like ultimately what you want here is more control over how the tasks handle errors. I believe the Transfer team originally discussed an interface for specifying which error codes are skipped/retried/fatal, but there were concerns about usability and corner cases, so we settled on the options that the CLI exposes as […].
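For readers landing here later: recent globus-cli releases expose some of Transfer's per-file error handling as flags on `globus transfer`. The exact option names elided above are not recoverable from this thread, so the sketch below uses `--skip-source-errors` as an assumed example; confirm the available flags with `globus transfer --help` for your CLI version.

```bash
# Assumed example: submit a transfer that skips, rather than endlessly retries,
# files that fail with source-side errors such as a missing file, so the task
# can finish and stop blocking later submissions for the same paths.
# Endpoint IDs and paths are placeholders.
globus transfer "$SRC_EP:/data/results.csv" "$DST_EP:/archive/results.csv" \
  --skip-source-errors \
  --label "results upload (skip missing sources)"
```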
Hi,
Apologies if this has been reported before, but I couldn't find a similar issue in the issues list. The issue is that if you try to transfer a file that doesn't exist, that transfer task stays active, preventing you from transferring the file if it later comes into existence.
Steps to reproduce:
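A sketch of the reproduction, assuming two endpoints the user can write to; the endpoint IDs, paths, and labels below are placeholders:

```bash
# 1. Submit a transfer whose source file does not exist yet.
globus transfer "$SRC_EP:/data/results.csv" "$DST_EP:/archive/results.csv" --label "repro"

# 2. Create the missing source file.
touch /data/results.csv

# 3. Submit the same transfer again: it is rejected, because the first task,
#    still retrying the missing file, remains active with the same paths.
globus transfer "$SRC_EP:/data/results.csv" "$DST_EP:/archive/results.csv" --label "repro"
```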
Expected behaviour:
The task is scheduled and completes successfully.
Actual behaviour:
The original task stays active, still retrying the missing file, and the second submission is rejected with an error because a task with the same source and destination paths already exists.
Additional information:
This is unexpected and counterintuitive behaviour compared to every file system and file transfer protocol I have worked with. I could understand why the transfer stayed active if it were reattempting the transfer periodically or detecting when the source path becomes valid, but this does not appear to be the case.
I discovered this while running a long script with a `globus transfer` call at the end to transfer the results, but a failure in a subprocess caused the results file not to be created. So I fixed the issue with the script and reran it, only for the results to not get saved because of the error above.

The only workaround is to identify the failed tasks and cancel them (sketched below), but this is problematic if the user has a lot of tasks queued, especially if only a subset of them fail. If for some reason this is the intended behaviour, it should be clearly documented and a command line option provided to select the intuitive behavior, i.e. that tasks that fail become inactive/cancelled/whatever so that they can be reattempted when the file exists.
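A sketch of that identify-and-cancel workaround, assuming the task subcommands in current globus-cli; the task ID is a placeholder and flag names such as `--filter-status` should be double-checked against `globus task list --help`:

```bash
# List tasks that are still active to find the one stuck on the missing file.
globus task list --filter-status ACTIVE

# Inspect its event log to confirm it keeps faulting on the missing source file.
globus task event-list TASK_ID

# Cancel the stuck task, then resubmit now that the file actually exists.
globus task cancel TASK_ID
globus transfer "$SRC_EP:/data/results.csv" "$DST_EP:/archive/results.csv" --label "results upload"
```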