Skip to content

Commit

Permalink
Move file virtualization in toil-wdl-runner to task boundaries (#5028)
Browse files Browse the repository at this point in the history
* Defer file virtualization to task boundaries and consolidate decl evaluation. Also gets rid of monkeypatching in favor of a manual function call

* Implement support for importing relative URL paths; import files at startup and carry through mappings

* Fix possible invalid lookup and don't import raw URLs

* Get rid of sentinel value implementation + drop files before virtualization in TaskWrapper for MiniWDL parity

* Drop missing files during decl eval for outputs + Add a check for invalid coerced-to-null files and raise if exception found

* Deal with mypy

* Don't drop unnecesssarily

* Switch to setattr implementation

* Fix overwriting files

* Fix prototype implementation

* Fix documentation

* Resolve nested virtualize files issue by converting back to original and permanent devirtualization of URLs

* Fix virtualization of URLs that aren't toil URIs

* Mypy

* Remove documentation that no longer applies

* Fix merge conflict issue

* Enable symlink conformance test

* whitespace

* Fix virtualize/devirtualize_filename to be called only during command stdlibs and combine_bindings issue with new virtualization representation

* Add documentation and make convert_files function import greedily

* Get rid of memoization as File is not hashable (I believe inside LRU)

* Apply suggestions from code review

Co-authored-by: Adam Novak <anovak@soe.ucsc.edu>

* Update src/toil/wdl/wdltoil.py

Co-authored-by: Adam Novak <anovak@soe.ucsc.edu>

* Rename, add comments, remove unused code/comments

* Add comments and adjust wdl context usage

* add namespace

* integrate namespace into wdl_context

* properly name wdl value bases

* Remove irrelevant comment

* Adjust docstring formatting to remove RST warnings

* Adjust comment grammar

* Remove disallowed backticks

* Extract out the import function to add back memoization and resolve issue of multiple imports on the same file

* change wdl_context to wdlcontext

* also change typeddict

* Add some documentation for the WDLContext object

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Adam Novak <anovak@soe.ucsc.edu>
  • Loading branch information
3 people authored Oct 1, 2024
1 parent 4f120ac commit 0034c92
Show file tree
Hide file tree
Showing 4 changed files with 696 additions and 552 deletions.
3 changes: 3 additions & 0 deletions src/toil/common.py
Original file line number Diff line number Diff line change
Expand Up @@ -1121,6 +1121,9 @@ def _setupAutoDeployment(
logger.debug('Injecting user script %s into batch system.', userScriptResource)
self._batchSystem.setUserScript(userScriptResource)

def url_exists(self, src_uri: str) -> bool:
return self._jobStore.url_exists(self.normalize_uri(src_uri))

# Importing a file with a shared file name returns None, but without one it
# returns a file ID. Explain this to MyPy.

Expand Down
2 changes: 1 addition & 1 deletion src/toil/jobStores/googleJobStore.py
Original file line number Diff line number Diff line change
Expand Up @@ -383,7 +383,7 @@ def _get_blob_from_url(cls, url, exists=False):

if exists:
if not blob.exists():
raise NoSuchFileException
raise NoSuchFileException(fileName)
# sync with cloud so info like size is available
blob.reload()
return blob
Expand Down
5 changes: 2 additions & 3 deletions src/toil/test/wdl/wdltoil_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
import subprocess
import unittest
from uuid import uuid4
from typing import Optional
from typing import Optional, Union

from unittest.mock import patch
from typing import Any, Dict, List, Set
Expand Down Expand Up @@ -49,11 +49,10 @@ def tearDown(self) -> None:
WDL_CONFORMANCE_TEST_COMMIT = "2d617b703a33791f75f30a9db43c3740a499cd89"
# These tests are known to require things not implemented by
# Toil and will not be run in CI.
WDL_CONFORMANCE_TESTS_UNSUPPORTED_BY_TOIL= [
WDL_CONFORMANCE_TESTS_UNSUPPORTED_BY_TOIL = [
16, # Basic object test (deprecated and removed in 1.1); MiniWDL and toil-wdl-runner do not support Objects, so this will fail if ran by them
21, # Parser: expression placeholders in strings in conditional expressions in 1.0, Cromwell style; Fails with MiniWDL and toil-wdl-runner
64, # Legacy test for as_map_as_input; It looks like MiniWDL does not have the function as_map()
72, # Symlink passthrough; see <https://github.com/DataBiosphere/toil/issues/5031>
77, # Test that array cannot coerce to a string. WDL 1.1 does not allow compound types to coerce into a string. This should return a TypeError.
]

Expand Down
Loading

0 comments on commit 0034c92

Please sign in to comment.