ReturnnConfig ignores file caching #310
Comments
If it is expected behavior, I think at least the documentation should mention it. But then my open question somewhat remains: how do I do caching here? I see that other people often use caching via RETURNN instead.
The described problem, that the config is written to an external file and read on a different node, requires support by the toolkit in use, as in the given example with RASR, but that isn't possible with most toolkits. So I don't have any suggestion for the write-on-one-node, read-on-another setup beyond changing the default behavior. Any other opinions? Side note: a while ago I considered deprecating the current behavior.
I'm a bit afraid that changing the existing behavior would break other setups. We discussed this a bit more internally, and it looks like there is no really good generic solution to this problem. For `ExternSprintDataset`, I now handle it manually like this:

```python
args = [
    _DelayedCodeFormat("lambda: '--config=' + cf({!r})", files["config"]),
    _DelayedCodeFormat("lambda: '--*.corpus.file=' + cf({!r})", files["corpus"]),
    _DelayedCodeFormat(
        "lambda: '--*.corpus.segments.file=' + cf({!r})", (files["segments"] if "segments" in files else "")),
    _DelayedCodeFormat("lambda: '--*.feature-cache-path=' + cf({!r})", files["features"]),
    _DelayedCodeFormat("lambda: '--*.feature-extraction.file=' + cf({!r})", files["feature_extraction_config"]),
    "--*.log-channel.file=/dev/null",
    "--*.window-size=1",
]
args += [
    "--*.corpus.segment-order-shuffle=true",
    "--*.segment-order-sort-by-time-length=true",
    "--*.segment-order-sort-by-time-length-chunk-size=%i" % {"train": epoch_split * 1000}.get(data, -1),
]
d = {
    "class": "ExternSprintDataset",
    "sprintTrainerExecPath": tools_paths.get_rasr_exe("nn-trainer"),
    "sprintConfigStr": args,
    "suppress_load_seqs_print": True,  # less verbose
    "input_stddev": 3.0,
    "orth_vocab": self.vocab.get_opts() if self.vocab else None,
}
```

And:

```python
class _DelayedCodeFormat(DelayedFormat):
    """Delayed code"""

    def get(self) -> CodeWrapper:
        """get"""
        return CodeWrapper(super(_DelayedCodeFormat, self).get())
```
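For context, here is a minimal, self-contained sketch of what the delayed-formatting trick above does. These are hypothetical, simplified stand-ins for the real `DelayedFormat` and `CodeWrapper` classes, not their actual implementations:

```python
# Simplified stand-ins (hypothetical) for the delayed-formatting idea above.

class CodeWrapper:
    """Marks a string to be emitted into the written config as raw code, unquoted."""
    def __init__(self, code: str):
        self.code = code

    def __repr__(self) -> str:
        return self.code  # emitted verbatim, without quotes

class DelayedFormat:
    """Defers str.format until the value is actually needed."""
    def __init__(self, template: str, *args):
        self.template = template
        self.args = args

    def get(self) -> str:
        return self.template.format(*self.args)

class DelayedCodeFormat(DelayedFormat):
    """Same, but wraps the result so it is serialized as code, not as a string."""
    def get(self) -> CodeWrapper:
        return CodeWrapper(super().get())

arg = DelayedCodeFormat("lambda: '--config=' + cf({!r})", "/path/to/rasr.config")
print(repr(arg.get()))  # lambda: '--config=' + cf('/path/to/rasr.config')
```

The point of the `lambda: ... cf(...)` wrapping is that the `cf` call ends up as code inside the written RETURNN config, so the file caching happens at runtime on the compute node, not at config-writing time.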
I don't understand what the issue is here: RETURNN does not support (in general) the `cf /path/to/file` file-caching method we commonly use, so ignoring it is the correct solution at the moment. Also, the structure of the recipes (writing the config on a different node) does not support file caching where sisyphus would do the copying, so ignoring it is the correct solution. I propose two solutions to enable general file caching:
Well, one thing is just that the behavior was unexpected to me (that the cached path is silently ignored). But you are right, the current way the recipes are structured does not really allow it. However, there are potential solutions to still do it in a generic way.
This would be tricky, or basically almost impossible, as there is no central place for file handling in RETURNN. It's rather spread over the whole code, and even some other user code could do file handling. And then also TensorFlow and other libraries would do file handling.
This is similar to my solution now, except I don't do this automatically but rather manually. It would have been a bit tricky to do automatically, as it would have to go recursively through objects.
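To sketch why the automatic variant is tricky: dicts and lists are easy to walk, but the recursion would also have to descend into arbitrary user objects and rewrite their attributes in place, which is fragile. A hypothetical helper (all names here are illustrative, not from sisyphus):

```python
# Hypothetical sketch: recursively rewrite path-like leaves in a nested config.
# The generic-object branch is the fragile part (slots, properties, frozen
# dataclasses, objects shared between configs, ...).

def rewrite(obj, is_path, fn):
    """Recursively replace every leaf where is_path(leaf) holds by fn(leaf)."""
    if is_path(obj):
        return fn(obj)
    if isinstance(obj, dict):
        return {k: rewrite(v, is_path, fn) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return type(obj)(rewrite(v, is_path, fn) for v in obj)
    if hasattr(obj, "__dict__"):  # arbitrary object: must mutate its attributes
        for k, v in vars(obj).items():
            setattr(obj, k, rewrite(v, is_path, fn))
    return obj
```

Containers can be rebuilt functionally, but plain objects have to be mutated, and there is no general guarantee that `setattr` is safe or that `vars()` sees all relevant state.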
Yes, the user manually taking care of it however they want is of course also an option.
I see. But then there is the question of whether such an addition is worth it.
When you put a `tk.Path(..., cached=True)` object somewhere into your `ReturnnConfig`, it will not use file caching. Is this expected behavior? At least I did not expect this.

The reason for that: the config-writing code recursively goes through `config` and resolves all delayed objects. Now, `tk.Path` derives from `DelayedBase`, and its `get` just returns `get_path()`, and `get_path` returns the uncached file path. So, in the written RETURNN config, you will not have any cached file paths.
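The mechanism can be illustrated with a minimal sketch. These are simplified stand-ins for the sisyphus `Path`/`DelayedBase` classes and the recursive resolution step, not the real implementations:

```python
# Hedged sketch of why the cached flag is dropped (simplified stand-ins for
# sisyphus' Path / DelayedBase and the recursive delayed-object resolution).

class DelayedBase:
    def get(self):
        raise NotImplementedError

class Path(DelayedBase):
    def __init__(self, path: str, cached: bool = False):
        self.path, self.cached = path, cached

    def get_path(self) -> str:
        return self.path  # plain, uncached path

    def get_cached_path(self) -> str:
        return "cf %s" % self.path if self.cached else self.path

    def get(self) -> str:
        return self.get_path()  # <- what the config writing resolves to

    def __str__(self) -> str:
        return self.get_cached_path()  # <- what str(path)-style formatting picks up

def instanciate_delayed(obj):
    """Recursively replace DelayedBase instances by their get() value."""
    if isinstance(obj, DelayedBase):
        return obj.get()
    if isinstance(obj, dict):
        return {k: instanciate_delayed(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(instanciate_delayed(v) for v in obj)
    return obj

config = {"train": {"path": Path("/data/corpus.xml", cached=True)}}
print(instanciate_delayed(config))  # {'train': {'path': '/data/corpus.xml'}}
```

So the `cached=True` flag survives on the object but is never consulted on the `get()` path, only on the `str()` path.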
(Btw, in the `RasrConfig`, I think it calls `str(path)` instead, and `Path.__str__` returns `self.get_cached_path()`.)

Another note: I just realized that the RETURNN config writing is also a separate `Task` anyway, which likely runs on a different node, so file caching could not be done directly there. Moreover, the common `file_caching` function anyway does not return the file path of a cached file, but just a special formatted string, which only RASR can really handle properly. So `Path.__str__` would likely break other tools.

I don't really have a good solution or suggestion currently.
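To make the last point concrete: the cached value is the `cf <path>` command string mentioned earlier, not a real filesystem path, so any tool that treats it as a filename fails. An illustrative sketch:

```python
# The cached value is a "cf <path>" command string, not a filename, so a tool
# that passes it straight to open() breaks (illustrative example; the path and
# value here are made up).
cached_value = "cf /data/corpus.xml"
try:
    with open(cached_value):
        pass
except OSError as exc:
    print("not a usable path:", type(exc).__name__)
```

Only RASR knows to execute the `cf` part to obtain a local cached copy; everything else sees a nonexistent path.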