-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional write modes for WeldxFile
#690
Comments
Thanks for bringing this up, but ain't that a real corner case? If we consider implementing more file handling modes, I suggest we stick to the nomenclature of the builtin open function: https://docs.python.org/3/library/functions.html#open |
If you restart the script, then the file will be read and validated again in "rw" mode before you get to the writing part. Due to the asdf caching, you can't even test schema modifications without restarting the script.
My case might be a corner case, but there might also be situations where a user doesn't want to modify a file and accidentally picks an existing name. In this case, an exception would be nice before it is too late ;)
👍 |
If a file exists and the user does not want to modify it, he or she should have opened it with mode="r" (the default). So there is already some sanity checks about accidentally modifying files. What am I missing? |
That the user might want to write a new file without accidentally overwriting stuff in already existing files just because he forgot to increase a number or whatever else might be the reason he used an existing file name ;) |
I guess the problem comes down to this example: # create a valid weldx file
with WeldxFile("debug.wx", tree={"a":42}, mode="rw") as wxfile:
pass
# try creating the file again with new contents
with WeldxFile("debug.wx", tree={"b":19}, mode="rw", custom_schema="single_pass_weld-0.1.0") as wxfile:
pass
>>> ValidationError: 'process' is a required property
>>> ...
>>> On instance:
>>> {'a': 42, .... } This might be a bit confusing since the validation error happens when reading the existing file. Maybe we can catch that case and raise a more specific error? To let the users know that the existing file is the problem (and they can delete it if they really want to create a file from scratch) |
It is not clear weather the ValidationError is raised upon writing the new tree (b=19) or reading the existing data, which also violates the schema. |
To me it looks like the But to be sure we should certainly run a test with valid data (or a different schema) when trying the second write operation |
We could also specify, if the schema validation should happen upon opening or writing (or both). |
In general I'm a great fan of "fail as early as possible", since it makes error tracking much easier. In this case this pattern just prohibits updating the file, if the schema is not being fulfilled. So besides of adding new file modes, I suggest to add a flag "early_validation", which is should be set to True by default (IMHO). This way it would be possible to load a non conforming files, update the tree and validate it later on during serialization/writing. I would also explicitly state, that this flag is most useful for schema development. What do you think? Anyhow I think this file modes is the one you are basically referring to achieve the overwrite the contents:
|
What do you think of the following approach: we expand the allowed inputs for custom_schema = None # no validation (default)
custom_schema = "A" # validate against A on read and write (current behavior)
custom_schema = ["A", "B"] # validate with A on read, validate with B on write
custom_schema = [None, "B"] # no validation on read, validate with B on write
custom_schema = ["A", None] # validate with A on read, no validation on write This would provide a neat wrapper when transforming/extracting data between different schemas With this implementing |
That looks very useful. I will open up a PR for this. |
Currently, we only have the modes
r
and for reading andrw
for creating a new file if it doesn't exist or modifying a file if it does exist.I suggest that we also add
w
to only write if the file doesn't exist andwf
to overwrite an existing file.Reason: When developing a file schema, the
rw
mode can have some undesired side effects. For example, you write a file, and afterward, you adjust the schema to be more strict and wouldn't allow the current data to be written. The next write attempt gives you the expected schema validation error. You correct the data and try to write it again, but you get the same validation error, even if the data seems valid. Why? BecauseWeldxFile
reads the old file first, that was valid before the schema was updated. If you don't realize this, you have a rough time figuring out why your correct data still causes a validation error (or at least seems to).So using
w
mode would raise a meaningful exception if the file already exists whilewf
(force) simply overwrites the old file without opening it.The text was updated successfully, but these errors were encountered: