Fully replace WorkflowTask.args, in patch-WorkflowTask endpoint #758
Comments
Hmm, agreed, it doesn't break old behaviors (pre 1.3.0). It's very optimized for Fractal web, though, and it removes one of the 1.3.0 improvements: having the workflow JSON / workflow task contain the ground truth of all parameters that would be run if a user uses the CLI client (see below for explanation).
Hmm, that behavior does seem a bit tricky, but I guess it won't be an issue on the fractal web side. At first, it seems a bit weird that a user would "unset" a parameter (e.g. remove it from the parameter list or set it to None), but the server then just uses the default value for it, without the user being able to see what that default value is.
It's less optimal on the CLI side: if the cellpose task has 20 parameters with default values, then upon initially adding the task all those default values would now get set in the workflow task, right (improvement of 1.3.0 fractal server)? But if a user then provides just 3 new values or so (e.g.
Big picture, this potentially moves the ground truth away from the workflow task & the workflow JSON export (in the CLI case), which I'm not the biggest fan of. It is similar to the pre-1.3.0 state though, so I could live with that change for the time being, and we can consider an update to the CLI API later. The web should be the main way of building workflows, and there it's fine.
TL;DR: I think this new API makes a lot of sense for fractal web and is what it would probably always want to use. It's also compatible with 1.2.0 CLI behavior.
I generally agree with these comments. It would indeed be a change that is optimized for fractal-web, where you wouldn't remove an argument by accident (that is, just because you did not include it in an API call). Note that this holds both with and without schemas, because even when you have no schema fractal-web always shows the current values, and you would need to explicitly remove them if you want to.
Why we need to change the PATCH endpoint
However, the purely "incremental" behavior of the PATCH endpoint has to change, IMO. Main reasons are not even related to default values:
At least for these reasons, I think we have to move forward.
Handling default values
A different question concerns the defaults and our procedure for setting them. Let's consider the two cases without/with schemas.
Task without schema
If we have no JSON Schemas, then there is not much to say. The defaults do not exist in fractal-server or in the DB or in fractal-web, but they only exist in the task Python function.
Task with schema
In this case we already have multiple sources of truth: the function and the JSON schema (when present). I think we should work under the assumption that the JSON Schema was produced correctly, and that it is not something we want to update often (it could happen, during development, but then one should make sure to review it by hand).
New proposal
My goal is to:
Here is a new proposal for the PATCH-workflowtask endpoint which may work:
This has some similarity with old versions (1.2), where we would dynamically merge the Task.default_args (now replaced by the schema) with the WorkflowTask.args, with an important difference. In 1.2, this merge would happen during execution of the workflow, while in 1.3 this merge only happens as part of the POST/PATCH-workflowtask endpoints and is written in the DB. To me, this looks like an important improvement in terms of making it clear what the server will use as task parameters.
I think this use case would be covered by the new proposal, right @jluethi?
Agreed!
Yes, great summary! That's what I really like about the 1.3.0 improvements as we had them for the CLI so far: merging upon POST/PATCH, not at execution!
Ah, that's a very nice way of maintaining this new functionality! How tricky is this feature? I'm generally a big fan of this way. It wouldn't be a hard requirement for 1.3.0 to have this, in case there are edge cases we need to consider here. But if it's straightforward, then this allows us to go to the new args replacement while maintaining the merging at POST/PATCH, not at execution.
Quickly thinking through some implications of the "If there is a schema, we scan its properties and for all properties that are not set in args and do have a default in the schema, we set their value to the default" workflow: from the command line side, this seems to solve the issue I raised above, because it "reintroduces" the benefits of incremental updates through using the schema. Conclusion: I don't see any issues with this approach :)
Correct, the new proposal would handle this well!
It should be straightforward. We already do this in the POST endpoint, and we need to do the same in the PATCH endpoint, that's it.
    import json
    import logging

    # Excerpt from the body of the (async) POST endpoint
    db_task = await db.get(Task, task_id)
    # Collect default values from the JSON Schema, if there is one
    default_args = {}
    if db_task.args_schema is not None:
        try:
            properties = db_task.args_schema["properties"]
            for prop_name, prop_schema in properties.items():
                default_value = prop_schema.get("default", None)
                # Note: this truthiness check also skips falsy defaults
                # (e.g. 0, False or an empty string)
                if default_value:
                    default_args[prop_name] = default_value
        except KeyError as e:
            logging.warning(
                "Cannot set default_args from args_schema="
                f"{json.dumps(db_task.args_schema)}\n"
                f"Original KeyError: {str(e)}"
            )
    # Override default_args with args
    actual_args = default_args.copy()
    if args is not None:
        for k, v in args.items():
            actual_args[k] = v
    # Store None rather than an empty dictionary
    if not actual_args:
        actual_args = None
You can immediately see that we are essentially re-building the old Task.default_args from the schema. In the end we are really doing the same thing as in 1.2, in that we first set the default values and then override them with the user-provided values. The big change in upcoming 1.3.0, with respect to 1.2.0, will be that we do this only when acting on the DB (through the POST/PATCH endpoints), and then we store the output of this merge in the DB itself.
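The merge described in this comment can also be sketched as a standalone function. This is only an illustration under stated assumptions, not the fractal-server implementation: the helper name merge_args_with_schema_defaults and the example schema are hypothetical, a schema without "properties" is treated as having no defaults (instead of logging a warning), and a membership test is used so that falsy defaults such as False are kept.

```python
from typing import Any, Optional


def merge_args_with_schema_defaults(
    args_schema: Optional[dict[str, Any]],
    args: Optional[dict[str, Any]],
) -> Optional[dict[str, Any]]:
    """Hypothetical helper: schema defaults, overridden by user-provided args."""
    merged: dict[str, Any] = {}
    if args_schema is not None:
        for name, prop_schema in args_schema.get("properties", {}).items():
            # Membership test keeps falsy defaults such as 0, False or "".
            if "default" in prop_schema:
                merged[name] = prop_schema["default"]
    # User-provided values always win over schema defaults.
    merged.update(args or {})
    # Mirror the excerpt above: store None rather than an empty dictionary.
    return merged or None


# Hypothetical schema: x has no default, y and overwrite do.
schema = {
    "properties": {
        "x": {"type": "string"},
        "y": {"type": "string", "default": "default_y"},
        "overwrite": {"type": "boolean", "default": False},
    }
}

# The user only provides x; y and overwrite are filled from the schema.
new_args = merge_args_with_schema_defaults(schema, {"x": "some_x"})
```

With this kind of merge, the value written to the DB after a POST or PATCH call contains every parameter that has a schema default, even when the request only lists a few of them.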
Agreed
They can unset an optional argument
This was just to say it out loud, not to say that anything is wrong. At the moment I'm not aware of any glitch in the new proposal.
This sounds great!
Sounds great! => Your new proposal sounds good, let's go with that!
…lace-workflowtaskargs-in-patch-workflowtask-endpoint Fully replace WorkflowTask.args, in patch endpoint (close #758)
That was fast!
It was already set up, I just needed some minor updates and tests ;)
Current status
The patch-workflowtask endpoint only updates the args attribute in an incremental way.
Proposed update
When the patch-workflowtask endpoint receives an args attribute, this one should fully replace the existing workflowtask.args value (rather than updating it). This will provide more flexibility to fractal-web, and unlock fractal-analytics-platform/fractal-web#196 (@rkpasia).
(In passing, it will also contribute to #5)
Use cases
Let's consider this task function
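The snippet that originally followed this sentence is not shown here. As a stand-in, here is a minimal hypothetical sketch that assumes only what the use cases in this issue state: x is required and y defaults to "default_y". The function name, the type hints and the return value are illustrative assumptions.

```python
def my_task(x: str, y: str = "default_y") -> dict:
    """Hypothetical task function: x is required, y has a Python-level default."""
    # Return the arguments that the task would actually run with.
    return {"x": x, "y": y}


# y is missing from the arguments: the Python default is used.
run_with_default = my_task(x="some_x")

# y is set explicitly: the provided value is used.
run_with_custom = my_task(x="some_x", y="some_y")
```

Calling my_task() without x raises a TypeError, which corresponds to the "task will fail" case.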
Custom task
The default_y string does not appear anywhere in the database when the Task is imported as a WorkflowTask.
WorkflowTask.args at will. If x is missing, the task will fail. If y is missing, the task will use "default_y". If y is set to "some_y", the task will use "some_y".
Common task
WorkflowTask.args is set to {"y": "default_y"}.
args["y"] and set it to "some_y", and call the patch-workflowtask endpoint (note: they should also include a value for x). The task will then run with "some_y".
"y" entry from args through the CLI client. The task will then run with the python-function default value "default_y".
"y" is not present in WorkflowTask.args, then upon loading the WorkflowTask in fractal-web this parameter will be automatically filled with the default value. This means that after any interaction with fractal-web the y parameter will be either set to a custom value or to its default value.
An (uncommon) use case that would not be supported any more
The following use case currently would work, but it won't work any more if we move as described in this issue:
x parameter via the CLI client
y parameter via the CLI client (with a new client call, i.e. with a new API call)
x and y are now up-to-date in the DB.
I don't think anyone was relying on this incremental-update behavior of the patch-WorkflowTask endpoint - right @jluethi? [or, to be more explicit, I don't think the patch-WorkflowTask endpoint is used often in the CLI client]. And I'm not fully sure why we implemented this behavior in the first place, since ex-post it does not seem very intuitive.
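To make the difference concrete, the two PATCH semantics can be compared on plain dictionaries. This is a sketch under assumptions, not the endpoint code: patch_incremental and patch_replace are hypothetical names, and leaving the stored value unchanged when args is omitted (None) is an assumption based on the wording "when the endpoint receives an args attribute".

```python
from typing import Any, Optional

Args = Optional[dict[str, Any]]


def patch_incremental(stored: Args, new: Args) -> dict[str, Any]:
    """Current behavior (sketch): new keys are merged into the stored args."""
    merged = dict(stored or {})
    merged.update(new or {})
    return merged


def patch_replace(stored: Args, new: Args) -> Args:
    """Proposed behavior (sketch): a provided args value replaces the stored one."""
    # Assumption: if args is omitted from the request, nothing changes.
    return dict(new) if new is not None else stored


stored = {"x": "old_x", "y": "default_y"}

# Two separate calls: first update x, then update y.
incremental = patch_incremental(patch_incremental(stored, {"x": "new_x"}), {"y": "new_y"})
replaced = patch_replace(patch_replace(stored, {"x": "new_x"}), {"y": "new_y"})

# incremental keeps both updates, while replaced only holds the last payload.
```

Under full replacement, the second call has to send both x and y, which is why this two-call use case would no longer work as before.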