
Make custom tasks clearer and more accessible #639

Open
6 tasks
JulienVig opened this issue Feb 28, 2024 · 6 comments
Labels
discojs (Related to Disco.js) · documentation (Improvements or additions to documentation) · server (Related to the server) · web client (Related to the browser environment)

Comments

@JulienVig
Collaborator

JulienVig commented Feb 28, 2024

#690 should be addressed first.

Adding a custom task is currently very convoluted, and the process is not documented enough.

For example, the no-code UI for adding a custom task requires users to choose values for parameters such as the learning rate, DP sensitivity, and gradient clipping, which is certainly beyond the grasp of users without technical knowledge. Furthermore, users have to upload their own model, which requires coding to some extent.

  • web-client
    • Create a step-by-step guide to add a custom task from the UI, including how to create an initial tfjs model. Reference it from the UI custom task page.
    • Add more explanations for the custom task training and privacy parameters (e.g. on hover)
  • docs
    • Most of the documentation is currently in docs/Task.md; think about splitting the information by audience (non-technical users, technical users, contributors) into different documents and referencing them appropriately.
    • Add clear links to the repo documentation such that a non-technical user can easily find how to add a custom task from the UI
  • server and discojs
    • @peacefulotter mentioned that the Task implementation could be improved; think about a potential rework with ease of use and ease of adding users' custom tasks in mind.
  • discojs-node
    • Currently, the only way to add a custom task from a script is to add it to the server before startup, which requires being the one handling the server. I think that being able to add a task as a user (i.e. using only discojs-node) should be possible.
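The discojs-node point could look roughly like the following TypeScript sketch, which contrasts tasks fixed at server startup with tasks registered later by a user. `TaskSpec`, `TaskRegistry`, `register` and `list` are hypothetical names invented for this example, and the registry is an in-memory stand-in; nothing here is taken from the discojs-node or server code.

```typescript
// Hypothetical sketch: an in-memory stand-in for a server-side task list.
// None of these names come from discojs-node or the Disco server.

interface TaskSpec {
  id: string;
  displayName: string;
}

class TaskRegistry {
  private readonly tasks = new Map<string, TaskSpec>();

  // Tasks known before startup, which is the only option today.
  constructor(initial: TaskSpec[] = []) {
    for (const t of initial) this.tasks.set(t.id, t);
  }

  // What the issue asks for: adding a task after startup.
  // Returns false instead of overwriting an existing id.
  register(task: TaskSpec): boolean {
    if (this.tasks.has(task.id)) return false;
    this.tasks.set(task.id, task);
    return true;
  }

  // Ids of all known tasks, in insertion order.
  list(): string[] {
    return Array.from(this.tasks.keys());
  }
}

const registry = new TaskRegistry([{ id: "titanic", displayName: "Titanic" }]);
const added = registry.register({ id: "my-task", displayName: "My custom task" });
const duplicate = registry.register({ id: "titanic", displayName: "Again" });
```

Rejecting duplicate ids rather than overwriting them is one possible design choice; a real implementation would also need to decide who is allowed to register tasks.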
@JulienVig added the documentation, web client, discojs, and server labels on Feb 28, 2024
@tharvik
Collaborator

tharvik commented Feb 28, 2024

a custom task requires users to choose parameter values of learning rate, DP sensitivity, gradient clipping

that's clearly an issue, creating a Task is way too complex: there are so many fields, some very technical, some that are only related to image or tabular data, some that are even unused. it should be as easy as Task { id: 'titanic', model: ... } for the basic cases, with as many fields as possible having sane defaults.
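As a rough illustration of that idea, here is a minimal TypeScript sketch of a task whose optional fields fall back to defaults. Everything in it is hypothetical: `MinimalTask`, `TaskDefaults`, `withDefaults`, the field names and the default values are assumptions made for the example, not the actual discojs API.

```typescript
// Hypothetical sketch: none of these names or values come from discojs.

// Placeholder for whatever model type a task would carry.
interface Model {
  name: string;
}

// Fields a task author may override; each one has a default.
interface TaskDefaults {
  learningRate: number;
  epochs: number;
  batchSize: number;
}

// Only `id` and `model` are required when declaring a task.
interface MinimalTask extends Partial<TaskDefaults> {
  id: string;
  model: Model;
}

// A task with every field filled in.
interface ResolvedTask extends TaskDefaults {
  id: string;
  model: Model;
}

// Assumed default values, chosen only for this example.
const DEFAULTS: TaskDefaults = { learningRate: 0.01, epochs: 10, batchSize: 32 };

// Fill in every field the author left out; explicit values win.
function withDefaults(task: MinimalTask): ResolvedTask {
  return { ...DEFAULTS, ...task };
}

const titanic = withDefaults({ id: "titanic", model: { name: "titanic-mlp" } });
const tuned = withDefaults({ id: "titanic", model: { name: "titanic-mlp" }, epochs: 3 });
```

In this sketch the basic case needs only an id and a model, while a technical user can still override individual fields.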

Furthermore, users have to upload their own model which requires coding to some extent.

I don't think we can really avoid that but a guide will indeed help a lot to do so.

@peacefulotter
Collaborator

Thanks for tagging me, would love to brainstorm / contribute to the task refactoring if needed

@JulienVig
Collaborator Author

Related to #647

@tharvik
Collaborator

tharvik commented Mar 6, 2024

Related to #647

indeed, moving my (now deleted) response from there

what's the purpose of tasks in Disco and what they are,

clearly agree, it's currently used as a global context object everywhere, with way too many fields. you can see a very early draft of a split of TrainingInformation @ https://github.com/epfml/disco/blob/647-split-tasks-tharvik/discojs/discojs-core/src/task/training_information.ts
I'll try a definition of what I think it should be, discussion very welcome: a Task is a Model with a description. it's what users of disco will participate in, by adding data (training) or predicting with.
for now, it also contains model-specific config (move to Model init), dataset config (move to Model input types and dataset types) and network config (maybe showing which networks are currently running).
I also think that the TaskProvider functionality should be renamed to Task, and the old Task would be more precisely named Config.
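To make the proposed split easier to discuss, here is a hedged TypeScript sketch of "a Task is a Model with a description", with network settings kept apart. All type and field names below (`Description`, `NetworkConfig`, `LegacyTask`, `splitLegacy`, and so on) are hypothetical and reflect neither the draft linked above nor the current discojs-core code.

```typescript
// Hypothetical sketch of the proposed split; names are illustrative only.

// Human-readable part shown to participants.
interface Description {
  title: string;
  summary: string;
}

// Model-specific settings would live with the model itself.
interface Model {
  kind: "image" | "tabular";
  learningRate: number;
}

// Proposed meaning of "Task": a model plus a description.
interface Task {
  id: string;
  description: Description;
  model: Model;
}

// Network settings kept apart from the task.
interface NetworkConfig {
  scheme: "federated" | "decentralized";
}

// A flat "everything in one object" shape, for contrast.
interface LegacyTask {
  id: string;
  title: string;
  summary: string;
  kind: "image" | "tabular";
  learningRate: number;
  scheme: "federated" | "decentralized";
}

// Split a flat legacy object into the narrower pieces.
function splitLegacy(legacy: LegacyTask): { task: Task; network: NetworkConfig } {
  return {
    task: {
      id: legacy.id,
      description: { title: legacy.title, summary: legacy.summary },
      model: { kind: legacy.kind, learningRate: legacy.learningRate },
    },
    network: { scheme: legacy.scheme },
  };
}

const { task, network } = splitLegacy({
  id: "titanic",
  title: "Titanic",
  summary: "Example tabular task",
  kind: "tabular",
  learningRate: 0.01,
  scheme: "federated",
});
```

With this shape, code that only needs the model or the description no longer has to receive the network settings as well.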

especially what are the use cases for pre-defined tasks

pre-defined tasks are there to show off the possibilities of disco itself, but there is clearly no need to encumber discojs-core with them. they should only be available for trying out/demo purposes. once moved outside, they will effectively be the same as custom tasks.

and custom tasks

custom tasks would be for specific uses of disco (such as the various bilateral projects being developed at MLO).

@JulienVig
Collaborator Author

I think what I find confusing is that the word "task" is usually used at a higher level in machine learning. For example, the first Google result for "machine learning tasks" talks about "binary classification", "regression", "clustering" etc. Similarly, for LLMs, tasks refer to problems like "question answering" or "summarizing".
In contrast, a Disco task is much more specific as you said, it specifies the model, the config, the dataset, the distributed learning scheme etc.
Similarly,

a Task is a Model with a description. it's what users of disco will participate in, by adding data (training) or predicting with.

still feels too specific for the word "Task".

I'm also realizing that there is probably some confusion in our discussion between the concept of a Task in the user interface, and the actual Task object in the code base.

Talking about the user interface concepts, what do you think of:

  • Re-defining "Pre-defined tasks" as "Examples" or "Demo" or "Showcase"
  • Re-defining "Custom task" as "Training Session" or "Session" or "DisCollaborative" (this term is used on the homepage and I quite like it even though it's not used anywhere else). These terms feel more specific, like a problem instance of a task.

As such, the word task would no longer be used in the UI (but would still be available if needed). From there we can define programming objects that match the higher-level concepts and are coherent. What do you think?

@tharvik
Collaborator

tharvik commented Mar 7, 2024

confusing is that the word "task"

yeah, that's way too generic. I think we're hitting the hardest problem in computer science.

I'm also realizing that there is probably some confusion in our discussion between the concept of a Task in the user interface, and the actual Task object in the code base.

oh right, thanks for noticing it. I was indeed only looking at it through discojs-core's Task class.

Talking about the user interface concepts, what do you think of:

  • Re-defining "Pre-defined tasks" as "Examples" or "Demo" or "Showcase"

"examples" sounds good, "demo" feels like a guided/tutorial experience, which is not the case, "showcase" is nice too. having a plural name (such as "examples") will probably help us a bit more (how can one talk about a specific element of a "showcase"?).

Talking about the user interface concepts, what do you think of:

  • Re-defining "Custom task" as "Training Session" or "Session" or "DisCollaborative" (this term is used in the homepage and I quite like it even though it's not used anywhere else). These terms feel more like a problem instance of a task and more specific

I really like "DisCollaborative"! "session" makes it feel temporary, which is not the case for most of these models (especially with continuous learning).

As such, the word task is not used anymore in the UI (but still available if needed). And from there we can define programming objects that match the higher-level concepts and that are coherent. What do you think?

yep, makes total sense, I'm keeping Task for now in code (as I haven't really found another name at the moment), hopefully, cleaning up will help get a clearer picture/name.

3 participants