[RFC] Zero DL Knowledge API #516
zachgk started this conversation in Development
One way we can improve the accessibility of DJL is an API designed for users with zero knowledge of deep learning. Many users do not have the time or expertise to learn deep learning and just want easy solutions. This can be an easy way for them to onboard onto DJL, and at the very least it increases DJL name recognition.
Like AutoGluon, we should organize this API into applications. Users least familiar with deep learning won't even know what tasks deep learning can solve. Applications narrow the possibilities down to ones that are easy to understand.
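As a point of reference, DJL already names tasks this way through its `Application` class, and a zero-knowledge API could hang one entry point off each of them. A minimal illustration (the constants below exist in DJL today; how we would surface them in the new API is an open question):

```java
import ai.djl.Application;

public class Applications {
    public static void main(String[] args) {
        // Each Application names a task in plain terms, independent of any model architecture.
        Application imageClassification = Application.CV.IMAGE_CLASSIFICATION;
        Application objectDetection = Application.CV.OBJECT_DETECTION;
        Application questionAnswering = Application.NLP.QUESTION_ANSWER;
        System.out.println(imageClassification + ", " + objectDetection + ", " + questionAnswering);
    }
}
```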
Usage of this API falls into two avenues: training custom models and working with pre-trained models.
Training
I believe the key decision for implementing training is the contract we expose: what information do we accept from the user in order to produce a model? This contract determines how users will interact with the API and what they need to know.
From there, how we implement the contract may be less important. AutoGluon uses a model selection system that trains multiple different models to see which works best. Instead, a simple hard-coded recommendation would be faster (you do not need to train multiple models) and still produce decent models. Ultimately, producing state-of-the-art models can't be done automatically and requires dedicated scientist effort. Our aim should be more along the lines of fast successes for users who do not need the best possible models, just models that are "good enough".
For training, a large part of the goal is to support custom datasets. Given that, it only makes sense to implement this API as a library, most likely a wrapper module around DJL proper. The datasets would be the same Dataset classes used in standard DJL.
Right now, I believe we should organize the training into applications. While it is possible to develop training that spans applications, that is a much more complicated task. In comparison, each application has specific models developed for it that we just have to implement. For each application, we need to define the arguments that the training accepts.

An example Java signature for `ImageClassification` is sketched below this paragraph. There may be additional information needed for constructing the translator. For this, there are a few options: either accept additional arguments, or use a special interface for the dataset that allows that additional information to be queried. The other benefit of such interfaces is that they can provide some additional guidance for ensuring that your dataset matches the inputs expected by the model.
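A minimal sketch of that signature, assuming one static `train` entry point per application. All class and method names here are hypothetical; only `Dataset`, `ZooModel`, `Image`, and `Classifications` are existing DJL types:

```java
import ai.djl.modality.Classifications;
import ai.djl.modality.cv.Image;
import ai.djl.repository.zoo.ZooModel;
import ai.djl.training.dataset.Dataset;

/** Hypothetical zero-knowledge entry point for the image classification application. */
public final class ImageClassification {

    /** How much the user cares about accuracy versus speed; drives the model recommendation. */
    public enum Performance { ACCURACY, BALANCED, LATENCY }

    /**
     * Trains an image classification model with sensible defaults.
     *
     * @param dataset labeled images, using the same Dataset abstraction as standard DJL
     * @param performance a rough preference between accuracy and latency
     * @return a trained model that can be used for inference or saved for later
     */
    public static ZooModel<Image, Classifications> train(Dataset dataset, Performance performance) {
        // A real implementation would pick a recommended architecture for this application
        // and run a standard DJL training loop behind the scenes.
        throw new UnsupportedOperationException("sketch only");
    }

    private ImageClassification() {}
}
```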
Inference
Like training, using pre-trained models from our model zoo can often be very useful. How useful depends a lot on the application and dataset. For many applications, pre-trained models are not generally applicable. For example, a fraud detector only works on the dataset it was trained on, so it would not be useful to any other company. On the other hand, a text summarization model might be useful for a variety of purposes.
One of the first goals in implementing zero-knowledge inference is to introduce users to the concept of an application. Once they understand an application, they must be able to understand their options within it. Most importantly, they must be able to understand the datasets, which most influence the models within an application: each dataset changes where a model is applicable.
Once a user knows what trained models they are looking at, it can be difficult to decide between the many competing options. For example, there are many image classification and BERT NLP models trained on the standard datasets to choose from. Only by looking at the results, and with some knowledge of the history of the models, can you figure out which ones to actually use in production. From here, there are two options for building a system that helps users explore trained models and find the ones to use.
Recommendations
One option is to have a system that provides simple recommendations. Given an application, dataset, and performance preference (accuracy focused, balanced, or latency focused), simply provide a hard-coded recommendation for which model from the model zoo to use.
There are several places this recommendation could live. The nicest is to put it on a website as a Criteria code snippet (a possible example is sketched below). I have also looked at including it in a library, but it would not have the visibility of a web display.
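To make this concrete, the generated snippet could be ordinary DJL model zoo code. The `Criteria` API calls below exist in DJL today; the filter values ("imagenet") are illustrative, not a committed recommendation:

```java
import ai.djl.Application;
import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.modality.cv.Image;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ModelZoo;
import ai.djl.repository.zoo.ZooModel;

public class RecommendedModelExample {
    public static void main(String[] args) throws Exception {
        // The website would generate a snippet like this for the recommended model.
        Criteria<Image, Classifications> criteria = Criteria.builder()
                .setTypes(Image.class, Classifications.class)
                .optApplication(Application.CV.IMAGE_CLASSIFICATION)
                .optFilter("dataset", "imagenet")
                .build();

        try (ZooModel<Image, Classifications> model = ModelZoo.loadModel(criteria);
             Predictor<Image, Classifications> predictor = model.newPredictor()) {
            // predictor.predict(image) would classify a user-supplied image here.
        }
    }
}
```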
The benefit of this approach is how easy it is to get started. All we have to do is add some documentation. The downside is that it is not particularly exciting.
DJL Model Zoo Site
The other option, which we have considered for a while, is to build a model zoo service.
The first part of the service would be to expand the model metadata repository to include third-party model zoos. Similar to how anyone can upload Java jars to Maven Central, we could allow anyone to upload their models to the DJL Model Zoo.
Then, we would need to build a website that dynamically displays the contents of the model zoo. It would be able to filter by application and dataset to help narrow down the possible choices. We would also need custom metadata for the zoo to help document applications and datasets for users unfamiliar with them.
Once users have the choices in front of them, we need a way to compare the different model options. This means recording metrics and test data for the models within the model zoo (or computing them ourselves after users upload the models). Then we can show graphs of this data so users can decide which models work best for them.
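One way to picture the backend of such a site is the set of queries it would need to answer. The interface below is purely a hypothetical sketch (every name except `Application` is a placeholder), covering the filtering and metric-comparison requirements above:

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

import ai.djl.Application;

/** Hypothetical sketch of the queries the model zoo site would need to answer. */
public interface ModelZooCatalog {

    /** Models for an application, optionally restricted to the dataset they were trained on. */
    List<ModelSummary> findModels(Application application, Optional<String> dataset);

    /** Recorded metrics (accuracy, latency, size, ...) used to compare candidate models. */
    Map<String, Double> getMetrics(ModelSummary model);

    /** A single catalog entry, tying a model's identity to the dataset it was trained on. */
    class ModelSummary {
        public String groupId;
        public String artifactId;
        public String dataset;
    }
}
```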
Both this and the hardcoded recommendations can be built on the existing model zoo design. Once you choose a model, the site could give you a snippet of model zoo code to load your model and a snippet of dependency code.
The main benefit of this approach is the ability to establish value. There isn't really a multi-framework model hub out there, so we have the opportunity to be the largest by including models from all of our supported engines. It also does a much better job of displaying the variety of the DJL model zoos. Lastly, this has been a long-term plan for DJL, so we would want to implement it regardless.
There are also some downsides to this option. First, it will take much longer to get started, as it requires a considerable amount of work. We would need to build model zoo metadata exporting, a system for creating accounts to upload to the model zoo, authentication that validates uploaders' domain names, measures to reliably associate models with datasets, links from models to the model zoo(s) that provide the Java code to support them, metrics for model performance and latency, and the entire frontend.
The other problem is that the marginal satisfaction users get from having these choices may not merit the complexity of the system. For users with deep learning experience, something like this is great (so it is worth doing even if we choose the library route). For users without it, this option may actually decrease satisfaction due to choice overload.
Hybrid Options
In addition to the two choices, there are also a few hybrid options that attempt to combine the merits of both.
First, we could implement the hard-coded option for now and deprecate it later once the model zoo site has been built. This would let us start reaching new users as quickly as possible. No matter how good the model zoo site is, it will take the efforts of a whole community to populate it and keep up with the latest advancements.
Another option is to run both indefinitely. We could leave the simple option up and combine it with the model zoo display: the site would present the recommended options alongside the full display of model zoo models.