[WIP] 4. Opportunities for Foundation Models #5
Hi @Sazan-Mahbub 👋, if you want any help with this section let me know. My main relevant expertise is in leveraging NLP foundation models to build interpretable models, which might include writing about things like:
Hi @csinva, thanks for reaching out. That sounds awesome! I believe your expertise and the points you mentioned will be very helpful for this section. @blengerich Could you kindly add Chandan to the Slack? Thanks!
@Sazan-Mahbub Done!
Started drafting some stuff in this gdoc (thought this might be easier while we're working out high-level details, but happy to switch to PRs if y'all prefer). Current content is below (see the gdoc for updated edits); I've also appended a few rough code sketches after the draft to make some of the ideas concrete.

Foundation models are flexible models trained on broad data that can be adapted to a wide range of downstream tasks; they are generally transformer-based neural networks trained using self-supervision at scale [@doi:10.48550/arXiv.2108.07258]. Foundation models have made immense progress in recent years, particularly in natural-language processing, where large language models (LLMs) such as GPT-4 [@doi:10.48550/arXiv.2303.08774] and LLaMA-3.1 [@doi:10.48550/arXiv.2407.21783] have demonstrated impressive capabilities across a variety of tasks. Foundation models have similarly excelled in other domains, including text-vision models [@doi:10.48550/arXiv.2103.00020], text embedding models [@doi:10.48550/arXiv.1810.04805], and tabular data [@doi:10.48550/arXiv.2207.01848].

Modern foundation models have many connections with ContextualizedML. For example, LLMs rely on different types of contextualization to adapt to new samples. The main way users interact with LLMs is through prompting, i.e., passing the LLM a query that specifies the desired behavior [@doi:10.1145/3560815]. This query contextualizes the LLM to perform a particular computation, effectively changing the model parameters in order to respond in the desired fashion. In a similar vein, the popular mixture-of-experts architecture [@doi:10.1007/s10462-012-9338-y] uses contextualization as a means to achieve efficiency, learning a routing function that selects which context-specific model to apply to parts of an input.

Foundation models have properties that can make them useful when integrated into ContextualizedML. One opportunity is at the feature level, where foundation models can be used to build structured features from unstructured data that are then amenable to interpretation. For example, an LLM can be prompted to yield interpretable features by asking questions about a piece of text [@doi:10.48550/arXiv.2302.12343; @doi:10.48550/arXiv.2305.12696; @doi:10.18653/v1/2023.emnlp-main.384]. A second opportunity is at the representation level, whereby foundation models can provide a contextualized base on which to learn models. For example, prompting can yield contextualized interpretable models, e.g., linear models or decision trees [@doi:10.48550/arXiv.2208.01066], and prompted embeddings can provide contextualization for training black-box models [@doi:10.48550/arXiv.2212.09741] or n-gram-based models [@doi:10.1038/s41467-023-43713-1]. Finally, foundation models can be used to help provide posthoc descriptions for black-box models, resembling contextualization. For example, different regions of a model's input can be given natural-language parameters [@doi:10.48550/arXiv.2409.08466].
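To make the prompting-as-contextualization point concrete, here's a minimal sketch. `generate` is a hypothetical stand-in for a call to a frozen LLM (not a real API); only the context string varies, not the weights:

```python
# Minimal sketch of prompting as contextualization.
# `generate` is a hypothetical placeholder for a frozen LLM call;
# the model weights never change, only the context prepended to the query.
def generate(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

query = "The acting was fine, but the plot dragged."
contexts = [
    "Label the sentiment of this review as positive or negative.",
    "List the aspects of the movie that this review mentions.",
]
for context in contexts:
    # Same frozen model, different context -> different induced computation.
    print(generate(f"{context}\n\n{query}"))
```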
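And a toy sketch of the mixture-of-experts routing idea, assuming top-1 routing over linear experts with made-up dimensions (pure NumPy; an illustration, not the architecture from the cited paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture of experts: each expert is a linear map, and a learned
# routing function selects which context-specific expert handles an input.
n_experts, d_in, d_out = 4, 8, 2
experts = rng.normal(size=(n_experts, d_in, d_out))  # per-expert weights
router = rng.normal(size=(d_in, n_experts))          # routing parameters

def moe_forward(x):
    logits = x @ router                    # one score per expert
    gates = np.exp(logits - logits.max())
    gates /= gates.sum()                   # softmax routing distribution
    k = int(gates.argmax())                # top-1: only one expert is evaluated
    return gates[k] * (x @ experts[k])

print(moe_forward(rng.normal(size=d_in)))
```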
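For the feature-level idea (prompting an LLM with questions to obtain interpretable features), here's a rough sketch of the pattern; `ask_llm` and the example questions are hypothetical placeholders, not an API from the cited papers:

```python
# Hypothetical helper: answers a yes/no question about a passage.
# Placeholder only; swap in a real LLM client.
def ask_llm(question: str, text: str) -> bool:
    raise NotImplementedError("plug in an LLM client here")

# Each question becomes one human-readable binary feature.
QUESTIONS = [
    "Does the text mention a specific price or cost?",
    "Is the overall sentiment positive?",
    "Does the text compare two or more options?",
]

def interpretable_features(text):
    return [int(ask_llm(q, text)) for q in QUESTIONS]

# The resulting 0/1 features can then be fed to a transparent downstream
# model (e.g., a sparse linear model) whose coefficients map back to the
# questions, keeping the overall pipeline interpretable.
```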
Tagging @Sazan-Mahbub for attention
Thank you @csinva! Sorry for the delay; I have a few deadlines in the next few days and will get back to this after that. Please feel free to take the lead on this if that's okay.
@Sazan-Mahbub has volunteered to lead this section. It may grow to include others' contributions as well.