
[WIP] 4. Opportunities for Foundation Models #5

Open
blengerich opened this issue Aug 25, 2024 · 6 comments

@blengerich
Contributor

@Sazan-Mahbub has volunteered to lead this section. It may grow to include others' contributions as well.

@csinva

csinva commented Aug 30, 2024

Hi @Sazan-Mahbub 👋, if you want any help with this section let me know. My main relevant expertise is in leveraging NLP foundation models to build interpretable models; that could include writing about things like:

  • connections between contextualizedML, MoE, & other sample-specific contextualization schemes
  • embedding contextualization for building interpretable classifiers
  • in-context learning adaptation for distilling interpretable models
  • leveraging foundation models to contextualize features in a simple model

@Sazan-Mahbub
Collaborator

Hi @csinva, thanks for reaching out.

That sounds awesome! I believe your expertise and the points you mentioned will be very helpful for this section. @blengerich Could you kindly add Chandan to the slack? Thanks!

@blengerich
Contributor Author

@Sazan-Mahbub Done!

@csinva

csinva commented Sep 20, 2024

Started drafting some stuff in this gdoc (thought this might be easier while we're working out high-level details, but happy to switch to PRs if y'all prefer).

Current content (see the gdoc for updated edits):

Foundation models are flexible models, trained on broad data, that can be adapted to a wide range of downstream tasks; they are generally transformer-based neural networks trained using self-supervision at scale [@doi:10.48550/arXiv.2108.07258]. Foundation models have made immense progress in recent years, particularly in natural-language processing, where large language models (LLMs) such as GPT-4 [@doi:10.48550/arXiv.2303.08774] and LLaMA-3.1 [@doi:10.48550/arXiv.2407.21783] have demonstrated impressive capabilities across a variety of tasks. Foundation models have also excelled in other domains, including text-vision models [@doi:10.48550/arXiv.2103.00020], text embedding models [@doi:10.48550/arXiv.1810.04805], and tabular data [@doi:10.48550/arXiv.2207.01848].

Modern foundation models have many connections with ContextualizedML. For example, LLMs rely on different types of contextualization to adapt to new samples. Users primarily interact with an LLM by prompting it, i.e., passing it a query that specifies the desired behavior [@doi:10.1145/3560815]. This query contextualizes the LLM to perform a particular computation, effectively changing the model's behavior so that it responds in the desired fashion. In a similar vein, the popular mixture-of-experts architecture [@doi:10.1007/s10462-012-9338-y] uses contextualization to achieve efficiency, learning a routing function that selects which context-specific model to apply to parts of an input.
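To make the mixture-of-experts connection concrete, here's a minimal numpy sketch (all names, shapes, and parameters are illustrative, not taken from any cited paper): a gating function mixes per-expert linear models, so each sample effectively gets its own parameters, which is the same idea ContextualizedML exploits with routing playing the role of contextualization.

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, n_features = 3, 4
expert_weights = rng.normal(size=(n_experts, n_features))  # one linear model per expert
gate_weights = rng.normal(size=(n_features, n_experts))    # gating-network parameters

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def moe_predict(X):
    """Per-sample predictions with sample-specific effective parameters."""
    gate = softmax(X @ gate_weights)       # (n_samples, n_experts), rows sum to 1
    # Effective parameters for each sample: gate-weighted mix of the experts.
    effective = gate @ expert_weights      # (n_samples, n_features)
    preds = np.einsum("ij,ij->i", X, effective)  # row-wise dot products
    return preds, effective

X = rng.normal(size=(5, n_features))
preds, per_sample_params = moe_predict(X)
print(preds.shape, per_sample_params.shape)  # (5,) (5, 4)
```

Inspecting `per_sample_params` directly is what makes this view useful: each row is an interpretable linear model specific to that sample's context.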

Foundation models have properties that can make them useful when integrated into ContextualizedML. The first is at the feature level, where foundation models can build structured features from unstructured data that are then amenable to interpretation. For example, an LLM can be prompted to yield interpretable features by asking questions about a piece of text [@doi:10.48550/arXiv.2302.12343; @doi:10.48550/arXiv.2305.12696; @doi:10.18653/v1/2023.emnlp-main.384]. The second is at the representation level, where foundation models can provide a contextualized base on which to learn models. For example, prompting can yield contextualized interpretable models, e.g. linear models or decision trees [@doi:10.48550/arXiv.2208.01066], and prompted embeddings can provide contextualization for training black-box models [@doi:10.48550/arXiv.2212.09741] or ngram-based models [@doi:10.1038/s41467-023-43713-1]. Finally, foundation models can help provide post-hoc descriptions of black-box models that resemble contextualization; for example, different regions of a model's input can be given natural-language parameters [@doi:10.48550/arXiv.2409.08466].
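A hedged sketch of the representation-level idea: a context embedding (here a toy, deterministic stand-in `embed` function in place of a real foundation-model encoder) is mapped to the coefficients of a per-sample interpretable linear model. Everything here is illustrative; in practice the context-to-coefficients map `W` would be learned.

```python
import numpy as np

rng = np.random.default_rng(1)
d_embed, d_feat = 8, 3

def embed(texts):
    """Toy stand-in for a foundation-model encoder: deterministic
    pseudo-random vectors keyed by each text's character codes."""
    out = np.zeros((len(texts), d_embed))
    for i, t in enumerate(texts):
        seed = sum(ord(c) for c in t) % (2**32)
        out[i] = np.random.default_rng(seed).normal(size=d_embed)
    return out

# Context-to-coefficients map; learned in a real ContextualizedML setup.
W = rng.normal(size=(d_embed, d_feat))

def contextualized_coefficients(texts):
    """One interpretable coefficient vector per sample, generated from context."""
    return embed(texts) @ W

coefs = contextualized_coefficients(["patient note A", "patient note B"])
print(coefs.shape)  # (2, 3): per-sample linear-model coefficients
```

Each row of `coefs` can then be read as an ordinary linear model for that sample, which is what makes the overall pipeline interpretable even though the embedding itself is not.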

@blengerich
Contributor Author

Tagging @Sazan-Mahbub for attention

@Sazan-Mahbub
Collaborator

Thank you @csinva! Sorry for the delay; I have a few deadlines in the next few days and will get back to this after that. Please feel free to take the lead on this if that's okay.
