amimem/README.md

Hi there 👋 Welcome to my GitHub!

I am an AI researcher and engineer with expertise in fine-tuning large language models, reinforcement learning, and scalable models. Shoot me a message if you have any questions!

Featured Repositories

Comparing the moral judgment of VLMs on image-query pairs with that of their underlying LLMs on description-query pairs.

Exploring architectures and methods for parameter-efficient theory of mind in many-agent settings.

How to abstract and extract higher agency from joint actions taken in games such as Diplomacy.

Exploration of the performance of reinforcement learning (RL) agents on the "tokens task," a decision-making test commonly used in neuroscience.

Gradient masking with modified regularization (L1 and L2), from the paper "Learning explanations that are hard to vary".

Pinned

  1. alignment (Public)

    Reducing Multimodal Alignment to Text-Based, Unimodal Alignment

    Jupyter Notebook

  2. ToMM (Public)

    Scalable Approaches for a Theory of Many Minds

    Jupyter Notebook

  3. agent_abstraction (Public)

    Agent Abstraction in Multi-Agent Reinforcement Learning

    Jupyter Notebook

  4. tokens_task (Public)

    Exploration of the performance of reinforcement learning (RL) agents on the "tokens task," a decision-making test commonly used in neuroscience.

    Jupyter Notebook 1

  5. learning-explanations-hard-to-vary (Public)

    Forked from gibipara92/learning-explanations-hard-to-vary

    Code implementing the AND-mask and geometric mean for gradient-based optimization, from the paper "Learning explanations that are hard to vary"

    Jupyter Notebook

  6. COMP767 (Public)

    COMP767-Reinforcement Learning Course

    Jupyter Notebook