This repository contains a series of Jupyter notebook tutorials for the AI for Energy Justice project. The tutorials cover a range of topics, including machine learning, large language models (LLMs), web scraping, and fine-tuning LLMs.
The goal of AI for Energy Justice is to create a question-answering model customized for energy justice. Students can choose any open-source LLM as the base for their model and fine-tune it on reports, news articles, and bills related to energy justice. To evaluate their models, students should identify a set of questions that they can use to monitor accuracy.
Before getting started with the tutorials, please create accounts with OpenAI and Hugging Face. You can then create API keys for these accounts.
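As a rough illustration of the evaluation idea, the sketch below scores a model against a fixed set of question and expected-answer pairs. Everything in it is an assumption made for illustration: the `accuracy` and `normalize` helpers, the placeholder questions, and the `dummy_model` stand-in are hypothetical, and substring matching is only one simple way to judge an answer.

```python
from typing import Callable, List, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def accuracy(model: Callable[[str], str], qa_pairs: List[Tuple[str, str]]) -> float:
    """Return the fraction of questions whose answer contains the expected phrase."""
    if not qa_pairs:
        raise ValueError("qa_pairs must not be empty")
    correct = sum(
        normalize(expected) in normalize(model(question))
        for question, expected in qa_pairs
    )
    return correct / len(qa_pairs)


# Hypothetical placeholders; a real question set would be written by the students.
qa_pairs = [
    ("Placeholder question 1?", "expected phrase one"),
    ("Placeholder question 2?", "expected phrase two"),
]


def dummy_model(question: str) -> str:
    """Stand-in for a fine-tuned LLM; always returns the same answer."""
    return "This answer mentions Expected  Phrase One only."


print(f"Accuracy: {accuracy(dummy_model, qa_pairs):.2f}")
```

Running the same question set before and after fine-tuning gives a simple way to see whether accuracy moves; a stricter or more lenient matching rule can be swapped in for the substring check.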
Instructions generated by prompting bard.google.com "Write step by step instructions of how to obtain OpenAI API key".
- Go to the OpenAI website: https://openai.com/.
- Click on the "Sign Up" button.
- Enter your email address and create a password.
- Click on the "Create Account" button.
- Check your email for a confirmation message from OpenAI.
- Click on the link in the confirmation message to activate your account.
- Once your account is activated, log in to the OpenAI website.
- Click on your profile icon in the top right corner of the page.
- Select "View API keys" from the dropdown menu.
- Click on the "Create New Secret Key" button.
- A new API key will be generated.
- Copy the API key and save it in a secure location.
Instructions generated by prompting bard.google.com "Write step by step instructions of how to obtain HuggingFace API key".
- Go to the Hugging Face website: https://huggingface.co/.
- Click on the "Sign Up" button.
- Enter your email address and create a password.
- Click on the "Create Account" button.
- Check your email for a confirmation message from Hugging Face.
- Click on the link in the confirmation message to activate your account.
- Once your account is activated, log in to the Hugging Face website.
- Click on your profile icon in the top right corner of the page.
- Select "Access Tokens" from the dropdown menu.
- Click on the "New Token" button.
- Give your token a name and select the "API" scope.
- Click on the "Create Token" button.
- Your API key will be generated.
- Copy the API key and save it in a secure location.
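One common way to keep keys in a "secure location" is to store them in environment variables rather than pasting them into a notebook. The sketch below is a minimal example under that assumption; the `read_api_key` and `mask` helpers are hypothetical, and the tutorials themselves may load keys differently.

```python
import os


def read_api_key(env_var: str) -> str:
    """Return the key stored in the given environment variable, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key


def mask(key: str) -> str:
    """Hide all but the last four characters so a key can be printed safely."""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

For example, after exporting the variables in your shell, `read_api_key("OPENAI_API_KEY")` and `read_api_key("HF_TOKEN")` would return the two keys; the variable names here are assumed conventions, so adjust them to whatever names you choose.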
- Machine Learning Tutorial: This tutorial provides an introduction to machine learning using the Scikit-Learn library. It covers the basics of loading data, training a model, and evaluating its performance.
- Large Language Models Tutorial: This tutorial introduces the concept of Large Language Models (LLMs). It explains what LLMs are, how they work, and how they can be used for tasks like text generation and question answering.
- Web Scraping Tutorial: This tutorial covers the basics of web scraping, a technique for extracting data from websites. It introduces the Beautiful Soup library and shows how it can be used to parse HTML and extract useful information.
- Fine-tuning Tutorial: This tutorial delves into the process of customizing LLMs with additional data, a process known as "fine-tuning". It shows how fine-tuning can be used to adapt a general-purpose LLM to a specific task or domain.
Please note that these tutorials are designed to be followed in order, as each one builds on concepts introduced in the previous tutorials.
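To give a feel for the load, train, and evaluate workflow that the Machine Learning Tutorial describes, here is a minimal Scikit-Learn sketch. The dataset (the built-in iris data) and the model (logistic regression) are assumptions chosen for brevity; the notebook itself may use different data and estimators.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in dataset: feature matrix X and class labels y.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows so the model is evaluated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Train a simple classifier on the training split.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on the held-out split.
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {test_accuracy:.2f}")
```

Evaluating on a held-out split rather than the training data is the same idea that applies later when a fine-tuned LLM is checked against a fixed set of questions.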
Before you start the tutorials, we recommend setting up a virtual environment and installing the required Python libraries. This ensures that the libraries don't interfere with any other Python projects you may have on your system.
Follow these steps to set up a virtual environment and install the dependencies:
- Create a Virtual Environment: Navigate to the directory where you want to store your virtual environment, then run the following command to create a new virtual environment. Replace env with the name you want to give to your virtual environment.
  python3 -m venv env
- Activate the Virtual Environment: Before you can start installing libraries or running Python scripts, you need to activate the virtual environment. On Windows, run:
  .\env\Scripts\activate
  On macOS and Linux, run:
  source env/bin/activate
  You should now see (env) at the start of your command line, indicating that the virtual environment is active.
- Install Dependencies: Now that the virtual environment is active, you can install the required libraries using the requirements.txt file. Navigate to the directory containing requirements.txt, then run:
  pip install -r requirements.txt
- Deactivate the Virtual Environment: Once you're done working on the project, you can deactivate the virtual environment by simply running:
  deactivate
Remember to activate the virtual environment every time you work on the project, and deactivate it when you're done.
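If you are unsure whether the environment is active, for example inside a Jupyter kernel, a quick check from Python can help. The sketch below is a minimal example; the `in_virtual_env` helper is hypothetical and relies on the fact that environments created with `python3 -m venv` set `sys.prefix` to a different path than `sys.base_prefix`. It does not detect other kinds of environments such as conda.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter is running inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


status = "active" if in_virtual_env() else "not active"
print(f"Virtual environment is {status} (interpreter: {sys.executable})")
```

If the check reports "not active" in a notebook, the kernel is probably using a different interpreter than the one in your environment.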
Note: I used GPT-4 to help me create the content for these tutorials.