I'm Mustafa Shoukat, a Generative AI expert. I'm exploring various LangChain concepts and techniques to enhance my skills.
"I am just a humble data practitioner. I make mistakes and I have blind spots. If you notice things I can improve or if you just want to chat, please feel free to DM me or connect :)"
About Notebook: 🧠 Integrating LangChain with HuggingFace and FAISS for Advanced NLP Applications
This notebook explores building a Newsbot with LangChain. It covers data preprocessing, chat model creation, document retrieval with FAISS, and question answering. Through practical examples, you'll learn to integrate these components into a context-aware chatbot system, providing a streamlined guide to developing advanced NLP applications.
Name | Email |
---|---|
Mustafa Shoukat | mustafashoukat.ai@gmail.com |
Note: Add your API key wherever a code cell requires one, for example your OpenAI API key or your Hugging Face API token.
NewsBot is a web-based tool designed to analyze news articles and provide insights based on the content. Built with Streamlit and LangChain, it allows users to input URLs of news articles, process the content, and retrieve answers to queries along with the sources.
Features:
- Process News Articles: Enter up to 3 URLs of news articles for analysis.
- Chunk Text and Create Embeddings: The tool splits the article text into manageable chunks and creates embeddings for effective retrieval.
- Query-Based Insights: Ask questions based on the processed articles and receive answers along with the sources.
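
To make the chunk, embed, and query flow concrete, here is a minimal standard-library sketch. It is not the app's actual code: the bag-of-words `embed` function is a toy stand-in for OpenAI embeddings, the linear scan in `search` stands in for a FAISS index, and every function name, the chunk sizes, and the example URLs are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap their neighbours."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(articles: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Chunk every article and keep (source URL, chunk text, vector) triples."""
    index = []
    for source, text in articles.items():
        for chunk in split_into_chunks(text):
            index.append((source, chunk, embed(chunk)))
    return index


def search(index: list[tuple[str, str, Counter]], query: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k chunks most similar to the query, each with its source URL."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda entry: cosine(query_vector, entry[2]), reverse=True)
    return [(source, chunk) for source, chunk, _ in ranked[:k]]


# Hypothetical articles keyed by placeholder URLs.
articles = {
    "https://example.com/markets": "Shares rallied after the central bank held interest rates steady.",
    "https://example.com/sports": "The home team won the final match in extra time.",
}
index = build_index(articles)
top_source, top_chunk = search(index, "What happened to interest rates?", k=1)[0]
print(top_source, "->", top_chunk)
```

Keeping the source URL next to each chunk is what lets an answer be reported along with its sources. In the real app, an LLM would typically turn the retrieved chunks into the final answer.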
Requirements:
- Python 3.12.0
- Streamlit
- LangChain
- FAISS
- OpenAI (requires an API key)
- Python-dotenv
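
A quick way to confirm these packages are available before launching the app is a small check script. This is only a sketch: the import names below are assumptions based on how these packages are commonly installed (for instance, FAISS is usually imported as `faiss` and Python-dotenv as `dotenv`), and `missing_dependencies` is a hypothetical helper, not part of the project.

```python
import sys
from importlib.util import find_spec

# Display name -> import name. The import names are assumptions; adjust them
# if your environment installs these packages under different module names.
REQUIRED_MODULES = {
    "Streamlit": "streamlit",
    "LangChain": "langchain",
    "FAISS": "faiss",
    "OpenAI": "openai",
    "Python-dotenv": "dotenv",
}


def missing_dependencies(modules: dict[str, str] = REQUIRED_MODULES) -> list[str]:
    """Return the display names of modules that cannot be found on the import path."""
    return [label for label, name in modules.items() if find_spec(name) is None]


# Report the interpreter version and any packages that still need installing.
print("Python", ".".join(str(part) for part in sys.version_info[:3]))
missing = missing_dependencies()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages were found.")
```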
- Clone the Repository:

      git clone https://github.com/your-username/newsbot.git
      cd newsbot

- Set Up a Virtual Environment (optional but recommended):

      python -m venv venv
      source venv/bin/activate  # On Windows use `venv\Scripts\activate`

- Install Dependencies:

      pip install -r requirements.txt

- Set Up Environment Variables: Create a .env file in the root directory and add your OpenAI API key:

      OPENAI_API_KEY=your_openai_api_key

- Run the Streamlit App:

      streamlit run app.py

- Interact with the App:
  - Open the app in your browser (the URL will be provided in the terminal).
  - Enter up to 3 news article URLs in the sidebar.
  - Click the "Process URLs" button to analyze the articles.
  - Enter your query in the text input field to get insights based on the processed content.
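
The sidebar step above accepts up to 3 URLs, so the inputs are worth cleaning before they are processed. The sketch below shows one way this could be done with the standard library. `clean_urls` and `MAX_URLS` are hypothetical names, and the actual app.py may handle this differently.

```python
from urllib.parse import urlparse

MAX_URLS = 3  # the usage steps above allow up to 3 article URLs


def clean_urls(raw_inputs: list[str], limit: int = MAX_URLS) -> list[str]:
    """Strip whitespace, skip blank fields, drop duplicates, and validate each URL."""
    urls: list[str] = []
    for raw in raw_inputs:
        candidate = raw.strip()
        if not candidate:
            continue  # an empty sidebar field is not an error
        parsed = urlparse(candidate)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a valid http(s) URL: {candidate!r}")
        if candidate not in urls:
            urls.append(candidate)
    if len(urls) > limit:
        raise ValueError(f"Expected at most {limit} URLs, got {len(urls)}")
    return urls


# Placeholder inputs, as they might arrive from three sidebar text fields.
print(clean_urls(["https://example.com/article-1 ", "", "https://example.com/article-2"]))
```

Raising `ValueError` for a malformed URL, rather than silently dropping it, lets the interface show the user which input needs fixing.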
Project files:
- app.py: The main Streamlit application script.
- faiss_store_openai.pkl: The FAISS index file used for storing embeddings (generated after processing URLs).
- .env: Environment variables file (do not forget to add your OpenAI API key here).
- requirements.txt: Python package dependencies.
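
In an app like this, python-dotenv's `load_dotenv()` would normally read the .env file. The standard-library sketch below only illustrates what that step amounts to and how a missing key could be caught early. `read_env_file` and `get_openai_api_key` are hypothetical helpers, not functions from the project or from python-dotenv, and the parser handles only simple `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_openai_api_key(path: str = ".env") -> str:
    """Prefer a key already in the environment, then fall back to the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or read_env_file(path).get("OPENAI_API_KEY", "")
    if not key or key == "your_openai_api_key":
        # The placeholder from the setup instructions counts as "not set".
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```

Calling `get_openai_api_key()` once at startup gives a clear error message instead of a failed API request later on.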
Contributions are welcome! Please fork the repository and create a pull request with your changes. Make sure to follow the coding standards and write tests for your changes.
For any questions or support, please reach out to Mustafa Shoukat.
Feel free to replace your-username in the clone URL with your GitHub username and add any additional information relevant to your project.
Thank you for taking the time to explore this notebook. I hope you find the information and examples useful in your journey to mastering advanced NLP techniques with LangChain, HuggingFace, and FAISS. If you have any questions, feedback, or suggestions, please feel free to reach out.
Your support and interest in this project are greatly appreciated. Stay curious and keep learning!
Happy coding!