My content collection 🎮📚

A simple (and incomplete) collection of content I have created and shared over time, mostly about NLP, LLMs, Information Retrieval, and Vector Search.

Title	≈ Date	Type
🔮 Decoding strategies and the future of Language Models	2024-11-12	post
👩‍🏫 Banks (Python library): a Swiss Army Knife for prompting	2024-10-31	post
🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦‍⬛	2024-10-21	article + notebook
Create a 📰 Newsletter Agent with Haystack Tools 🛠️	2024-10-17	notebook + video
🧰 From my toolbox: 💬 Chat Template Viewer	2024-10-07	post
🕵🏻 Agentic RAG with 🦙 Llama 3.2 3B	2024-09-26	post + notebook
🎯 Selective fine-tuning of Language Models with Spectrum + TRL	2024-09-03	tutorial
💬🇮🇹 Phi 3.5 mini ITA: my Italian Small Language Model	2024-08-29	post + model + demo
📝 Fine-tuning LLMs: what I've learned	2024-08-26	post
🗂️⛏️ Structured data extraction with Small Language Models	2024-08-02	post + notebook
🌉 Introduction to mechanistic interpretability of LLMs	2024-08-01	post
🔎 BM42: a new ranking algorithm for hybrid RAG	2024-07-05	post + notebook
🎤 yo-Llama 🦙: altering the behavior of a LLM by amplifying a feature direction in the activation space	2024-07-01	post + model + notebook
🌌 Creating adventures with local LLMs: llamafile + Character Codex	2024-06-24	post + notebook
RAG Evaluation with 🔥 Prometheus 2	2024-06-17	blog post + notebook
⚙️ Prompt Optimization with Haystack + DSPy	2024-06-05	post + notebook
LLaMantino 3: a good 🇮🇹 Italian Language Model	2024-05-20	post
🧑‍🏫 AutoQuizzer: create a quiz from a URL and play/let the LLM play	2024-05-16	post + demo
🐍 PyCon Italy 2024 - LLM/NLP compilation	2024-05-10	post
🔎 Sparse Embedding Retrieval in Haystack	2024-04-29	post + notebook
Playing with 🦙 Llama 3 (RAG about Oscar night 🎬)	2024-04-19	post + notebook
🦙📱 Running Small Language Models on a cheap smartphone	2024-04-09	post
💎 gemma-2b-orpo-GGUF. Thoughts on quantization.	2024-04-08	post + model
💎 gemma-2b-orpo: a Small Language Model trained with ORPO	2024-03-26	post + model + notebooks
🧪📑 From raw text to structured data with LLMs and function calling	2024-03-20	post + notebook
🧭 Choosing an embedding inference solution	2024-03-10	post
LLMs 4 Devs: from 0 to your 1st LLM application @ Open Source Day	2024-03-08	talk + repository
Haystack - an open framework for LLM applications: 🎙️ interview @ Intervista Pythonista with Massimiliano Pippi, in Italian 🇮🇹	2024-03-01	podcast
Zero-day experiments with Google Gemma 💎	2024-02-21	post + notebook
🗺️🧭 What's the best LLM inference solution for me?	2024-02-20	post
🧩🧩 Merging Language Models: what I learned	2024-02-05	post
Can Language Models self-improve? 🏋️📈 - Self-Rewarding Language Models paper	2024-01-22	post
🦙 Ollama - Haystack integration	2024-01-09	post
🦙 Ollama - beyond the surface (unpolished notes) 📝	2024-01-05	post
🇮🇹🇬🇧 Multilingual RAG from a 🎧 podcast	2024-01-03	post + notebook
🧪🦍 Information Extraction with open LLMs and Function Calling	2023-12-21	post + notebook
LLM, Haystack and open-source: 🎙️ interview @ Pointer Podcast with Sara Zanzottera, in Italian 🇮🇹	2023-12-15	podcast
📌 Collection of notebooks: using Mistral models with the Haystack LLM framework	2023-12-14	post + repository
⛵ Navigating the LLM frameworks landscape: a comprehensive survey	2023-11-22	post
⚗️ How to distill the capabilities of GPT-4 into smaller models? (Zephyr report)	2023-11-13	post
Using 🪁 Zephyr models + Haystack to generate answers on your data	2023-11-06	article + notebook
🔍 Improve RAG by embedding metadata 🏷️ - pt. 2	2023-10-25	post + notebook
🔍 Improve RAG by embedding metadata 🏷️ - pt. 1	2023-10-23	post + notebook
Zephyr 7B Alpha: how I sharded a Large Language Model	2023-10-16	post
Loading unstructured data into your LLM application (unstructured.io + Haystack)	2023-10-02	post
Mistral + Haystack: how I built a 🎸 Rock RAG pipeline	2023-09-29	post + notebook
⚡ vLLM: how this fast LLM serving engine works (PagedAttention)	2023-09-26	post
Strategic Passage Ranking for RAG: how to overcome the Lost in the Middle problem	2023-08-14	post
🚀 Load LLMs in Colab 💻 using quantization	2023-07-31	post
Chat with your documents using Haystack	2023-07-31	post
Llama2 🦙 on Haystack	2023-07-20	post + notebook
Hatch: the Python 🐍 project manager	2023-07-12	post
What would mother 👩 say? (by Tuana Çelik): an agent for generating tweets	2023-07-07	post
Fact checking rocks 🎸: how to build a fact-checking system @ Berlin Buzzwords	2023-06-20	talk + repository
💊 Open-source Neural Search/LLM frameworks: an overview	2023-05-09	post
Experimenting with Haystack Agents 🕵️	2023-04-05	post + notebook
💊 Retrieval Augmented Generation (RAG)	2023-03-14	knowledge pill
💊 Extractive Question Answering Evaluation	2023-03-13	knowledge pill
Celebrating open source 🥳	2023-02-28	post
New Question Answering model for the 🇮🇹 Italian language!	2023-02-07	post
💊 Machine Reading Models (= Readers)	2023-02-05	knowledge pill
💊 Combining document retrieval and machine comprehension for Question Answering	2023-03-02	knowledge pill
💊 What is Question Answering?	2023-01-17	knowledge pill
💊 Retrieval Evaluation	2023-01-06	knowledge pill
💊 Retrieve and Re-Rank	2022-12-17	knowledge pill
💊 SentenceTransformers for Dense Retrieval	2022-12-04	knowledge pill
💊 Dense Passage Retrieval	2022-12-03	knowledge pill
💊 From sparse representations to Language Models	2022-11-26	knowledge pill
💊 Sparse retrieval: Bag-of-words, TF-IDF	2022-11-24	knowledge pill
💊 Sparse retrieval: BM25	2022-11-24	knowledge pill
💊 What is Neural Search?	2022-11-20	knowledge pill
Reduce fastText models size	2022-08-23	post
Who killed Laura Palmer? How to implement a question answering system, based on a TV series wiki @ PyCon Italia 2022	2022-06-03	talk + repository
Monkey patching in Python	2021-11-29	post
Fine-tuning transformers easily with Ludwig	2021-10-25	post
Non-invasive Python profiling with py-spy	2021-05-23	post
Compare sentences with a list of fixed words, using fastText word embeddings	2021-05-03	post
Text and metadata extraction from files with Apache Tika	2021-04-09	post
Table extraction from PDF with Camelot	2021-03-26	post
Simple Docker healthcheck	2021-03-19	post
How to detect political memes? A multimodal approach	2020-12-17	post

anakin87/content-collection
