Showing posts from October, 2025

Generating Embeddings for a RAG System

RAG (Retrieval-Augmented Generation) is one of the hottest topics in AI today. It lets you get the outputs you want from a model without fine-tuning it or modifying its underlying layers and weights. Think of it like this: you send your query along with specific instructions to the AI, and it returns results that are aligned with those instructions. These instructions are what we call a prompt, and the effectiveness of your RAG system largely depends on how well you design the prompt.

Another critical component of RAG is fetching relevant information from your source database. This data is the context you place in the prompt, so the quality of what you fetch directly shapes the quality of the answer. To achieve this, you need to retrieve data that is contextually similar to the user’s query, a process known as semantic search, that is, retrieval by semantic similarity rather than by exact keyword match. This is where a Vector Database comes into play. As discussed in our previous posts, vector embeddings play a crucial role here. Transformers and other NLP models accept embedded vectors of tokens as input, and the...
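To make the retrieval step concrete, here is a minimal sketch of semantic retrieval followed by prompt construction. It is an illustration under stated assumptions, not the implementation used in this series: the three-dimensional vectors are hand-written stand-ins for real embeddings, the in-memory `STORE` list stands in for a vector database, and the names `cosine_similarity`, `retrieve`, and `build_prompt` are hypothetical helpers introduced only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, store, k=2):
    """Return the texts of the k stored chunks most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Combine instructions, retrieved context, and the user's question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Assumption: toy 3-d vectors written by hand. A real system would get these
# from an embedding model and keep them in a vector database.
STORE = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.1, 0.9, 0.1]),
    ("Refund requests need an order number.", [0.8, 0.2, 0.1]),
]

# Assumption: the query has already been embedded into the same vector space.
query_vec = [1.0, 0.0, 0.0]

top = retrieve(query_vec, STORE, k=2)
prompt = build_prompt("How long do refunds take?", top)
print(prompt)
```

With these toy vectors, the two refund-related chunks rank above the unrelated one, so only they end up in the prompt. Cosine similarity is used here because it compares direction rather than magnitude; it is one common choice, and a given vector database may use a different metric.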