Learn RAG

2025-03-15#NOTE

# What is Retrieval-Augmented Generation (RAG)?

https://www.youtube.com/watch?v=T-D1OfcDW1M&t=1s

For example, ask an LLM which planet has the most moons. Its answer may be wrong or outdated, because it relies only on its training data.

To improve accuracy, we first retrieve accurate data and mix it into the prompt that is sent to the LLM.

The overall flow is:

1. The user asks a question.
2. Retrieve information based on the question.
3. Use the retrieved information to build a prompt.
4. The LLM responds to the user.
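The flow above can be sketched in a few lines. This is only a toy illustration: the `DOCUMENTS` list, the word-overlap retriever, and `fake_llm` (a stand-in for a real LLM call) are all assumptions made up for this note, not something taken from the video.

```python
import re

# Toy knowledge base; a real system would use a document store.
DOCUMENTS = [
    "Saturn has the most confirmed moons of any planet in the solar system.",
    "Jupiter is the largest planet in the solar system.",
    "Mars has two moons, Phobos and Deimos.",
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, k=1):
    """Step 2: rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_docs):
    """Step 3: mix the retrieved text into the prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def fake_llm(prompt):
    """Step 4 (hypothetical stub): echo the first context line instead of calling a model."""
    for line in prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return "I don't know."

def answer(question):
    """Step 1 to 4: question in, grounded answer out."""
    docs = retrieve(question, DOCUMENTS)
    return fake_llm(build_prompt(question, docs))

print(answer("What planet has the most moons?"))
```

Swapping `fake_llm` for a real model call and `retrieve` for a proper search is where the actual work is; the shape of the pipeline stays the same.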

The video above is only a brief introduction to what RAG is.

I found this awesome video from freeCodeCamp on building RAG from scratch: https://www.youtube.com/watch?v=sVcwVQRHIc8.

The second step (retrieving information) actually consists of two steps:

1. Index the information (prepare for retrieval).
2. Find information using the index (retrieval).
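To make the index/retrieve split concrete, here is a minimal sketch using a plain inverted index (word to document IDs). This is an assumption chosen for simplicity; real RAG systems usually index embedding vectors in a vector store instead. The function names and sample documents are hypothetical.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Indexing: map every word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(documents):
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query, k=2):
    """Retrieval: count matching query words per document, return the top k IDs."""
    hits = Counter()
    for word in set(tokenize(query)):
        for doc_id in index.get(word, ()):
            hits[doc_id] += 1
    # Sort by hit count (descending), then by ID so ties are deterministic.
    ranked = sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))
    return ranked[:k]

docs = [
    "RAG retrieves documents before generation",
    "Embeddings map text to vectors",
    "An index makes retrieval of documents fast",
]
index = build_index(docs)          # done once, ahead of time
print(search(index, "index documents"))  # [2, 0]
```

The index is built once up front, so each query only needs cheap lookups instead of scanning every document.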

# How to Index Information

# Query Translation

The problem is that a user query can be ambiguous, which makes accurate information retrieval difficult.

To tackle this problem, we rewrite the user's question using the techniques below.

## Multi Query

Rewrite the original question from different perspectives, turning it into multiple questions.

The idea is that several differently worded questions increase the chance of matching relevant documents.

To match more documents, run the retriever once for each question.
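A rough sketch of this idea: run the retriever for the original question plus each rewrite, then merge the results without duplicates. The `toy_rewrite` function is a hard-coded stand-in for an LLM that rephrases the question, and `CORPUS` and `toy_retrieve` are made-up examples, so treat all names here as assumptions.

```python
import re

# Made-up corpus: document ID -> text.
CORPUS = {
    "doc_a": "task decomposition for agents",
    "doc_b": "planning breaks goals into steps",
    "doc_c": "vector stores hold embeddings",
}

def toy_retrieve(query):
    """Return IDs of documents sharing at least one word with the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return [doc_id for doc_id, text in CORPUS.items() if words & set(text.split())]

def toy_rewrite(question):
    """Hypothetical stand-in for an LLM call that rephrases the question."""
    return [
        "How do agents break goals into steps?",
        "What is planning in LLM agents?",
    ]

def multi_query_retrieve(question, rewrite, retrieve):
    """Run the retriever for every query variant and merge the unique results."""
    queries = [question] + rewrite(question)
    seen, merged = set(), []
    for query in queries:
        for doc in retrieve(query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

question = "What is task decomposition?"
print(toy_retrieve(question))                                     # ['doc_a']
print(multi_query_retrieve(question, toy_rewrite, toy_retrieve))  # ['doc_a', 'doc_b']
```

In this toy run the original question alone finds only one document, while the rewrites also surface `doc_b`.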

## RAG Fusion

It sounds like this builds on the Multi Query method and adds a ranking filter on top of it.

For example, only the top 2 retrieved documents are used, or something like that...
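The ranking step in RAG Fusion is usually described as reciprocal rank fusion (RRF): each document earns a score of 1/(k + rank) from every result list it appears in, and the scores are summed. Below is a minimal sketch, assuming that is the "rank filter" meant here. The function name, the `k=60` default, and the sample document IDs are illustrative choices, not taken from the video.

```python
def reciprocal_rank_fusion(ranked_lists, k=60, top_n=2):
    """Fuse several ranked result lists and keep the top_n documents.

    ranked_lists: one list of document IDs per query, best match first.
    k: smoothing constant; larger values flatten the gap between ranks.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; documents ranked well by many queries win.
    fused = sorted(scores, key=lambda doc: scores[doc], reverse=True)
    return fused[:top_n]

# One ranked list per rewritten query (made-up IDs).
results = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
print(reciprocal_rank_fusion(results))  # ['b', 'a']
```

Document `b` wins because it is ranked first by two of the three queries, even though `a` also appears in all three lists.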

## Decomposition

Split the original question into multiple questions using an algorithm from Google.

What algorithm?

Then run the full flow for the first question (retrieval, prompt, answer from the LLM), and use that answer as context for the next question.

Keep going like this, chaining the questions into a sequence.
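The chaining can be sketched as a simple loop that carries earlier question/answer pairs into each new prompt. How the sub-questions are produced is left out, since the note does not say which algorithm is used. `retrieve` and `ask_llm` are hypothetical callables passed in by the caller, and the prompt layout is an assumption.

```python
def answer_sequentially(sub_questions, retrieve, ask_llm):
    """Answer sub-questions in order, feeding earlier answers into later prompts.

    retrieve: callable(question) -> list of context strings (assumed interface).
    ask_llm: callable(prompt) -> answer string (assumed interface).
    """
    qa_pairs = []
    for question in sub_questions:
        docs = retrieve(question)
        # Everything answered so far becomes background for this question.
        background = "\n".join(f"Q: {q}\nA: {a}" for q, a in qa_pairs)
        prompt = (
            f"Previous Q&A:\n{background}\n\n"
            "Context:\n" + "\n".join(docs) + "\n\n"
            f"Question: {question}"
        )
        qa_pairs.append((question, ask_llm(prompt)))
    return qa_pairs
```

The last pair in the returned list holds the answer that has seen all the earlier ones, which is usually the one to show the user.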
