
Prompting Vs RAGs Vs Finetuning

Understanding how AI Engineers make decisions❓

Prompting vs RAGs vs Fine-tuning!

An important decision that every AI Engineer must make when building an LLM-based application.

To understand what guides the decision, let's first understand the meaning of these terms.

(refer to the image below as you read each section)

1️⃣ Prompt Engineering:

The prompt is the text input that you provide, based on which the LLM generates a response.

It's basically a refined input to guide the model's output.

The output is based on the knowledge the LLM already has.

Read my guide on Prompt engineering here!
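
To make this concrete, here's a minimal sketch of prompt engineering, assuming the OpenAI Python SDK (the model name and the prompt itself are placeholders you'd adapt to your own use case):

```python
# A minimal prompt-engineering sketch, assuming the OpenAI Python SDK
# (pip install openai) and an OPENAI_API_KEY in your environment.
from openai import OpenAI

client = OpenAI()

# A "refined input": we set a role, a format constraint, and the task,
# steering the model's existing knowledge toward the output we want.
prompt = (
    "You are a concise medical writer. "
    "In exactly three bullet points, explain what hypertension is "
    "to a patient with no medical background."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whichever model you have access to
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Notice that nothing new is taught to the model here: the careful wording of the prompt alone shapes the output.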

2️⃣ RAGs (Retrieval-Augmented Generation):

When you combine prompt engineering with querying a database for context-rich answers, you get RAG.

The generated output will be based on the knowledge available in the database.

Read my intro to RAGs & llamaindex here!
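
Here's a deliberately dependency-free sketch of the RAG pattern: retrieve the most relevant snippets from a knowledge base, then augment the prompt with them. A real system would score documents by embedding similarity over a vector store (which is what llamaindex manages for you); the keyword-overlap scorer below is just a toy stand-in:

```python
# A toy RAG pipeline: retrieve relevant context, then augment the prompt.
# Real systems replace the keyword-overlap scorer with embedding similarity
# over a vector store.

KNOWLEDGE_BASE = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am-5pm CET, Monday through Friday.",
    "Premium subscribers get priority email support.",
]

def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: -len(query_words & set(d.lower().split())))
    return scored[:top_k]

def build_rag_prompt(query: str) -> str:
    context = "\n".join(retrieve(query, KNOWLEDGE_BASE))
    # The LLM is instructed to ground its answer in the retrieved context,
    # so the output reflects the knowledge base, not just pretraining.
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

print(build_rag_prompt("What is the refund policy?"))
```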

3️⃣ Finetuning:

Finetuning means adjusting the parameters of the LLM using task-specific data, so that it specialises in a certain domain.

For instance, a language model could be finetuned on medical texts to become more adept at answering healthcare-related questions.

It's like giving additional training to an already skilled worker to make them an expert in a particular area.
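
As a rough sketch of what this looks like in code, here's the parameter-efficient LoRA flavour of finetuning, assuming the Hugging Face transformers & peft libraries; the base model, ranks, and target modules are illustrative placeholders, and a full run would still need a training loop over your task-specific data:

```python
# A parameter-efficient finetuning sketch using LoRA, assuming the
# Hugging Face transformers and peft libraries. Model name, ranks and
# target modules are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "gpt2"  # placeholder; swap in the LLM you want to specialise
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA freezes the original weights and trains small low-rank adapters,
# which is how the model's behaviour gets adjusted cheaply.
config = LoraConfig(
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # attention projection in GPT-2
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a tiny fraction of weights train

# From here you'd run a standard training loop (e.g. transformers.Trainer)
# over your task-specific data, such as medical Q&A pairs.
```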

Back to the important question: how do we decide which approach to take?

(refer to the image below as you read ahead)

❗️There are two important guiding parameters: the first is the requirement for external knowledge, the second is the requirement for model adaptation.

❗️While the meaning of the former is clear, model adaptation means changing the behaviour of the model: its vocabulary, writing style, etc.

For example: a pretrained LLM might struggle to summarise the transcripts of company meetings, because they are sprinkled with internal vocabulary it has never seen.

🔹So finetuning is more about changing behaviour (structure, style) than knowledge, while it's the other way around for RAG.

🔸You use RAG when you want to generate outputs grounded in a custom knowledge base while the vocabulary & writing style of the LLM stay the same.

🔹If you don't need either of them, prompt engineering is the way to go.

🔸And if your application needs both custom knowledge & a change in the model's behaviour, a hybrid approach (RAG + finetuning) is preferred, as the sketch below captures.
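
In fact, the whole decision fits in a few lines. Here are the two guiding parameters as a tiny helper function:

```python
# The two guiding parameters as a tiny decision helper.
def choose_approach(needs_external_knowledge: bool,
                    needs_model_adaptation: bool) -> str:
    if needs_external_knowledge and needs_model_adaptation:
        return "Hybrid: RAG + finetuning"
    if needs_external_knowledge:
        return "RAG"
    if needs_model_adaptation:
        return "Finetuning"
    return "Prompt engineering"

print(choose_approach(True, False))   # -> "RAG"
print(choose_approach(False, True))   # -> "Finetuning"
```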

Read my hands-on guide to finetuning using LoRA here!

A big shout-out to AbacusAI for supporting my work.

They are hosting a FREE webinar on Retrieval APIs, Chat LLMs & AI Agents: Build LLM Apps at Scale

The image below captures the essence of what we discussed so far!

Prompting Vs RAGs Vs Finetuning

Thanks for reading! 🥂