
RAGs-101: An introduction to llamaindex

Don't let your LLMs hallucinate! 🚀

LLMs have taken the world by storm: they are powerful, versatile, and have applications in a wide variety of fields.

However, there's a big downside: hallucination. LLMs can produce wrong outputs and make up facts, and their knowledge is limited to the data they were trained on.

This is where Retrieval Augmented Generation (RAG) becomes important.

In simple words, RAG is a technique that grounds your LLM so that it answers your queries based on a custom knowledge base that you provide.

Now, multiple questions might arise:

  • What is the meaning of a custom knowledge base?

  • What kind of data can it have?

  • How is this data stored and fed to an LLM?

  • And what does it take to build such a system?

So without any further ado, let’s understand a typical RAG architecture:

RAG Architecture

We will go through these components one by one and understand the importance of each with an example.

We will use LlamaIndex to build our custom knowledge base and then query it using an LLM (you can choose any). LlamaIndex is a simple, flexible data framework for connecting custom data sources to LLMs.

Make sure you have your OpenAI API key and llamaindex installed for this tutorial.
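If you haven't set these up yet, the installation typically looks something like this (package names assumed; faiss-cpu is needed for the FAISS vector store we'll use later):

pip install llama-index python-dotenv faiss-cpu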

Make sure your API key is stored in a .env file and is accessible as an environment variable. Then create another file, starter.py, and add the following code to it:

import os
from dotenv import load_dotenv

# Load the .env file
load_dotenv()

# Retrieve the OPENAI_API_KEY
openai_api_key = os.getenv('OPENAI_API_KEY')

For this tutorial I will be grounding the LLM on my resume (custom knowledge).
Your directory structure should look like this:

├── starter.py
└── data
    └── your_resume.txt

llamaindex has an inbuilt data loader, SimpleDirectoryReader(), for documents stored in a given directory. This is how it works:

from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
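load_data() returns a list of Document objects (for a plain .txt file, one per file). A quick sanity check confirms the resume was picked up:

print(f"Loaded {len(documents)} document(s)")
print(documents[0].text[:200])  # peek at the first few characters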

The next step involves ingesting, indexing, and storing this document in a vector database.

For this purpose, we will use FAISS, an open-source (and free) library that gives us an in-memory vector store.

FAISS is convenient to use and well suited to our needs, so we will wrap this step in a function called get_vector_idx().

Check this:

from llama_index.llms import OpenAI
from llama_index import VectorStoreIndex, ServiceContext, StorageContext
from llama_index.vector_stores import FaissVectorStore
import faiss

def get_vector_idx(documents):
    # LLM used to synthesize answers at query time
    llm = OpenAI(model="gpt-3.5-turbo", api_key=openai_api_key)

    # FAISS needs the embedding dimension up front; 1536 matches
    # OpenAI's default text-embedding-ada-002 embeddings
    faiss_index = faiss.IndexFlatL2(1536)
    vector_store = FaissVectorStore(faiss_index=faiss_index)

    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    service_context = ServiceContext.from_defaults(llm=llm)

    # Embed the documents and store the vectors in FAISS
    index = VectorStoreIndex.from_documents(
        documents,
        service_context=service_context,
        storage_context=storage_context,
    )

    return index


index = get_vector_idx(documents)

This builds an index over the documents in the data folder (which in this case just consists of a text file that is based on my resume, but could contain many documents).

StorageContext() is used to define the vector database where the embeddings for your custom knowledge base are stored. ServiceContext() is used to define the models the index works with: the embedding model that turns the raw text into vectors (OpenAI's default embedding model here) and the LLM that answers your queries. In this case we have used FAISS as our vector DB and GPT-3.5-turbo as our LLM.
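If you want to be explicit about which embedding model is used rather than relying on the default, ServiceContext accepts an embed_model as well. A minimal sketch, assuming OpenAI's ada-002 embeddings (the same default LlamaIndex falls back to):

from llama_index.embeddings import OpenAIEmbedding

# Explicitly choose the embedding model instead of relying on the default;
# OpenAIEmbedding() reads OPENAI_API_KEY from the environment (set by load_dotenv)
service_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-3.5-turbo", api_key=openai_api_key),
    embed_model=OpenAIEmbedding(),  # text-embedding-ada-002 by default
)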

Now that we have the data indexed, it’s fairly easy to query it. Let’s see how it’s done:

query_engine = index.as_query_engine()
response = query_engine.query("Where did Akshay go for college and what did he study?")
print(response)

This is what it prints:

Akshay Pachaar attended Birla Institute of Technology & Science, Pilani for his college education. He studied M.Sc. (Hons) Mathematics & B.E. (Hons) Electrical and Electronics.
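One practical note: as written, starter.py re-embeds the resume on every run. The index can be persisted to disk so the embedding step only happens once; here's a minimal sketch (the ./storage directory name is just an example):

# Persist the index (vectors + metadata) so we don't re-embed on every run
index.storage_context.persist(persist_dir="./storage")

# On a later run, rebuild the index from disk instead of calling get_vector_idx()
from llama_index import load_index_from_storage

vector_store = FaissVectorStore.from_persist_dir("./storage")
storage_context = StorageContext.from_defaults(
    vector_store=vector_store, persist_dir="./storage"
)
index = load_index_from_storage(storage_context=storage_context)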

That’s all! This is what a typical RAG application looks like; the idea is to always keep things simple. In the next tutorial we will ground our LLM on a GitHub repository of the best Kaggle solutions & tricks, and create a Streamlit app on top of it.

See you next time!
