Attention is all you need: Self Attention Clearly explained!
An illustrated guide! ✍️ Author: @akshay_pachaar
In 2017, a groundbreaking paper titled "Attention is All You Need" introduced the transformer architecture, which led to the Large Language Model (LLM) revolution we witness today.
At the heart of this architecture lies the attention mechanism.
In this post, I'll clearly explain self-attention & how it can be thought of as a directed graph.
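To preview that graph view, here's a minimal single-head sketch (the random projection matrices and toy sizes are illustrative assumptions, not the post's code): entry `weights[i, j]` acts as the weight of the directed edge carrying information from token j to token i, so the tokens form a fully connected directed graph.

```python
# A minimal single-head self-attention sketch (no masking, toy sizes).
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n_tokens, d_model = 5, 8            # illustrative sizes, not from the post
x = torch.randn(n_tokens, d_model)  # one embedding vector per token

# Project embeddings into queries, keys, and values (random weights here;
# in a real model these matrices are learned).
W_q, W_k, W_v = (torch.randn(d_model, d_model) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

# Scaled dot-product attention: each row of `weights` sums to 1.
scores = Q @ K.T / d_model ** 0.5
weights = F.softmax(scores, dim=-1)

# Graph view: weights[i, j] is the strength of the directed edge
# from token j into token i.
output = weights @ V
print(weights.shape, output.shape)  # torch.Size([5, 5]) torch.Size([5, 8])
```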
Before we start, a quick primer on tokenization!
Raw text → Tokenization → Embedding → Model
An embedding is a meaningful representation of each token (roughly a word) as a bunch of numbers.
This embedding is what we provide as an input to our language models.
Check this 👇
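Since the original illustration isn't reproduced here, here's a minimal PyTorch sketch of the same pipeline (the toy sentence, whole-word tokenizer, and embedding size are illustrative assumptions):

```python
# Raw text -> Tokenization -> Embedding, end to end on a toy example.
import torch
import torch.nn as nn

sentence = "attention is all you need"

# Tokenization: split text into tokens and map each to an integer id.
# Whole words keep this simple; real tokenizers work on subwords.
vocab = {word: idx for idx, word in enumerate(sorted(set(sentence.split())))}
token_ids = torch.tensor([vocab[word] for word in sentence.split()])
print(token_ids)  # tensor([1, 2, 0, 4, 3])

# Embedding: look up a vector of numbers for each token id.
embed_dim = 8  # illustrative; real models use hundreds or thousands of dims
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=embed_dim)
embedded = embedding(token_ids)
print(embedded.shape)  # torch.Size([5, 8]) -- one 8-dim vector per token
```

In real models the tokenizer is subword-based (e.g. BPE) and the embedding table is learned jointly with the rest of the network.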