diff --git a/labml_nn/__init__.py b/labml_nn/__init__.py
index 50083ee9..036964e9 100644
--- a/labml_nn/__init__.py
+++ b/labml_nn/__init__.py
@@ -21,6 +21,7 @@ contains implementations for
 and [relative multi-headed attention](https://lab-ml.com/labml_nn/transformers/relative_mha.html).
 
+* [GPT Architecture](https://lab-ml.com/labml_nn/transformers/gpt)
 * [kNN-LM: Generalization through Memorization](https://lab-ml.com/labml_nn/transformers/knn)
 * [Feedback Transformer](https://lab-ml.com/labml_nn/transformers/feedback)
diff --git a/labml_nn/transformers/__init__.py b/labml_nn/transformers/__init__.py
index 4d58bf42..1b63a55b 100644
--- a/labml_nn/transformers/__init__.py
+++ b/labml_nn/transformers/__init__.py
@@ -18,6 +18,10 @@ and derivatives and enhancements of it.
 * [Transformer Encoder and Decoder Models](models.html)
 * [Fixed positional encoding](positional_encoding.html)
 
+## [GPT Architecture](gpt)
+
+This is an implementation of the GPT-2 architecture.
+
 ## [kNN-LM](knn)
 
 This is an implementation of the paper
diff --git a/readme.md b/readme.md
index c6c9d2b2..c9ae760e 100644
--- a/readme.md
+++ b/readme.md
@@ -27,6 +27,7 @@ contains implementations for
 and [relative multi-headed attention](https://lab-ml.com/labml_nn/transformers/relative_mha.html).
 
+* [GPT Architecture](https://lab-ml.com/labml_nn/transformers/gpt)
 * [kNN-LM: Generalization through Memorization](https://lab-ml.com/labml_nn/transformers/knn)
 * [Feedback Transformer](https://lab-ml.com/labml_nn/transformers/feedback)