diff --git a/labml_nn/__init__.py b/labml_nn/__init__.py
index 50083ee9..036964e9 100644
--- a/labml_nn/__init__.py
+++ b/labml_nn/__init__.py
@@ -21,6 +21,7 @@ contains implementations for
 and [relative multi-headed attention](https://lab-ml.com/labml_nn/transformers/relative_mha.html).
 
+* [GPT Architecture](https://lab-ml.com/labml_nn/transformers/gpt)
 * [kNN-LM: Generalization through Memorization](https://lab-ml.com/labml_nn/transformers/knn)
 * [Feedback Transformer](https://lab-ml.com/labml_nn/transformers/feedback)
diff --git a/labml_nn/transformers/__init__.py b/labml_nn/transformers/__init__.py
index 4d58bf42..1b63a55b 100644
--- a/labml_nn/transformers/__init__.py
+++ b/labml_nn/transformers/__init__.py
@@ -18,6 +18,10 @@ and derivatives and enhancements of it.
 * [Transformer Encoder and Decoder Models](models.html)
 * [Fixed positional encoding](positional_encoding.html)
 
+## [GPT Architecture](gpt)
+
+This is an implementation of the GPT-2 architecture.
+
 ## [kNN-LM](knn)
 
 This is an implementation of the paper
diff --git a/readme.md b/readme.md
index c6c9d2b2..c9ae760e 100644
--- a/readme.md
+++ b/readme.md
@@ -27,6 +27,7 @@ contains implementations for
 and [relative multi-headed attention](https://lab-ml.com/labml_nn/transformers/relative_mha.html).
 
+* [GPT Architecture](https://lab-ml.com/labml_nn/transformers/gpt)
 * [kNN-LM: Generalization through Memorization](https://lab-ml.com/labml_nn/transformers/knn)
 * [Feedback Transformer](https://lab-ml.com/labml_nn/transformers/feedback)