Mirror of https://github.com/labmlai/annotated_deep_learning_paper_implementations.git, synced 2025-10-31 18:58:43 +08:00.
Commit: repo name
Author: Varuna Jayasiri
@@ -47,7 +47,7 @@ This is supposed to be more stable in standard transformer setups.
 Here are [the training code](experiment.html) and a notebook for training a compressive transformer
 model on the Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/compressive/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/compressive/experiment.ipynb)
 [](https://app.labml.ai/run/0d9b5338726c11ebb7c80242ac1c0002)
 """
 

@@ -20,8 +20,8 @@
          "id": "AYV_dMVDxyc2"
        },
        "source": [
-        "[](https://github.com/lab-ml/nn)\n",
-        "[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/compressive/experiment.ipynb)                    \n",
+        "[](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n",
+        "[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/compressive/experiment.ipynb)                    \n",
          "\n",
          "## Compressive Transformer\n",
          "\n",

@@ -39,5 +39,5 @@ This is supposed to be more stable in standard transformer setups.
 Here are [the training code](https://nn.labml.ai/transformers/compressive/experiment.html) and a notebook for training a compressive transformer
 model on the Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/compressive/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/compressive/experiment.ipynb)
 [](https://app.labml.ai/run/0d9b5338726c11ebb7c80242ac1c0002)

@@ -88,7 +88,7 @@ $\frac{1}{z^{(i)} \cdot \color{lightgreen}{\phi(q^{(i)})}}$
 Here are [the training code](experiment.html) and a notebook for training a fast weights
  transformer on the Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)
 [](https://app.labml.ai/run/928aadc0846c11eb85710242ac1c0002)
 """
 

@@ -20,8 +20,8 @@
          "id": "AYV_dMVDxyc2"
        },
        "source": [
-        "[](https://github.com/lab-ml/nn)\n",
-        "[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)                    \n",
+        "[](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n",
+        "[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)                    \n",
          "\n",
          "## Fast Weights Transformer\n",
          "\n",

@@ -10,7 +10,7 @@ This trains a fast weights transformer model for auto-regression.
 
 Here’s a Colab notebook for training a fast weights transformer on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)
 [](https://app.labml.ai/run/928aadc0846c11eb85710242ac1c0002)
 """
 

@@ -7,5 +7,5 @@ Here is the [annotated implementation](https://nn.labml.ai/transformers/fast_wei
 Here are [the training code](https://nn.labml.ai/transformers/fast_weights/experiment.html)
 and a notebook for training a fast weights transformer on the Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/fast_weights/experiment.ipynb)
 [](https://app.labml.ai/run/928aadc0846c11eb85710242ac1c0002)

@@ -36,7 +36,7 @@ We implemented a custom PyTorch function to improve performance.
 
 Here's [the training code](experiment.html) and a notebook for training a feedback transformer on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
 [](https://app.labml.ai/run/d8eb9416530a11eb8fb50242ac1c0002)
 """
 

@@ -6,8 +6,8 @@
     "id": "AYV_dMVDxyc2"
    },
    "source": [
-    "[](https://github.com/lab-ml/nn)\n",
-    "[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/feedback/experiment.ipynb)                    \n",
+    "[](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n",
+    "[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/feedback/experiment.ipynb)                    \n",
     "\n",
     "## Feedback Transformer\n",
     "\n",

@@ -12,7 +12,7 @@ where the keys and values are precalculated.
 
 Here's a Colab notebook for training a feedback transformer on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
 [](https://app.labml.ai/run/d8eb9416530a11eb8fb50242ac1c0002)
 """
 

@@ -29,7 +29,7 @@ We implemented a custom PyTorch function to improve performance.
 
 Here's [the training code](experiment.html) and a notebook for training a feedback transformer on Tiny Shakespeare dataset.
 
-[Colab Notebook](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
+[Colab Notebook](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/feedback/experiment.ipynb)
 [](https://app.labml.ai/run/d8eb9416530a11eb8fb50242ac1c0002)

@@ -21,8 +21,8 @@
     "id": "AYV_dMVDxyc2"
    },
    "source": [
-    "[](https://github.com/lab-ml/nn)\n",
-    "[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/glu_variants/simple.ipynb)                    \n",
+    "[](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n",
+    "[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/glu_variants/simple.ipynb)                    \n",
     "\n",
     "## Gated Linear Units and Variants\n",
     "\n",

@@ -14,7 +14,7 @@ We try different variants for the [position-wise feedforward network](../feed_fo
 *This is a simpler implementation that doesn't use [`labml.configs`](experiment.html) module.
 We decided to write a simpler implementation to make it easier for readers who are not familiar.*
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/glu_variants/simple.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/glu_variants/simple.ipynb)
 [](https://app.labml.ai/run/86b773f65fc911ebb2ac0242ac1c0002)
 """
 import dataclasses

@@ -28,7 +28,7 @@ For the transformer we reuse the
 
 Here's a notebook for training a GPT model on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/gpt/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/gpt/experiment.ipynb)
 [](https://app.labml.ai/run/0324c6d0562111eba65d0242ac1c0002)
 """
 

@@ -20,8 +20,8 @@
     "id": "AYV_dMVDxyc2"
    },
    "source": [
-    "[](https://github.com/lab-ml/nn)\n",
-    "[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/gpt/experiment.ipynb)                    \n",
+    "[](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n",
+    "[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/gpt/experiment.ipynb)                    \n",
     "\n",
     "## Training a model with GPT architecture\n",
     "\n",

@@ -33,7 +33,7 @@ discusses dropping tokens when routing is not balanced.
 
 Here's [the training code](experiment.html) and a notebook for training a switch transformer on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/switch/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/switch/experiment.ipynb)
 [](https://app.labml.ai/run/c4656c605b9311eba13d0242ac1c0002)
 """
 

@@ -20,8 +20,8 @@
          "id": "AYV_dMVDxyc2"
        },
        "source": [
-        "[](https://github.com/lab-ml/nn)\n",
-        "[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/switch/experiment.ipynb)                    \n",
+        "[](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n",
+        "[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/switch/experiment.ipynb)                    \n",
          "\n",
          "## Switch Transformer\n",
          "\n",

@@ -26,5 +26,5 @@ discusses dropping tokens when routing is not balanced.
 
 Here's [the training code](experiment.html) and a notebook for training a switch transformer on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/switch/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/switch/experiment.ipynb)
 [](https://app.labml.ai/run/c4656c605b9311eba13d0242ac1c0002)

@@ -28,7 +28,7 @@ Annotated implementation of relative multi-headed attention is in [`relative_mha
 
 Here's [the training code](experiment.html) and a notebook for training a transformer XL model on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/xl/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/xl/experiment.ipynb)
 [](https://app.labml.ai/run/d3b6760c692e11ebb6a70242ac1c0002)
 """
 

@@ -20,8 +20,8 @@
          "id": "AYV_dMVDxyc2"
        },
        "source": [
-        "[](https://github.com/lab-ml/nn)\n",
-        "[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/xl/experiment.ipynb)                    \n",
+        "[](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n",
+        "[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/xl/experiment.ipynb)                    \n",
          "\n",
          "## Transformer XL\n",
          "\n",

@@ -20,5 +20,5 @@ Annotated implementation of relative multi-headed attention is in [`relative_mha
 
 Here's [the training code](https://nn.labml.ai/transformers/xl/experiment.html) and a notebook for training a transformer XL model on Tiny Shakespeare dataset.
 
-[](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/transformers/xl/experiment.ipynb)
+[](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/transformers/xl/experiment.ipynb)
 [](https://app.labml.ai/run/d3b6760c692e11ebb6a70242ac1c0002)