	links
@@ -15,12 +15,9 @@ implementations.

#### ✨ [Transformers](transformers/index.html)

[Transformers module](transformers/index.html)
contains implementations for
[multi-headed attention](transformers/mha.html)
and
[relative multi-headed attention](transformers/relative_mha.html).

* [Multi-headed attention](transformers/mha.html)
* [Transformer building blocks](transformers/models.html)
* [Relative multi-headed attention](transformers/xl/relative_mha.html)
* [GPT Architecture](transformers/gpt/index.html)
* [GLU Variants](transformers/glu_variants/simple.html)
* [kNN-LM: Generalization through Memorization](transformers/knn/index.html)

@@ -14,7 +14,7 @@ from paper [Attention Is All You Need](https://arxiv.org/abs/1706.03762),
and derivatives and enhancements of it.

* [Multi-head attention](mha.html)
* [Relative multi-head attention](relative_mha.html)
* [Relative multi-head attention](xl/relative_mha.html)
* [Transformer Encoder and Decoder Models](models.html)
* [Fixed positional encoding](positional_encoding.html)

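For context, the multi-head attention page these links point to covers the standard mechanism from Attention Is All You Need. Below is a minimal PyTorch sketch of that technique only; it is not the repository's annotated implementation, and the names (`MultiHeadAttention`, `d_model`, `n_heads`) are illustrative rather than taken from the linked modules.

```python
# Minimal sketch of standard multi-head self-attention
# (scaled dot-product attention over several heads).
import math

import torch
import torch.nn as nn


class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_k = d_model // n_heads  # per-head query/key/value size
        self.n_heads = n_heads
        # Linear projections for queries, keys, values, and the output
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq_len, d_model]
        batch, seq_len, _ = x.shape
        # Project and split into heads: [batch, n_heads, seq_len, d_k]
        q = self.w_q(x).view(batch, seq_len, self.n_heads, self.d_k).transpose(1, 2)
        k = self.w_k(x).view(batch, seq_len, self.n_heads, self.d_k).transpose(1, 2)
        v = self.w_v(x).view(batch, seq_len, self.n_heads, self.d_k).transpose(1, 2)
        # Scaled dot-product attention
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        attn = scores.softmax(dim=-1)
        out = attn @ v  # [batch, n_heads, seq_len, d_k]
        # Merge heads back to [batch, seq_len, d_model] and project
        out = out.transpose(1, 2).contiguous().view(batch, seq_len, -1)
        return self.w_o(out)


# Example: self-attention over 2 sequences of length 5 with d_model=64
mha = MultiHeadAttention(d_model=64, n_heads=8)
y = mha(torch.randn(2, 5, 64))
print(y.shape)  # torch.Size([2, 5, 64])
```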
Varuna Jayasiri