Mirror of https://github.com/labmlai/annotated_deep_learning_paper_implementations.git, synced 2025-11-01 03:43:09 +08:00
	fix \u
@@ -65,7 +65,7 @@ class MultiHeadAttention(Module):
 
         This computes scaled multi-headed attention for given `query`, `key` and `value` vectors.
 
-        $$Attention(Q, K, V) = \underset{seq}{softmax}\Bigg(\frac{Q K^T}{\sqrt{d_k}}\Bigg)V$$
+        $$Attention(Q, K, V) = \\underset{seq}{softmax}\Bigg(\frac{Q K^T}{\sqrt{d_k}}\Bigg)V$$
 
         In simple terms, it finds keys that matches the query, and get the values of
          those keys.
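For context on why the escape is needed: in Python source, a `\u` inside a regular string literal must begin a `\uXXXX` unicode escape, so a docstring containing `\underset` fails to parse with a SyntaxError. Below is a minimal sketch of the problem and the two standard remedies, escaping the backslash (as this commit does) or using a raw string; the variable names are illustrative, not from the repository.

# In a regular Python string literal, "\u" must start a \uXXXX unicode escape,
# so this line would not even compile:
#   doc = "$$\underset{seq}{softmax}$$"   # SyntaxError: truncated \uXXXX escape
# Two standard fixes:
escaped = "$$\\underset{seq}{softmax}$$"  # escape the backslash (the commit's approach)
raw = r"$$\underset{seq}{softmax}$$"      # or use a raw string literal
assert escaped == raw                     # both yield: $$\underset{seq}{softmax}$$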
Varuna Jayasiri