Varuna Jayasiri
2021-01-29 15:15:44 +05:30
parent a0fa963c60
commit 3161c23592
2 changed files with 2 additions and 2 deletions

View File

@@ -90,7 +90,7 @@ Instead it keeps weighted sum of the output of all layers.
This reduces the memory used for caching during prediction.
The first half of this file implements this.</p>
<p>The updated feedback transformer shares weights $W^l_k$ and $W^l_v$ used
-to calculate keys and values for among the layers.
+to calculate keys and values among the layers.
We then calculate the keys and values for each step only once and keep
them cached.
The <a href="#shared_kv">second half</a> of this file implements this.

View File

@@ -28,7 +28,7 @@ This reduces the memory used for caching during prediction.
The first half of this file implements this.
The updated feedback transformer shares weights $W^l_k$ and $W^l_v$ used
-to calculate keys and values for among the layers.
+to calculate keys and values among the layers.
We then calculate the keys and values for each step only once and keep
them cached.
The [second half](#shared_kv) of this file implements this.
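
The shared-weights idea described above can be sketched roughly as follows. This is an illustrative PyTorch sketch, not the repo's actual implementation: the module names, shapes, and the `step` method are assumptions. The point it shows is that when all layers share one key projection and one value projection (standing in for $W^l_k$ and $W^l_v$), the keys and values for each new step can be computed once and cached, instead of once per layer.

```python
import torch
import torch.nn as nn


class SharedKVCache(nn.Module):
    """Illustrative sketch of shared key/value projections with a step cache.

    All layers reuse the same key_proj and value_proj (hypothetical names),
    so each step's keys and values are computed a single time.
    """

    def __init__(self, d_model: int):
        super().__init__()
        # Single shared projections, standing in for W^l_k and W^l_v
        self.key_proj = nn.Linear(d_model, d_model)
        self.value_proj = nn.Linear(d_model, d_model)
        # Cache of per-step keys and values, grown one entry per step
        self.key_cache: list = []
        self.value_cache: list = []

    def step(self, x: torch.Tensor):
        """Compute keys/values for the new step `x` of shape (batch, d_model)
        once, append to the cache, and return all cached steps stacked as
        tensors of shape (steps, batch, d_model) for every layer to reuse."""
        self.key_cache.append(self.key_proj(x))
        self.value_cache.append(self.value_proj(x))
        return torch.stack(self.key_cache), torch.stack(self.value_cache)
```

With per-layer weights, a model with $L$ layers would run $2L$ projections per step; sharing the weights reduces this to two projections per step and lets a single cache serve all layers, which is the memory saving the text refers to.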