diff --git a/docs/recurrent_highway_networks/index.html b/docs/recurrent_highway_networks/index.html
index a76784eb..19c22847 100644
--- a/docs/recurrent_highway_networks/index.html
+++ b/docs/recurrent_highway_networks/index.html
@@ -109,10 +109,10 @@ c_d^t &= \sigma(lin_{cs}^d(s_d^t))
 </p>
 <p>$\odot$ stands for element-wise multiplication.</p>
 <p>Here we have made a couple of changes to notations from the paper.
-To avoid confusion with time, the gate is represented with $g$,
+To avoid confusion with time, the gate is represented by $g$,
 which was $t$ in the paper.
 To avoid confusion with multiple layers we use $d$ for depth and $D$ for
-total depth instead of $l$ and $L$ from paper.</p>
+total depth instead of $l$ and $L$ from the paper.</p>
 <p>We have also replaced the weight matrices and bias vectors from the equations with
 linear transforms, because that&rsquo;s how the implementation is going to look like.</p>
 <p>We implement weight tying, as described in paper, $c_d^t = 1 - g_d^t$.</p>
@@ -127,7 +127,7 @@ linear transforms, because that&rsquo;s how the implementation is going to look
                     <a href='#section-2'>#</a>
                 </div>
                 <p><code>input_size</code> is the feature length of the input and <code>hidden_size</code> is
-feature length of the cell.
+the feature length of the cell.
 <code>depth</code> is $D$.</p>
             </div>
             <div class='code'>
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index c8d5eea3..0eee6242 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -491,7 +491,7 @@
 
     <url>
       <loc>https://nn.labml.ai/recurrent_highway_networks/index.html</loc>
-      <lastmod>2021-02-08T16:30:00+00:00</lastmod>
+      <lastmod>2021-02-11T16:30:00+00:00</lastmod>
       <priority>1.00</priority>
     </url>