Varuna Jayasiri
2022-03-12 15:51:10 +05:30
parent a7a7a3bdb7
commit 1536c6ec5e
8 changed files with 25 additions and 17 deletions

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@@ -78,6 +78,7 @@
<li><a href="transformers/xl/index.html">Transformer XL</a> </li>
<li><a href="transformers/xl/relative_mha.html">Relative multi-headed attention</a> </li>
<li><a href="transformers/rope/index.html">Rotary Positional Embeddings</a> </li>
<li><a href="transformers/retro/index.html">RETRO</a> </li>
<li><a href="transformers/compressive/index.html">Compressive Transformer</a> </li>
<li><a href="transformers/gpt/index.html">GPT Architecture</a> </li>
<li><a href="transformers/glu_variants/simple.html">GLU Variants</a> </li>


@@ -211,7 +211,7 @@
<url>
<loc>https://nn.labml.ai/experiments/nlp_autoregression.html</loc>
<lastmod>2022-03-06T16:30:00+00:00</lastmod>
<lastmod>2022-03-12T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>
@@ -267,7 +267,7 @@
<url>
<loc>https://nn.labml.ai/distillation/small.html</loc>
<lastmod>2022-03-06T16:30:00+00:00</lastmod>
<lastmod>2022-03-12T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>
@@ -582,7 +582,7 @@
<url>
<loc>https://nn.labml.ai/transformers/rope/experiment.html</loc>
<lastmod>2022-03-06T16:30:00+00:00</lastmod>
<lastmod>2022-03-12T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>
@@ -596,7 +596,7 @@
<url>
<loc>https://nn.labml.ai/transformers/basic/autoregressive_experiment.html</loc>
<lastmod>2022-03-06T16:30:00+00:00</lastmod>
<lastmod>2022-03-12T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>
@@ -722,7 +722,7 @@
<url>
<loc>https://nn.labml.ai/transformers/retro/index.html</loc>
<lastmod>2022-03-10T16:30:00+00:00</lastmod>
<lastmod>2022-03-12T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>
@@ -883,7 +883,7 @@
<url>
<loc>https://nn.labml.ai/graphs/gat/index.html</loc>
<lastmod>2021-08-19T16:30:00+00:00</lastmod>
<lastmod>2022-03-12T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>
@@ -897,7 +897,7 @@
<url>
<loc>https://nn.labml.ai/graphs/gatv2/index.html</loc>
<lastmod>2021-08-19T16:30:00+00:00</lastmod>
<lastmod>2022-03-12T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>


@@ -78,6 +78,8 @@
<p>This implements Transformer XL model using <a href="xl/relative_mha.html">relative multi-head attention</a></p>
<h2><a href="rope/index.html">Rotary Positional Embeddings</a></h2>
<p>This implements Rotary Positional Embeddings (RoPE)</p>
<h2><a href="retro/index.html">RETRO</a></h2>
<p>This implements the Retrieval-Enhanced Transformer (RETRO).</p>
<h2><a href="compressive/index.html">Compressive Transformer</a></h2>
<p>This is an implementation of the compressive transformer, which extends <a href="xl/index.html">Transformer XL</a> by compressing the oldest memories to give a longer attention span.</p>
<h2><a href="gpt/index.html">GPT Architecture</a></h2>
@@ -111,10 +113,10 @@
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">106</span><span></span><span class="kn">from</span> <span class="nn">.configs</span> <span class="kn">import</span> <span class="n">TransformerConfigs</span>
<span class="lineno">107</span><span class="kn">from</span> <span class="nn">.models</span> <span class="kn">import</span> <span class="n">TransformerLayer</span><span class="p">,</span> <span class="n">Encoder</span><span class="p">,</span> <span class="n">Decoder</span><span class="p">,</span> <span class="n">Generator</span><span class="p">,</span> <span class="n">EncoderDecoder</span>
<span class="lineno">108</span><span class="kn">from</span> <span class="nn">.mha</span> <span class="kn">import</span> <span class="n">MultiHeadAttention</span>
<span class="lineno">109</span><span class="kn">from</span> <span class="nn">labml_nn.transformers.xl.relative_mha</span> <span class="kn">import</span> <span class="n">RelativeMultiHeadAttention</span></pre></div>
<div class="highlight"><pre><span class="lineno">109</span><span></span><span class="kn">from</span> <span class="nn">.configs</span> <span class="kn">import</span> <span class="n">TransformerConfigs</span>
<span class="lineno">110</span><span class="kn">from</span> <span class="nn">.models</span> <span class="kn">import</span> <span class="n">TransformerLayer</span><span class="p">,</span> <span class="n">Encoder</span><span class="p">,</span> <span class="n">Decoder</span><span class="p">,</span> <span class="n">Generator</span><span class="p">,</span> <span class="n">EncoderDecoder</span>
<span class="lineno">111</span><span class="kn">from</span> <span class="nn">.mha</span> <span class="kn">import</span> <span class="n">MultiHeadAttention</span>
<span class="lineno">112</span><span class="kn">from</span> <span class="nn">labml_nn.transformers.xl.relative_mha</span> <span class="kn">import</span> <span class="n">RelativeMultiHeadAttention</span></pre></div>
</div>
</div>
<div class='footer'>


@@ -23,6 +23,7 @@ implementations.
* [Transformer XL](transformers/xl/index.html)
* [Relative multi-headed attention](transformers/xl/relative_mha.html)
* [Rotary Positional Embeddings](transformers/rope/index.html)
* [RETRO](transformers/retro/index.html)
* [Compressive Transformer](transformers/compressive/index.html)
* [GPT Architecture](transformers/gpt/index.html)
* [GLU Variants](transformers/glu_variants/simple.html)


@@ -25,6 +25,9 @@ This implements Transformer XL model using
## [Rotary Positional Embeddings](rope/index.html)
This implements Rotary Positional Embeddings (RoPE)
## [RETRO](retro/index.html)
This implements the Retrieval-Enhanced Transformer (RETRO).
## [Compressive Transformer](compressive/index.html)
This is an implementation of compressive transformer


@@ -25,6 +25,7 @@ implementations almost weekly.
* [Transformer XL](https://nn.labml.ai/transformers/xl/index.html)
* [Relative multi-headed attention](https://nn.labml.ai/transformers/xl/relative_mha.html)
* [Rotary Positional Embeddings](https://nn.labml.ai/transformers/rope/index.html)
* [RETRO](https://nn.labml.ai/transformers/retro/index.html)
* [Compressive Transformer](https://nn.labml.ai/transformers/compressive/index.html)
* [GPT Architecture](https://nn.labml.ai/transformers/gpt/index.html)
* [GLU Variants](https://nn.labml.ai/transformers/glu_variants/simple.html)