Varuna Jayasiri
2022-03-12 15:51:10 +05:30
parent a7a7a3bdb7
commit 1536c6ec5e
8 changed files with 25 additions and 17 deletions

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -78,6 +78,7 @@
 <li><a href="transformers/xl/index.html">Transformer XL</a> </li>
 <li><a href="transformers/xl/relative_mha.html">Relative multi-headed attention</a> </li>
 <li><a href="transformers/rope/index.html">Rotary Positional Embeddings</a> </li>
+<li><a href="transformers/retro/index.html">RETRO</a> </li>
 <li><a href="transformers/compressive/index.html">Compressive Transformer</a> </li>
 <li><a href="transformers/gpt/index.html">GPT Architecture</a> </li>
 <li><a href="transformers/glu_variants/simple.html">GLU Variants</a> </li>

View File

@@ -211,7 +211,7 @@
 <url>
 <loc>https://nn.labml.ai/experiments/nlp_autoregression.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
@@ -267,7 +267,7 @@
 <url>
 <loc>https://nn.labml.ai/distillation/small.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
@@ -582,7 +582,7 @@
 <url>
 <loc>https://nn.labml.ai/transformers/rope/experiment.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
@@ -596,7 +596,7 @@
 <url>
 <loc>https://nn.labml.ai/transformers/basic/autoregressive_experiment.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
@@ -722,7 +722,7 @@
 <url>
 <loc>https://nn.labml.ai/transformers/retro/index.html</loc>
-<lastmod>2022-03-10T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
@@ -883,7 +883,7 @@
 <url>
 <loc>https://nn.labml.ai/graphs/gat/index.html</loc>
-<lastmod>2021-08-19T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
@@ -897,7 +897,7 @@
 <url>
 <loc>https://nn.labml.ai/graphs/gatv2/index.html</loc>
-<lastmod>2021-08-19T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
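Every `<lastmod>` touched in these hunks is bumped to the same timestamp, 2022-03-12T16:30:00+00:00, which suggests the sitemap is regenerated in a single pass. A minimal sketch of such a generator is below; the helper name, page list, and single shared rebuild timestamp are illustrative assumptions, not part of this commit or the site's actual tooling.

```python
# Hypothetical sketch: emit sitemap <url> entries in the shape shown in the
# hunks above, stamping every page with one shared regeneration time.
# Not the actual generator used by nn.labml.ai.
from datetime import datetime, timezone

def url_entry(loc: str, lastmod: datetime, priority: float = 1.0) -> str:
    # Mirrors the <url>/<loc>/<lastmod>/<priority> layout in the diff above.
    stamp = lastmod.strftime("%Y-%m-%dT%H:%M:%S+00:00")
    return (
        f"<url>\n"
        f"<loc>{loc}</loc>\n"
        f"<lastmod>{stamp}</lastmod>\n"
        f"<priority>{priority:.2f}</priority>\n"
        f"</url>"
    )

pages = [
    "https://nn.labml.ai/transformers/retro/index.html",
    "https://nn.labml.ai/graphs/gat/index.html",
]
rebuilt_at = datetime(2022, 3, 12, 16, 30, tzinfo=timezone.utc)
print("\n".join(url_entry(p, rebuilt_at) for p in pages))
```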

View File

@@ -78,6 +78,8 @@
 <p>This implements Transformer XL model using <a href="xl/relative_mha.html">relative multi-head attention</a></p>
 <h2><a href="rope/index.html">Rotary Positional Embeddings</a></h2>
 <p>This implements Rotary Positional Embeddings (RoPE)</p>
+<h2><a href="retro/index.html">RETRO</a></h2>
+<p>This implements the Retrieval-Enhanced Transformer (RETRO).</p>
 <h2><a href="compressive/index.html">Compressive Transformer</a></h2>
 <p>This is an implementation of compressive transformer that extends upon <a href="xl/index.html">Transformer XL</a> by compressing oldest memories to give a longer attention span.</p>
 <h2><a href="gpt/index.html">GPT Architecture</a></h2>
@@ -111,10 +113,10 @@
 </div>
 <div class='code'>
-<div class="highlight"><pre><span class="lineno">106</span><span></span><span class="kn">from</span> <span class="nn">.configs</span> <span class="kn">import</span> <span class="n">TransformerConfigs</span>
-<span class="lineno">107</span><span class="kn">from</span> <span class="nn">.models</span> <span class="kn">import</span> <span class="n">TransformerLayer</span><span class="p">,</span> <span class="n">Encoder</span><span class="p">,</span> <span class="n">Decoder</span><span class="p">,</span> <span class="n">Generator</span><span class="p">,</span> <span class="n">EncoderDecoder</span>
-<span class="lineno">108</span><span class="kn">from</span> <span class="nn">.mha</span> <span class="kn">import</span> <span class="n">MultiHeadAttention</span>
-<span class="lineno">109</span><span class="kn">from</span> <span class="nn">labml_nn.transformers.xl.relative_mha</span> <span class="kn">import</span> <span class="n">RelativeMultiHeadAttention</span></pre></div>
+<div class="highlight"><pre><span class="lineno">109</span><span></span><span class="kn">from</span> <span class="nn">.configs</span> <span class="kn">import</span> <span class="n">TransformerConfigs</span>
+<span class="lineno">110</span><span class="kn">from</span> <span class="nn">.models</span> <span class="kn">import</span> <span class="n">TransformerLayer</span><span class="p">,</span> <span class="n">Encoder</span><span class="p">,</span> <span class="n">Decoder</span><span class="p">,</span> <span class="n">Generator</span><span class="p">,</span> <span class="n">EncoderDecoder</span>
+<span class="lineno">111</span><span class="kn">from</span> <span class="nn">.mha</span> <span class="kn">import</span> <span class="n">MultiHeadAttention</span>
+<span class="lineno">112</span><span class="kn">from</span> <span class="nn">labml_nn.transformers.xl.relative_mha</span> <span class="kn">import</span> <span class="n">RelativeMultiHeadAttention</span></pre></div>
 </div>
 </div>
 <div class='footer'>

View File

@@ -23,6 +23,7 @@ implementations.
 * [Transformer XL](transformers/xl/index.html)
 * [Relative multi-headed attention](transformers/xl/relative_mha.html)
 * [Rotary Positional Embeddings](transformers/rope/index.html)
+* [RETRO](transformers/retro/index.html)
 * [Compressive Transformer](transformers/compressive/index.html)
 * [GPT Architecture](transformers/gpt/index.html)
 * [GLU Variants](transformers/glu_variants/simple.html)

View File

@@ -25,6 +25,9 @@ This implements Transformer XL model using
 ## [Rotary Positional Embeddings](rope/index.html)
 This implements Rotary Positional Embeddings (RoPE)
+## [RETRO](retro/index.html)
+
+This implements the Retrieval-Enhanced Transformer (RETRO).
 ## [Compressive Transformer](compressive/index.html)
 This is an implementation of compressive transformer
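The new entry points at the Retrieval-Enhanced Transformer implementation. As rough orientation only, a schematic sketch of RETRO's distinctive ingredient follows: cross-attention from the input sequence to encoded neighbour chunks retrieved from an external database. This is a simplified illustration, not the labml_nn code the link documents.

```python
# Schematic sketch of RETRO-style retrieval cross-attention, heavily
# simplified; the real implementation (chunk-wise attention, BERT-based
# neighbour retrieval, etc.) lives behind the retro/index.html link above.
import torch
import torch.nn as nn

class RetrievalCrossAttention(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, neighbours: torch.Tensor) -> torch.Tensor:
        # x:          (batch, seq_len, d_model)     input token embeddings
        # neighbours: (batch, n_retrieved, d_model) encoded retrieved chunks
        out, _ = self.attn(query=self.norm(x), key=neighbours, value=neighbours)
        return x + out  # residual: retrieved context augments the input
```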

View File

@@ -25,6 +25,7 @@ implementations almost weekly.
 * [Transformer XL](https://nn.labml.ai/transformers/xl/index.html)
 * [Relative multi-headed attention](https://nn.labml.ai/transformers/xl/relative_mha.html)
 * [Rotary Positional Embeddings](https://nn.labml.ai/transformers/rope/index.html)
+* [RETRO](https://nn.labml.ai/transformers/retro/index.html)
 * [Compressive Transformer](https://nn.labml.ai/transformers/compressive/index.html)
 * [GPT Architecture](https://nn.labml.ai/transformers/gpt/index.html)
 * [GLU Variants](https://nn.labml.ai/transformers/glu_variants/simple.html)