Mirror of https://github.com/labmlai/annotated_deep_learning_paper_implementations.git (synced 2025-08-06 15:22:21 +08:00)

Commit: links
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -78,6 +78,7 @@
 <li><a href="transformers/xl/index.html">Transformer XL</a> </li>
 <li><a href="transformers/xl/relative_mha.html">Relative multi-headed attention</a> </li>
 <li><a href="transformers/rope/index.html">Rotary Positional Embeddings</a> </li>
+<li><a href="transformers/retro/index.html">RETRO</a> </li>
 <li><a href="transformers/compressive/index.html">Compressive Transformer</a> </li>
 <li><a href="transformers/gpt/index.html">GPT Architecture</a> </li>
 <li><a href="transformers/glu_variants/simple.html">GLU Variants</a> </li>
@@ -211,7 +211,7 @@
 
 <url>
 <loc>https://nn.labml.ai/experiments/nlp_autoregression.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -267,7 +267,7 @@
 
 <url>
 <loc>https://nn.labml.ai/distillation/small.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -582,7 +582,7 @@
 
 <url>
 <loc>https://nn.labml.ai/transformers/rope/experiment.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -596,7 +596,7 @@
 
 <url>
 <loc>https://nn.labml.ai/transformers/basic/autoregressive_experiment.html</loc>
-<lastmod>2022-03-06T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -722,7 +722,7 @@
 
 <url>
 <loc>https://nn.labml.ai/transformers/retro/index.html</loc>
-<lastmod>2022-03-10T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -883,7 +883,7 @@
 
 <url>
 <loc>https://nn.labml.ai/graphs/gat/index.html</loc>
-<lastmod>2021-08-19T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
@@ -897,7 +897,7 @@
 
 <url>
 <loc>https://nn.labml.ai/graphs/gatv2/index.html</loc>
-<lastmod>2021-08-19T16:30:00+00:00</lastmod>
+<lastmod>2022-03-12T16:30:00+00:00</lastmod>
 <priority>1.00</priority>
 </url>
 
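The sitemap hunks above all apply the same mechanical edit: every `<lastmod>` date is bumped to `2022-03-12` while the time and offset are kept. A minimal sketch of how such a bump could be scripted (the function name and regex are my own illustration, not part of the repository's tooling):

```python
import re

def bump_lastmod(sitemap_xml: str, new_date: str) -> str:
    """Replace the date part of every <lastmod> entry, keeping the time and offset."""
    return re.sub(
        r"<lastmod>\d{4}-\d{2}-\d{2}(T[0-9:+]+)</lastmod>",
        rf"<lastmod>{new_date}\1</lastmod>",
        sitemap_xml,
    )

entry = (
    "<url>\n"
    "<loc>https://nn.labml.ai/graphs/gat/index.html</loc>\n"
    "<lastmod>2021-08-19T16:30:00+00:00</lastmod>\n"
    "<priority>1.00</priority>\n"
    "</url>"
)
# The <lastmod> line becomes 2022-03-12T16:30:00+00:00; <loc> and <priority> are untouched.
print(bump_lastmod(entry, "2022-03-12"))
```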
@@ -78,6 +78,8 @@
 <p>This implements Transformer XL model using <a href="xl/relative_mha.html">relative multi-head attention</a></p>
 <h2><a href="rope/index.html">Rotary Positional Embeddings</a></h2>
 <p>This implements Rotary Positional Embeddings (RoPE)</p>
+<h2><a href="retro/index.html">RETRO</a></h2>
+<p>This implements the Retrieval-Enhanced Transformer (RETRO).</p>
 <h2><a href="compressive/index.html">Compressive Transformer</a></h2>
 <p>This is an implementation of compressive transformer that extends upon <a href="xl/index.html">Transformer XL</a> by compressing oldest memories to give a longer attention span.</p>
 <h2><a href="gpt/index.html">GPT Architecture</a></h2>
@@ -111,10 +113,10 @@
 
 </div>
 <div class='code'>
-<div class="highlight"><pre><span class="lineno">106</span><span></span><span class="kn">from</span> <span class="nn">.configs</span> <span class="kn">import</span> <span class="n">TransformerConfigs</span>
-<span class="lineno">107</span><span class="kn">from</span> <span class="nn">.models</span> <span class="kn">import</span> <span class="n">TransformerLayer</span><span class="p">,</span> <span class="n">Encoder</span><span class="p">,</span> <span class="n">Decoder</span><span class="p">,</span> <span class="n">Generator</span><span class="p">,</span> <span class="n">EncoderDecoder</span>
-<span class="lineno">108</span><span class="kn">from</span> <span class="nn">.mha</span> <span class="kn">import</span> <span class="n">MultiHeadAttention</span>
-<span class="lineno">109</span><span class="kn">from</span> <span class="nn">labml_nn.transformers.xl.relative_mha</span> <span class="kn">import</span> <span class="n">RelativeMultiHeadAttention</span></pre></div>
+<div class="highlight"><pre><span class="lineno">109</span><span></span><span class="kn">from</span> <span class="nn">.configs</span> <span class="kn">import</span> <span class="n">TransformerConfigs</span>
+<span class="lineno">110</span><span class="kn">from</span> <span class="nn">.models</span> <span class="kn">import</span> <span class="n">TransformerLayer</span><span class="p">,</span> <span class="n">Encoder</span><span class="p">,</span> <span class="n">Decoder</span><span class="p">,</span> <span class="n">Generator</span><span class="p">,</span> <span class="n">EncoderDecoder</span>
+<span class="lineno">111</span><span class="kn">from</span> <span class="nn">.mha</span> <span class="kn">import</span> <span class="n">MultiHeadAttention</span>
+<span class="lineno">112</span><span class="kn">from</span> <span class="nn">labml_nn.transformers.xl.relative_mha</span> <span class="kn">import</span> <span class="n">RelativeMultiHeadAttention</span></pre></div>
 </div>
 </div>
 <div class='footer'>
@@ -23,6 +23,7 @@ implementations.
 * [Transformer XL](transformers/xl/index.html)
 * [Relative multi-headed attention](transformers/xl/relative_mha.html)
 * [Rotary Positional Embeddings](transformers/rope/index.html)
+* [RETRO](transformers/retro/index.html)
 * [Compressive Transformer](transformers/compressive/index.html)
 * [GPT Architecture](transformers/gpt/index.html)
 * [GLU Variants](transformers/glu_variants/simple.html)
@@ -25,6 +25,9 @@ This implements Transformer XL model using
 ## [Rotary Positional Embeddings](rope/index.html)
 This implements Rotary Positional Embeddings (RoPE)
 
+## [RETRO](retro/index.html)
+This implements the Retrieval-Enhanced Transformer (RETRO).
+
 ## [Compressive Transformer](compressive/index.html)
 
 This is an implementation of compressive transformer
@@ -25,6 +25,7 @@ implementations almost weekly.
 * [Transformer XL](https://nn.labml.ai/transformers/xl/index.html)
 * [Relative multi-headed attention](https://nn.labml.ai/transformers/xl/relative_mha.html)
 * [Rotary Positional Embeddings](https://nn.labml.ai/transformers/rope/index.html)
+* [RETRO](https://nn.labml.ai/transformers/retro/index.html)
 * [Compressive Transformer](https://nn.labml.ai/transformers/compressive/index.html)
 * [GPT Architecture](https://nn.labml.ai/transformers/gpt/index.html)
 * [GLU Variants](https://nn.labml.ai/transformers/glu_variants/simple.html)