arxiv.org links

2025-08-26 08:41:23 +08:00 · 2023-10-24 14:42:32 +01:00
parent 1159ecfc63
commit 9a42ac2697
238 changed files with 354 additions and 353 deletions
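
The hunks in this commit all make the same substitution: a `papers.labml.ai/paper/<id>` link becomes `arxiv.org/abs/<id>`. The commit does not say how the change was produced. The block below is a minimal sketch of one way to apply such a rewrite mechanically. The regular expression, the function name `to_arxiv`, and the assumption that every old link ends in an arXiv-style identifier are illustrative and not taken from the repository.

```python
import re

# Assumption: old links end in an arXiv-style identifier such as 2201.09792.
# Links that do not match this shape are left unchanged.
OLD_LINK = re.compile(r"https://papers\.labml\.ai/paper/(\d{4}\.\d{4,5})")


def to_arxiv(html: str) -> str:
    """Rewrite papers.labml.ai paper links to arxiv.org abstract links."""
    return OLD_LINK.sub(r"https://arxiv.org/abs/\1", html)


if __name__ == "__main__":
    before = '<a href="https://papers.labml.ai/paper/2201.09792">Patches Are All You Need?</a>'
    print(to_arxiv(before))
```

Because the identifier is captured and reused, each hunk changes one line without touching the link text, which fits the near-equal addition and deletion counts reported above.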
--- a/docs/conv_mixer/index.html
+++ b/docs/conv_mixer/index.html
@@ -71,7 +71,7 @@
                <a href='#section-0'>#</a>
            </div>
            <h1>Patches Are All You Need? (ConvMixer)</h1>
-<p>This is a <a href="https://pytorch.org">PyTorch</a> implementation of the paper <a href="https://papers.labml.ai/paper/2201.09792">Patches Are All You Need?</a>.</p>
+<p>This is a <a href="https://pytorch.org">PyTorch</a> implementation of the paper <a href="https://arxiv.org/abs/2201.09792">Patches Are All You Need?</a>.</p>
 <p><img alt="ConvMixer diagram from the paper" src="conv_mixer.png"></p>
 <p>ConvMixer is similar to <a href="../transformers/mlp_mixer/index.html">MLP-Mixer</a>. MLP-Mixer separates mixing of spatial and channel dimensions, by applying an MLP across spatial dimension and then an MLP across the channel dimension (spatial MLP replaces the <a href="../transformers/vit/index.html">ViT</a> attention and channel MLP is the <a href="../transformers/feed_forward.html">FFN</a> of ViT).</p>
 <p>ConvMixer uses a <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span><span class="mord coloredeq eqa" style=""><span class="mord" style="">1</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin" style="">×</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mord" style="">1</span></span></span></span></span></span> convolution for channel mixing and a depth-wise convolution for spatial mixing. Since it&#x27;s a convolution instead of a full MLP across the space, it mixes only the nearby patches, in contrast to ViT or MLP-Mixer. Also, MLP-Mixer uses a two-layer MLP for each mixing step, whereas ConvMixer uses a single layer for each.</p>
--- a/docs/conv_mixer/readme.html
+++ b/docs/conv_mixer/readme.html
@@ -71,7 +71,7 @@
                <a href='#section-0'>#</a>
            </div>
            <h1><a href="https://nn.labml.ai/conv_mixer/index.html">Patches Are All You Need?</a></h1>
-<p>This is a <a href="https://pytorch.org">PyTorch</a> implementation of the paper <a href="https://papers.labml.ai/paper/2201.09792">Patches Are All You Need?</a>.</p>
+<p>This is a <a href="https://pytorch.org">PyTorch</a> implementation of the paper <a href="https://arxiv.org/abs/2201.09792">Patches Are All You Need?</a>.</p>
 <p>ConvMixer is similar to <a href="https://nn.labml.ai/transformers/mlp_mixer/index.html">MLP-Mixer</a>. MLP-Mixer separates mixing of spatial and channel dimensions, by applying an MLP across spatial dimension and then an MLP across the channel dimension (spatial MLP replaces the <a href="https://nn.labml.ai/transformers/vit/index.html">ViT</a> attention and channel MLP is the <a href="https://nn.labml.ai/transformers/feed_forward.html">FFN</a> of ViT).</p>
 <p>ConvMixer uses a 1x1 convolution for channel mixing and a depth-wise convolution for spatial mixing. Since it&#x27;s a convolution instead of a full MLP across the space, it mixes only the nearby patches, in contrast to ViT or MLP-Mixer. Also, MLP-Mixer uses a two-layer MLP for each mixing step, whereas ConvMixer uses a single layer for each.</p>
 <p>The paper recommends removing the residual connection across the channel mixing (point-wise convolution) and having only a residual connection over the spatial mixing (depth-wise convolution). They also use <a href="https://nn.labml.ai/normalization/batch_norm/index.html">Batch normalization</a> instead of <a href="../normalization/layer_norm/index.html">Layer normalization</a>.</p>
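
The mixing scheme described in the context lines above can be illustrated with a small toy. The block below is a plain-Python, one-dimensional sketch, not the PyTorch implementation these pages document. All function names are hypothetical, and the activation and batch normalization are left out so that only the two mixing steps and the residual placement remain.

```python
# Toy 1-D illustration in plain Python; names are hypothetical and the
# activation and batch normalization of the real layer are omitted.

def depthwise_conv(x, kernels):
    """Spatial mixing: each channel is convolved with its own kernel.

    x is a list of channels, each a list of patch values. kernels holds one
    odd-length kernel per channel. Zero padding keeps the length unchanged.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        half = len(kernel) // 2
        mixed = []
        for i in range(len(channel)):
            total = 0.0
            for k, w in enumerate(kernel):
                j = i + k - half
                if 0 <= j < len(channel):
                    total += w * channel[j]
            mixed.append(total)
        out.append(mixed)
    return out


def pointwise_conv(x, weights):
    """Channel mixing (1x1 convolution): mix across channels at each position.

    weights[o][c] is the weight from input channel c to output channel o.
    """
    n = len(x[0])
    return [[sum(w * x[c][i] for c, w in enumerate(row)) for i in range(n)]
            for row in weights]


def conv_mixer_layer(x, kernels, weights):
    """Residual over the spatial mixing only, none over the channel mixing."""
    spatial = depthwise_conv(x, kernels)
    x = [[a + b for a, b in zip(cx, cs)] for cx, cs in zip(x, spatial)]
    return pointwise_conv(x, weights)


if __name__ == "__main__":
    # One active patch at position 3 of channel 0; channel 1 is empty.
    x = [[0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
    kernels = [[1, 1, 1], [1, 1, 1]]   # kernel size 3 per channel
    weights = [[1, 0], [1, 1]]         # second output channel sums both inputs
    for row in conv_mixer_layer(x, kernels, weights):
        print(row)
```

With a kernel of size 3, the single active patch influences only positions 2 to 4, which is the locality the text contrasts with ViT and MLP-Mixer. The point-wise step then copies that signal into the second channel without moving it along the spatial axis.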