<!DOCTYPE html>
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<meta name="description" content=""/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:image:src" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta name="twitter:title" content="labml.ai Annotated PyTorch Paper Implementations"/>
<meta name="twitter:description" content=""/>
<meta name="twitter:site" content="@labmlai"/>
<meta name="twitter:creator" content="@labmlai"/>
<meta property="og:url" content="https://nn.labml.ai/index.html"/>
<meta property="og:title" content="labml.ai Annotated PyTorch Paper Implementations"/>
<meta property="og:image" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta property="og:site_name" content="labml.ai Annotated PyTorch Paper Implementations"/>
<meta property="og:type" content="object"/>
<meta property="og:title" content="labml.ai Annotated PyTorch Paper Implementations"/>
<meta property="og:description" content=""/>
<title>labml.ai Annotated PyTorch Paper Implementations</title>
<link rel="shortcut icon" href="/icon.png"/>
<link rel="stylesheet" href="./pylit.css?v=1">
<link rel="canonical" href="https://nn.labml.ai/index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.13.18/dist/katex.min.css" integrity="sha384-zTROYFVGOfTw7JV7KUu8udsvW2fx4lWOsCEDqhBreBwlHI4ioVRtmIvEThzJHGET" crossorigin="anonymous">
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-4V3HC8HBLH"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'G-4V3HC8HBLH');
</script>
</head>
<body>
<div id='container'>
<div id="background"></div>
<div class='section'>
<div class='docs'>
<p>
<a class="parent" href="/">home</a>
</p>
<p>
<a href="https://github.com/sponsors/labmlai" target="_blank">
<img alt="Sponsor"
src="https://img.shields.io/static/v1?label=Sponsor&message=%E2%9D%A4&logo=GitHub&color=%23fe8e86"
style="max-width:100%;"/></a>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations" target="_blank">
<img alt="Github"
src="https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social"
style="max-width:100%;"/></a>
<a href="https://twitter.com/labmlai" rel="nofollow" target="_blank">
<img alt="Twitter"
src="https://img.shields.io/twitter/follow/labmlai?style=social"
style="max-width:100%;"/></a>
</p>
<p>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/__init__.py" target="_blank">
View code on Github</a>
</p>
</div>
</div>
<div class='section' id='section-0'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-0'>#</a>
</div>
<h1><a href="index.html">labml.ai Annotated PyTorch Paper Implementations</a></h1>
<p>This is a collection of simple PyTorch implementations of neural networks and related algorithms. <a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations">These implementations</a> are documented with explanations, and the <a href="index.html">website</a> renders them as side-by-side formatted notes. We believe these notes will help you understand the algorithms better.</p>
<p><img alt="Screenshot" src="dqn-light.png"></p>
<p>We are actively maintaining this repo and adding new implementations. Follow <a href="https://twitter.com/labmlai"><img alt="Twitter" src="https://img.shields.io/twitter/follow/labmlai?style=social"></a> for updates.</p>
<h2>Paper Implementations</h2>
<h4><a href="transformers/index.html">Transformers</a></h4>
<ul><li><a href="transformers/mha.html">Multi-headed attention</a> </li>
<li><a href="transformers/models.html">Transformer building blocks</a> </li>
<li><a href="transformers/xl/index.html">Transformer XL</a> </li>
<li><a href="transformers/xl/relative_mha.html">Relative multi-headed attention</a> </li>
<li><a href="transformers/rope/index.html">Rotary Positional Embeddings (RoPE)</a> </li>
<li><a href="transformers/alibi/index.html">Attention with Linear Biases (ALiBi)</a> </li>
<li><a href="transformers/retro/index.html">RETRO</a> </li>
<li><a href="transformers/compressive/index.html">Compressive Transformer</a> </li>
<li><a href="transformers/gpt/index.html">GPT Architecture</a> </li>
<li><a href="transformers/glu_variants/simple.html">GLU Variants</a> </li>
<li><a href="transformers/knn/index.html">kNN-LM: Generalization through Memorization</a> </li>
<li><a href="transformers/feedback/index.html">Feedback Transformer</a> </li>
<li><a href="transformers/switch/index.html">Switch Transformer</a> </li>
<li><a href="transformers/fast_weights/index.html">Fast Weights Transformer</a> </li>
<li><a href="transformers/fnet/index.html">FNet</a> </li>
<li><a href="transformers/aft/index.html">Attention Free Transformer</a> </li>
<li><a href="transformers/mlm/index.html">Masked Language Model</a> </li>
<li><a href="transformers/mlp_mixer/index.html">MLP-Mixer: An all-MLP Architecture for Vision</a> </li>
<li><a href="transformers/gmlp/index.html">Pay Attention to MLPs (gMLP)</a> </li>
<li><a href="transformers/vit/index.html">Vision Transformer (ViT)</a> </li>
<li><a href="transformers/primer_ez/index.html">Primer EZ</a> </li>
<li><a href="transformers/hour_glass/index.html">Hourglass</a></li></ul>
<h4><a href="recurrent_highway_networks/index.html">Recurrent Highway Networks</a></h4>
<h4><a href="lstm/index.html">LSTM</a></h4>
<h4><a href="hypernetworks/hyper_lstm.html">HyperNetworks - HyperLSTM</a></h4>
<h4><a href="resnet/index.html">ResNet</a></h4>
<h4><a href="conv_mixer/index.html">ConvMixer</a></h4>
<h4><a href="capsule_networks/index.html">Capsule Networks</a></h4>
<h4><a href="gan/index.html">Generative Adversarial Networks</a></h4>
<ul><li><a href="gan/original/index.html">Original GAN</a> </li>
<li><a href="gan/dcgan/index.html">GAN with deep convolutional network</a> </li>
<li><a href="gan/cycle_gan/index.html">Cycle GAN</a> </li>
<li><a href="gan/wasserstein/index.html">Wasserstein GAN</a> </li>
<li><a href="gan/wasserstein/gradient_penalty/index.html">Wasserstein GAN with Gradient Penalty</a> </li>
<li><a href="gan/stylegan/index.html">StyleGAN 2</a></li></ul>
<h4><a href="diffusion/index.html">Diffusion models</a></h4>
<ul><li><a href="diffusion/ddpm/index.html">Denoising Diffusion Probabilistic Models (DDPM)</a></li></ul>
<h4><a href="sketch_rnn/index.html">Sketch RNN</a></h4>
<h4>✨ Graph Neural Networks</h4>
<ul><li><a href="graphs/gat/index.html">Graph Attention Networks (GAT)</a> </li>
<li><a href="graphs/gatv2/index.html">Graph Attention Networks v2 (GATv2)</a></li></ul>
<h4><a href="cfr/index.html">Counterfactual Regret Minimization (CFR)</a></h4>
<p>Solving games with incomplete information such as poker with CFR.</p>
<ul><li><a href="cfr/kuhn/index.html">Kuhn Poker</a></li></ul>
<h4><a href="rl/index.html">Reinforcement Learning</a></h4>
<ul><li><a href="rl/ppo/index.html">Proximal Policy Optimization</a> with <a href="rl/ppo/gae.html">Generalized Advantage Estimation</a> </li>
<li><a href="rl/dqn/index.html">Deep Q Networks</a> with with <a href="rl/dqn/model.html">Dueling Network</a>, <a href="rl/dqn/replay_buffer.html">Prioritized Replay</a> and Double Q Network.</li></ul>
<h4><a href="optimizers/index.html">Optimizers</a></h4>
<ul><li><a href="optimizers/adam.html">Adam</a> </li>
<li><a href="optimizers/amsgrad.html">AMSGrad</a> </li>
<li><a href="optimizers/adam_warmup.html">Adam Optimizer with warmup</a> </li>
<li><a href="optimizers/noam.html">Noam Optimizer</a> </li>
<li><a href="optimizers/radam.html">Rectified Adam Optimizer</a> </li>
<li><a href="optimizers/ada_belief.html">AdaBelief Optimizer</a></li></ul>
<h4><a href="normalization/index.html">Normalization Layers</a></h4>
<ul><li><a href="normalization/batch_norm/index.html">Batch Normalization</a> </li>
<li><a href="normalization/layer_norm/index.html">Layer Normalization</a> </li>
<li><a href="normalization/instance_norm/index.html">Instance Normalization</a> </li>
<li><a href="normalization/group_norm/index.html">Group Normalization</a> </li>
<li><a href="normalization/weight_standardization/index.html">Weight Standardization</a> </li>
<li><a href="normalization/batch_channel_norm/index.html">Batch-Channel Normalization</a> </li>
<li><a href="normalization/deep_norm/index.html">DeepNorm</a></li></ul>
<h4><a href="distillation/index.html">Distillation</a></h4>
<h4><a href="adaptive_computation/index.html">Adaptive Computation</a></h4>
<ul><li><a href="adaptive_computation/ponder_net/index.html">PonderNet</a></li></ul>
<h4><a href="uncertainty/index.html">Uncertainty</a></h4>
<ul><li><a href="uncertainty/evidence/index.html">Evidential Deep Learning to Quantify Classification Uncertainty</a></li></ul>
<h4><a href="activations/index.html">Activations</a></h4>
<ul><li><a href="activations/fta/index.html">Fuzzy Tiling Activations</a></li></ul>
<h2>Highlighted Research Paper PDFs</h2>
<ul><li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2204.10628.pdf">Autoregressive Search Engines: Generating Substrings as Document Identifiers</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2203.15556.pdf">Training Compute-Optimal Large Language Models</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1910.02054.pdf">ZeRO: Memory Optimizations Toward Training Trillion Parameter Models</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2204.02311.pdf">PaLM: Scaling Language Modeling with Pathways</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/dall-e-2.pdf">Hierarchical Text-Conditional Image Generation with CLIP Latents</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2203.14465.pdf">STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2112.04426.pdf">Improving language models by retrieving from trillions of tokens</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2003.08934.pdf">NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1706.03762.pdf">Attention Is All You Need</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2006.11239.pdf">Denoising Diffusion Probabilistic Models</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2109.08668.pdf">Primer: Searching for Efficient Transformers for Language Modeling</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1803.02999.pdf">On First-Order Meta-Learning Algorithms</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2103.00020.pdf">Learning Transferable Visual Models From Natural Language Supervision</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2109.02869.pdf">The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1805.09801.pdf">Meta-Gradient Reinforcement Learning</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/google_maps_eta.pdf">ETA Prediction with Graph Neural Networks in Google Maps</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/ponder_net.pdf">PonderNet: Learning to Ponder</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/muzero.pdf">Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/gans_n_roses.pdf">GANs N Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/vit.pdf">An Image is Worth 16X16 Word: Transformers for Image Recognition at Scale</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/resnet.pdf">Deep Residual Learning for Image Recognition</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/distillation.pdf">Distilling the Knowledge in a Neural Network</a></li></ul>
<h3>Installation</h3>
<pre class="highlight lang-bash"><code><span></span>pip install labml-nn</code></pre>
<h3>Citing LabML</h3>
<p>If you use this for academic research, please cite it using the following BibTeX entry.</p>
<pre class="highlight lang-bibtex"><code><span></span><span class="nc">@misc</span><span class="p">{</span><span class="nl">labml</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">author</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{Varuna Jayasiri, Nipun Wijerathne}</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">title</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{labml.ai Annotated Paper Implementations}</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">year</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{2020}</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">url</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{}</span><span class="p">,</span><span class="w"></span>
<span class="p">}</span><span class="w"></span></code></pre>
</div>
<div class='code'>
<div class="highlight"><pre></pre></div>
</div>
</div>
<div class='footer'>
<a href="https://papers.labml.ai">Trending Research Papers</a>
<a href="https://labml.ai">labml.ai</a>
</div>
</div>
<script src="./interactive.js?v=1"></script>
<script>
// Center images inside paragraphs and attach a click-to-zoom modal to each one
function handleImages() {
var images = document.querySelectorAll('p>img')
for (var i = 0; i < images.length; ++i) {
handleImage(images[i])
}
}
function handleImage(img) {
img.parentElement.style.textAlign = 'center'
// Modal overlay holding a full-size copy of the image and a close button
var modal = document.createElement('div')
modal.id = 'modal'
var modalContent = document.createElement('div')
modal.appendChild(modalContent)
var modalImage = document.createElement('img')
modalContent.appendChild(modalImage)
var span = document.createElement('span')
span.classList.add('close')
span.textContent = 'x'
modal.appendChild(span)
// Clicking the image opens the modal showing the same source
img.onclick = function () {
document.body.appendChild(modal)
modalImage.src = img.src
}
// Clicking the close button removes the modal from the page
span.onclick = function () {
document.body.removeChild(modal)
}
}
handleImages()
</script>
</body>
</html>