Varuna Jayasiri ce0cdb676f ja translation
2023-06-30 16:03:07 +05:30

<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<meta name="description" content=""/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:image:src" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta name="twitter:title" content="labml.ai Annotated PyTorch Paper Implementations"/>
<meta name="twitter:description" content=""/>
<meta name="twitter:site" content="@labmlai"/>
<meta name="twitter:creator" content="@labmlai"/>
<meta property="og:url" content="https://nn.labml.ai/index.html"/>
<meta property="og:title" content="labml.ai Annotated PyTorch Paper Implementations"/>
<meta property="og:image" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta property="og:site_name" content="labml.ai Annotated PyTorch Paper Implementations"/>
<meta property="og:type" content="object"/>
<meta property="og:description" content=""/>
<title>labml.ai Annotated PyTorch Paper Implementations</title>
<link rel="shortcut icon" href="/icon.png"/>
<link rel="stylesheet" href="./pylit.css?v=1">
<link rel="canonical" href="https://nn.labml.ai/index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.13.18/dist/katex.min.css" integrity="sha384-zTROYFVGOfTw7JV7KUu8udsvW2fx4lWOsCEDqhBreBwlHI4ioVRtmIvEThzJHGET" crossorigin="anonymous">
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-4V3HC8HBLH"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'G-4V3HC8HBLH');
</script>
</head>
<body>
<div id='container'>
<div id="background"></div>
<div class='section'>
<div class='docs'>
<p>
<a class="parent" href="/">home</a>
</p>
<p>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations" target="_blank">
<img alt="Github"
src="https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social"
style="max-width:100%;"/></a>
<a href="https://twitter.com/labmlai" rel="nofollow" target="_blank">
<img alt="Twitter"
src="https://img.shields.io/twitter/follow/labmlai?style=social"
style="max-width:100%;"/></a>
</p>
<p>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/__init__.py" target="_blank">
View code on Github</a>
</p>
</div>
</div>
<div class='section' id='section-0'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-0'>#</a>
</div>
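<h1><a href="index.html">labml.ai Annotated PyTorch Paper Implementations</a></h1>
<p>This is a collection of simple PyTorch implementations of neural networks and related algorithms. <a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations">These implementations</a> are documented with explanations, and <a href="index.html">the website</a> renders them as side-by-side formatted notes. We believe these will help you understand the algorithms better.</p>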
<p><img alt="Screenshot" src="dqn-light.png"></p>
<p>We are actively maintaining this repo and adding new implementations. <a href="https://twitter.com/labmlai"><img alt="Twitter" src="https://img.shields.io/twitter/follow/labmlai?style=social"></a> for updates.</p>
<h2>Translations</h2>
<h3><strong><a href="https://nn.labml.ai">English (original)</a></strong></h3>
<h3><strong><a href="https://nn.labml.ai/zh/">Chinese (translated)</a></strong></h3>
<h3><strong><a href="https://nn.labml.ai/ja/">Japanese (translated)</a></strong></h3>
<h2>Paper Implementations</h2>
<h4><a href="transformers/index.html">Transformers</a></h4>
<ul><li><a href="transformers/mha.html">Multi-headed attention</a></li>
<li><a href="transformers/models.html">Transformer building blocks</a></li>
<li><a href="transformers/xl/index.html">Transformer XL</a></li>
<li><a href="transformers/xl/relative_mha.html">Relative multi-headed attention</a></li>
<li><a href="transformers/rope/index.html">Rotary Positional Embeddings (RoPE)</a></li>
<li><a href="transformers/alibi/index.html">Attention with Linear Biases (ALiBi)</a></li>
<li><a href="transformers/retro/index.html">RETRO</a></li>
<li><a href="transformers/compressive/index.html">Compressive Transformer</a></li>
<li><a href="transformers/gpt/index.html">GPT Architecture</a></li>
<li><a href="transformers/glu_variants/simple.html">GLU Variants</a></li>
<li><a href="transformers/knn/index.html">kNN-LM: Generalization through Memorization</a></li>
<li><a href="transformers/feedback/index.html">Feedback Transformer</a></li>
<li><a href="transformers/switch/index.html">Switch Transformer</a></li>
<li><a href="transformers/fast_weights/index.html">Fast Weights Transformer</a></li>
<li><a href="transformers/fnet/index.html">FNet</a></li>
<li><a href="transformers/aft/index.html">Attention Free Transformer</a></li>
<li><a href="transformers/mlm/index.html">Masked Language Model</a></li>
<li><a href="transformers/mlp_mixer/index.html">MLP-Mixer: An all-MLP Architecture for Vision</a></li>
<li><a href="transformers/gmlp/index.html">Pay Attention to MLPs (gMLP)</a></li>
<li><a href="transformers/vit/index.html">Vision Transformer (ViT)</a></li>
<li><a href="transformers/primer_ez/index.html">Primer</a></li>
<li><a href="transformers/hour_glass/index.html">Hourglass</a></li></ul>
<h4><a href="neox/index.html">Eleuther GPT-NeoX</a></h4>
<ul><li><a href="neox/samples/generate.html">Generate on a 48GB GPU</a></li>
<li><a href="neox/samples/finetune.html">Finetune on two 48GB GPUs</a></li>
<li><a href="neox/utils/llm_int8.html">LLM.int8()</a></li></ul>
<h4><a href="diffusion/index.html">Diffusion models</a></h4>
<ul><li><a href="diffusion/ddpm/index.html">Denoising Diffusion Probabilistic Models (DDPM)</a></li>
<li><a href="diffusion/stable_diffusion/sampler/ddim.html">Denoising Diffusion Implicit Models (DDIM)</a></li>
<li><a href="diffusion/stable_diffusion/latent_diffusion.html">Latent Diffusion Models</a></li>
<li><a href="diffusion/stable_diffusion/index.html">Stable Diffusion</a></li></ul>
<h4><a href="gan/index.html">Generative Adversarial Networks</a></h4>
<ul><li><a href="gan/original/index.html">Original GAN</a></li>
<li><a href="gan/dcgan/index.html">GAN with deep convolutional network</a></li>
<li><a href="gan/cycle_gan/index.html">Cycle GAN</a></li>
<li><a href="gan/wasserstein/index.html">Wasserstein GAN</a></li>
<li><a href="gan/wasserstein/gradient_penalty/index.html">Wasserstein GAN with Gradient Penalty</a></li>
<li><a href="gan/stylegan/index.html">StyleGAN 2</a></li></ul>
<h4><a href="recurrent_highway_networks/index.html">Recurrent Highway Networks</a></h4>
<h4><a href="lstm/index.html">LSTM</a></h4>
<h4><a href="hypernetworks/hyper_lstm.html">HyperNetworks - HyperLSTM</a></h4>
<h4><a href="resnet/index.html">ResNet</a></h4>
<h4><a href="conv_mixer/index.html">ConvMixer</a></h4>
<h4><a href="capsule_networks/index.html">Capsule Networks</a></h4>
<h4><a href="unet/index.html">U-Net</a></h4>
<h4><a href="sketch_rnn/index.html">Sketch RNN</a></h4>
<h4>✨ Graph Neural Networks</h4>
<ul><li><a href="graphs/gat/index.html">Graph Attention Networks (GAT)</a></li>
<li><a href="graphs/gatv2/index.html">Graph Attention Networks v2 (GATv2)</a></li></ul>
<h4><a href="rl/index.html">Reinforcement Learning</a></h4>
<ul><li><a href="rl/ppo/index.html">Proximal Policy Optimization</a> with <a href="rl/ppo/gae.html">Generalized Advantage Estimation</a></li>
<li><a href="rl/dqn/index.html">Deep Q Networks</a> with <a href="rl/dqn/model.html">Dueling Network</a>, <a href="rl/dqn/replay_buffer.html">Prioritized Replay</a> and Double Q Network.</li></ul>
<h4><a href="cfr/index.html">Counterfactual Regret Minimization (CFR)</a></h4>
<p>Solving games with incomplete information, such as poker, with CFR.</p>
<ul><li><a href="cfr/kuhn/index.html">Kuhn Poker</a></li></ul>
<h4><a href="optimizers/index.html">Optimizers</a></h4>
<ul><li><a href="optimizers/adam.html">Adam</a></li>
<li><a href="optimizers/amsgrad.html">AMSGrad</a></li>
<li><a href="optimizers/adam_warmup.html">Adam Optimizer with warmup</a></li>
<li><a href="optimizers/noam.html">Noam Optimizer</a></li>
<li><a href="optimizers/radam.html">Rectified Adam Optimizer</a></li>
<li><a href="optimizers/ada_belief.html">AdaBelief Optimizer</a></li></ul>
<h4><a href="normalization/index.html">Normalization Layers</a></h4>
<ul><li><a href="normalization/batch_norm/index.html">Batch Normalization</a></li>
<li><a href="normalization/layer_norm/index.html">Layer Normalization</a></li>
<li><a href="normalization/instance_norm/index.html">Instance Normalization</a></li>
<li><a href="normalization/group_norm/index.html">Group Normalization</a></li>
<li><a href="normalization/weight_standardization/index.html">Weight Standardization</a></li>
<li><a href="normalization/batch_channel_norm/index.html">Batch-Channel Normalization</a></li>
<li><a href="normalization/deep_norm/index.html">DeepNorm</a></li></ul>
<h4><a href="distillation/index.html">Distillation</a></h4>
<h4><a href="adaptive_computation/index.html">Adaptive Computation</a></h4>
<ul><li><a href="adaptive_computation/ponder_net/index.html">PonderNet</a></li></ul>
<h4><a href="uncertainty/index.html">Uncertainty</a></h4>
<ul><li><a href="uncertainty/evidence/index.html">Evidential Deep Learning to Quantify Classification Uncertainty</a></li></ul>
<h4><a href="activations/index.html">Activations</a></h4>
<ul><li><a href="activations/fta/index.html">Fuzzy Tiling Activations</a></li></ul>
<h4><a href="sampling/index.html">Language Model Sampling Techniques</a></h4>
<ul><li><a href="sampling/greedy.html">Greedy Sampling</a></li>
<li><a href="sampling/temperature.html">Temperature Sampling</a></li>
<li><a href="sampling/top_k.html">Top-k Sampling</a></li>
<li><a href="sampling/nucleus.html">Nucleus Sampling</a></li></ul>
<h4><a href="scaling/index.html">Scalable Training/Inference</a></h4>
<ul><li><a href="scaling/zero3/index.html">Zero3 memory optimizations</a></li></ul>
<h2>Highlighted Research Paper PDFs</h2>
<ul><li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2204.10628.pdf">Autoregressive Search Engines: Generating Substrings as Document Identifiers</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2203.15556.pdf">Training Compute-Optimal Large Language Models</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1910.02054.pdf">ZeRO: Memory Optimizations Toward Training Trillion Parameter Models</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2204.02311.pdf">PaLM: Scaling Language Modeling with Pathways</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/dall-e-2.pdf">Hierarchical Text-Conditional Image Generation with CLIP Latents</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2203.14465.pdf">STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2112.04426.pdf">Improving language models by retrieving from trillions of tokens</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2003.08934.pdf">NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1706.03762.pdf">Attention Is All You Need</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2006.11239.pdf">Denoising Diffusion Probabilistic Models</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2109.08668.pdf">Primer: Searching for Efficient Transformers for Language Modeling</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1803.02999.pdf">On First-Order Meta-Learning Algorithms</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2103.00020.pdf">Learning Transferable Visual Models From Natural Language Supervision</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/2109.02869.pdf">The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/1805.09801.pdf">Meta-Gradient Reinforcement Learning</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/google_maps_eta.pdf">ETA Prediction with Graph Neural Networks in Google Maps</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/ponder_net.pdf">PonderNet: Learning to Ponder</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/muzero.pdf">Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/gans_n_roses.pdf">GANs N Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/vit.pdf">An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/resnet.pdf">Deep Residual Learning for Image Recognition</a> </li>
<li><a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/blob/master/papers/distillation.pdf">Distilling the Knowledge in a Neural Network</a></li></ul>
<h3>Installation</h3>
<pre class="highlight lang-bash"><code><span></span>pip install labml-nn</code></pre>
<h3>Citing LabML</h3>
<p>If you use this for academic research, please cite it using the following BibTeX entry.</p>
<pre class="highlight lang-bibtex"><code><span></span><span class="nc">@misc</span><span class="p">{</span><span class="nl">labml</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">author</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{Varuna Jayasiri, Nipun Wijerathne}</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">title</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{labml.ai Annotated Paper Implementations}</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">year</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{2020}</span><span class="p">,</span><span class="w"></span>
<span class="w"> </span><span class="na">url</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="s">{}</span><span class="p">,</span><span class="w"></span>
<span class="p">}</span><span class="w"></span></code></pre>
</div>
<div class='code'>
<div class="highlight"><pre></pre></div>
</div>
</div>
<div class='footer'>
<a href="https://papers.labml.ai">Trending Research Papers</a>
<a href="https://labml.ai">labml.ai</a>
</div>
</div>
<script src="./interactive.js?v=1"></script>
<script>
function handleImages() {
var images = document.querySelectorAll('p>img')
for (var i = 0; i < images.length; ++i) {
handleImage(images[i])
}
}
function handleImage(img) {
img.parentElement.style.textAlign = 'center'
var modal = document.createElement('div')
modal.id = 'modal'
var modalContent = document.createElement('div')
modal.appendChild(modalContent)
var modalImage = document.createElement('img')
modalContent.appendChild(modalImage)
var span = document.createElement('span')
span.classList.add('close')
span.textContent = 'x'
modal.appendChild(span)
img.onclick = function () {
console.log('clicked')
document.body.appendChild(modal)
modalImage.src = img.src
}
span.onclick = function () {
document.body.removeChild(modal)
}
}
handleImages()
</script>
</body>
</html>