Varuna Jayasiri ef7268e89c si
2023-02-27 14:18:36 +05:30


<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<meta name="description" content="A set of PyTorch implementations/tutorials of popular gradient descent based optimizers. Currently includes Adam, AMSGrad and RAdam optimizers."/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:image:src" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta name="twitter:title" content="Optimizers"/>
<meta name="twitter:description" content="A set of PyTorch implementations/tutorials of popular gradient descent based optimizers. Currently includes Adam, AMSGrad and RAdam optimizers."/>
<meta name="twitter:site" content="@labmlai"/>
<meta name="twitter:creator" content="@labmlai"/>
<meta property="og:url" content="https://nn.labml.ai/optimizers/index.html"/>
<meta property="og:title" content="Optimizers"/>
<meta property="og:image" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta property="og:site_name" content="Optimizers"/>
<meta property="og:type" content="object"/>
<meta property="og:title" content="Optimizers"/>
<meta property="og:description" content="A set of PyTorch implementations/tutorials of popular gradient descent based optimizers. Currently includes Adam, AMSGrad and RAdam optimizers."/>
<title>Optimizers</title>
<link rel="shortcut icon" href="/icon.png"/>
<link rel="stylesheet" href="../pylit.css?v=1">
<link rel="canonical" href="https://nn.labml.ai/optimizers/index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.13.18/dist/katex.min.css" integrity="sha384-zTROYFVGOfTw7JV7KUu8udsvW2fx4lWOsCEDqhBreBwlHI4ioVRtmIvEThzJHGET" crossorigin="anonymous">
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-4V3HC8HBLH"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'G-4V3HC8HBLH');
</script>
</head>
<body>
<div id='container'>
<div id="background"></div>
<div class='section'>
<div class='docs'>
<p>
<a class="parent" href="/">home</a>
<a class="parent" href="index.html">optimizers</a>
</p>
<p>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations" target="_blank">
<img alt="Github"
src="https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social"
style="max-width:100%;"/></a>
<a href="https://twitter.com/labmlai" rel="nofollow" target="_blank">
<img alt="Twitter"
src="https://img.shields.io/twitter/follow/labmlai?style=social"
style="max-width:100%;"/></a>
</p>
<p>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/optimizers/__init__.py" target="_blank">
View code on Github</a>
</p>
</div>
</div>
<div class='section' id='section-0'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-0'>#</a>
</div>
<h1>Optimizers</h1>
<h2>Optimizer Implementations</h2>
<ul><li><a href="adam.html">Adam Optimizer</a> </li>
<li><a href="amsgrad.html">AMSGrad Optimizer</a> </li>
<li><a href="adam_warmup.html">Adam Optimizer with warmup</a> </li>
<li><a href="noam.html">Noam Optimizer</a> </li>
<li><a href="radam.html">Rectified Adam Optimizer</a> </li>
<li><a href="ada_belief.html">AdaBelief Optimizer</a></li></ul>
<p>This <a href="mnist_experiment.html">MNIST example</a> uses these optimizers. </p>
<h2>Generic Adaptive Optimizer Base Class and Weight Decay</h2>
<p>This file defines a common base class for <em>Adam</em> and its extensions. Because the shared logic is reusable, the base class lets us implement other optimizers with minimal code. </p>
<p>We also define a special class for L2 weight decay, so that we don't have to implement it inside each optimizer and can easily extend to other weight decays, such as L1, without changing the optimizers. </p>
<p>Here are some concepts on PyTorch optimizers:</p>
<h3>Parameter groups</h3>
<p>PyTorch optimizers group parameters into sets called groups. Each group can have its own hyper-parameters, such as the learning rate. </p>
<p>In most common cases there is only one group. This is what you get when you initialize your optimizer with,</p>
<pre class="highlight lang-python"><code><span></span><span class="n">Optimizer</span><span class="p">(</span><span class="n">model</span><span class="o">.</span><span class="n">parameters</span><span class="p">())</span></code></pre>
<p>You can define multiple parameter groups when initializing the optimizer:</p>
<pre class="highlight lang-python"><code><span></span><span class="n">Optimizer</span><span class="p">([{</span><span class="s1">&#39;params&#39;</span><span class="p">:</span> <span class="n">model1</span><span class="o">.</span><span class="n">parameters</span><span class="p">()},</span> <span class="p">{</span><span class="s1">&#39;params&#39;</span><span class="p">:</span> <span class="n">model2</span><span class="o">.</span><span class="n">parameters</span><span class="p">(),</span> <span class="s1">&#39;lr&#39;</span><span class="p">:</span> <span class="mi">2</span><span class="p">}])</span></code></pre>
<p>Here we pass a list of groups. Each group is a dictionary with its parameters under the key 'params'. You can specify any hyper-parameters as well. Hyper-parameters that are not defined fall back to the optimizer-level defaults. </p>
<p>You can access (and even change) these groups and their hyper-parameters through <code class="highlight"><span></span><span class="n">optimizer</span><span class="o">.</span><span class="n">param_groups</span></code>
. Most learning rate schedule implementations I've come across access this and change 'lr'. </p>
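<p>As a minimal sketch of the two points above (per-group defaults and editing <code class="highlight">param_groups</code>), the snippet below uses <code class="highlight">torch.optim.Adam</code> as a stand-in optimizer. The two <code class="highlight">nn.Linear</code> models and the halving schedule are assumptions made for illustration and are not part of this file.</p>

```python
import torch
from torch import nn

# Two small hypothetical models, used only for illustration.
model1 = nn.Linear(4, 2)
model2 = nn.Linear(2, 1)

# The first group falls back to the optimizer-level default `lr`;
# the second group overrides it.
optimizer = torch.optim.Adam(
    [{'params': model1.parameters()},
     {'params': model2.parameters(), 'lr': 2e-3}],
    lr=1e-3,
)

lrs_before = [group['lr'] for group in optimizer.param_groups]

# A hand-rolled schedule: halve the learning rate of every group.
for group in optimizer.param_groups:
    group['lr'] = group['lr'] * 0.5

lrs_after = [group['lr'] for group in optimizer.param_groups]
print(lrs_before, lrs_after)
```

<p>Learning rate schedulers typically work the same way: they write a new value into each group's 'lr' entry between steps.</p>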
<h3>States</h3>
<p>The optimizer maintains a state (a dictionary) for each parameter (a tensor) in the dictionary <code class="highlight"><span></span><span class="n">optimizer</span><span class="o">.</span><span class="n">state</span></code>
. This is where the optimizer keeps things like exponential averages. </p>
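<p>A short sketch of what that state looks like in practice, again using <code class="highlight">torch.optim.Adam</code> as a stand-in. The model and input are hypothetical, and the exact set of state keys depends on the optimizer and the PyTorch version.</p>

```python
import torch
from torch import nn

model = nn.Linear(3, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# State is created lazily, so it is empty before the first step.
n_before = len(optimizer.state)

loss = model(torch.ones(2, 3)).sum()
loss.backward()
optimizer.step()

# After one step each parameter tensor has its own state dictionary,
# keyed by the parameter itself.
weight_state = optimizer.state[model.weight]
state_keys = sorted(weight_state.keys())
print(n_before, state_keys)
```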
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">62</span><span></span><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">Tuple</span><span class="p">,</span> <span class="n">Any</span>
<span class="lineno">63</span>
<span class="lineno">64</span><span class="kn">import</span> <span class="nn">torch</span>
<span class="lineno">65</span><span class="kn">from</span> <span class="nn">torch</span> <span class="kn">import</span> <span class="n">nn</span>
<span class="lineno">66</span><span class="kn">from</span> <span class="nn">torch.optim.optimizer</span> <span class="kn">import</span> <span class="n">Optimizer</span></pre></div>
</div>
</div>
<div class='section' id='section-1'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-1'>#</a>
</div>
<h2>Base class for <em>Adam</em> and extensions</h2>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">69</span><span class="k">class</span> <span class="nc">GenericAdaptiveOptimizer</span><span class="p">(</span><span class="n">Optimizer</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-2'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-2'>#</a>
</div>
<h3>Initialize</h3>
<ul><li><code class="highlight"><span></span><span class="n">params</span></code>
 is the collection of parameters or set of parameter groups. </li>
<li><code class="highlight"><span></span><span class="n">defaults</span></code>
 is a dictionary of default hyper-parameters </li>
<li><code class="highlight"><span></span><span class="n">lr</span></code>
 is the learning rate, <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.0037em;">α</span></span></span></span></span> </li>
<li><code class="highlight"><span></span><span class="n">betas</span></code>
 is the tuple <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.05278em;">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.05278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.16666666666666666em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.05278em;">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.30110799999999993em;"><span style="top:-2.5500000000000003em;margin-left:-0.05278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span></span> </li>
<li><code class="highlight"><span></span><span class="n">eps</span></code>
 is <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.43056em;vertical-align:0em;"></span><span class="mord mathnormal">ϵ</span></span></span></span></span></li></ul>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">74</span>    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">params</span><span class="p">,</span> <span class="n">defaults</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">],</span> <span class="n">lr</span><span class="p">:</span> <span class="nb">float</span><span class="p">,</span> <span class="n">betas</span><span class="p">:</span> <span class="n">Tuple</span><span class="p">[</span><span class="nb">float</span><span class="p">,</span> <span class="nb">float</span><span class="p">],</span> <span class="n">eps</span><span class="p">:</span> <span class="nb">float</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-3'>
<div class='docs'>
<div class='section-link'>
<a href='#section-3'>#</a>
</div>
<p>Check the hyper-parameters </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">86</span>        <span class="k">if</span> <span class="ow">not</span> <span class="mf">0.0</span> <span class="o">&lt;=</span> <span class="n">lr</span><span class="p">:</span>
<span class="lineno">87</span>            <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Invalid learning rate: </span><span class="si">{</span><span class="n">lr</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
<span class="lineno">88</span>        <span class="k">if</span> <span class="ow">not</span> <span class="mf">0.0</span> <span class="o">&lt;=</span> <span class="n">eps</span><span class="p">:</span>
<span class="lineno">89</span>            <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Invalid epsilon value: </span><span class="si">{</span><span class="n">eps</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
<span class="lineno">90</span>        <span class="k">if</span> <span class="ow">not</span> <span class="mf">0.0</span> <span class="o">&lt;=</span> <span class="n">betas</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&lt;</span> <span class="mf">1.0</span><span class="p">:</span>
<span class="lineno">91</span>            <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Invalid beta parameter at index 0: </span><span class="si">{</span><span class="n">betas</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
<span class="lineno">92</span>        <span class="k">if</span> <span class="ow">not</span> <span class="mf">0.0</span> <span class="o">&lt;=</span> <span class="n">betas</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">&lt;</span> <span class="mf">1.0</span><span class="p">:</span>
<span class="lineno">93</span>            <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Invalid beta parameter at index 1: </span><span class="si">{</span><span class="n">betas</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-4'>
<div class='docs'>
<div class='section-link'>
<a href='#section-4'>#</a>
</div>
<p>Add the hyper-parameters to the defaults </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">96</span>        <span class="n">defaults</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="nb">dict</span><span class="p">(</span><span class="n">lr</span><span class="o">=</span><span class="n">lr</span><span class="p">,</span> <span class="n">betas</span><span class="o">=</span><span class="n">betas</span><span class="p">,</span> <span class="n">eps</span><span class="o">=</span><span class="n">eps</span><span class="p">))</span></pre></div>
</div>
</div>
<div class='section' id='section-5'>
<div class='docs'>
<div class='section-link'>
<a href='#section-5'>#</a>
</div>
<p>Initialize the PyTorch optimizer. This will create parameter groups with the default hyper-parameters </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">99</span>        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">params</span><span class="p">,</span> <span class="n">defaults</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-6'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-6'>#</a>
</div>
<h3>Initialize state for a given parameter tensor</h3>
<p>This should be overridden with code to initialize <code class="highlight"><span></span><span class="n">state</span></code>
 for the parameter <code class="highlight"><span></span><span class="n">param</span></code>
. <code class="highlight"><span></span><span class="n">group</span></code>
 is the parameter group dictionary to which <code class="highlight"><span></span><span class="n">param</span></code>
 belongs. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">101</span>    <span class="k">def</span> <span class="nf">init_state</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">state</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">],</span> <span class="n">group</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">],</span> <span class="n">param</span><span class="p">:</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-7'>
<div class='docs'>
<div class='section-link'>
<a href='#section-7'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">108</span>        <span class="k">pass</span></pre></div>
</div>
</div>
<div class='section' id='section-8'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-8'>#</a>
</div>
<h3>Take optimizer step on a parameter tensor</h3>
<p>This should be overridden to take the optimization step on the <code class="highlight"><span></span><span class="n">param</span></code>
 tensor <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.69444em;vertical-align:0em;"></span><span class="mord mathnormal" style="margin-right:0.02778em;">θ</span></span></span></span></span>, where <code class="highlight"><span></span><span class="n">grad</span></code>
 is the gradient for that parameter, <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.19444em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.03588em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2805559999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.03588em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span></span>, <code class="highlight"><span></span><span class="n">state</span></code>
 is the optimizer state dictionary for that parameter, and <code class="highlight"><span></span><span class="n">group</span></code>
 is the parameter group dictionary that <code class="highlight"><span></span><span class="n">param</span></code>
 belongs to. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">110</span>    <span class="k">def</span> <span class="nf">step_param</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">state</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">],</span> <span class="n">group</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">],</span> <span class="n">grad</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">,</span> <span class="n">param</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-9'>
<div class='docs'>
<div class='section-link'>
<a href='#section-9'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">119</span>        <span class="k">pass</span></pre></div>
</div>
</div>
<div class='section' id='section-10'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-10'>#</a>
</div>
<h3>Optimizer step</h3>
<p>We have created a template method that does the common work every <em>Adam</em>-based optimizer needs. </p>
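<p>To illustrate the template-method idea, here is a self-contained sketch. <code class="highlight">TemplateOptimizer</code> is a condensed, hypothetical stand-in for the base class on this page (it omits the hyper-parameter checks and the sparse-gradient guard), and <code class="highlight">MomentumSGD</code> is an invented subclass that fills in only the two hooks. Neither name exists in the library.</p>

```python
import torch
from torch import nn
from torch.optim.optimizer import Optimizer


class TemplateOptimizer(Optimizer):
    """Condensed, hypothetical stand-in for the base class described here."""

    def __init__(self, params, lr: float):
        super().__init__(params, dict(lr=lr))

    def init_state(self, state, group, param):
        pass

    def step_param(self, state, group, grad, param):
        pass

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            for param in group['params']:
                if param.grad is None:
                    continue
                state = self.state[param]
                # Lazily create the per-parameter state on first use.
                if len(state) == 0:
                    self.init_state(state, group, param)
                self.step_param(state, group, param.grad, param)
        return loss


class MomentumSGD(TemplateOptimizer):
    """Hypothetical subclass: only the two hooks are implemented."""

    def init_state(self, state, group, param):
        state['step'] = 0
        state['velocity'] = torch.zeros_like(param)

    def step_param(self, state, group, grad, param):
        state['step'] += 1
        # velocity = 0.9 * velocity + grad
        state['velocity'].mul_(0.9).add_(grad)
        # param = param - lr * velocity
        param.add_(state['velocity'], alpha=-group['lr'])


weight = nn.Parameter(torch.tensor([1.0, -2.0]))
optimizer = MomentumSGD([weight], lr=0.1)
(weight ** 2).sum().backward()  # gradient is 2 * weight
optimizer.step()
print(weight.data, optimizer.state[weight]['step'])
```

<p>The subclass never touches parameter groups, closures, or state lookup. That bookkeeping lives once in the template <code class="highlight">step</code> method.</p>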
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">121</span>    <span class="nd">@torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">()</span>
<span class="lineno">122</span>    <span class="k">def</span> <span class="nf">step</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">closure</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-11'>
<div class='docs'>
<div class='section-link'>
<a href='#section-11'>#</a>
</div>
<p>Calculate loss. </p>
<p>🤔 I'm not sure when you need this. I guess it's for when you define a function that calculates the loss, calls <code class="highlight"><span></span><span class="n">loss</span><span class="o">.</span><span class="n">backward</span></code>
 and returns the loss; instead of calling it yourself, you can pass it to <code class="highlight"><span></span><span class="n">optimizer</span><span class="o">.</span><span class="n">step</span></code>
. 🤷‍♂️ </p>
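<p>For context, PyTorch's built-in optimizers accept the same optional closure. Algorithms such as <code class="highlight">torch.optim.LBFGS</code> re-evaluate the function several times per step and therefore require one. The sketch below uses <code class="highlight">torch.optim.SGD</code> with a hypothetical model and data to show only the calling convention.</p>

```python
import torch
from torch import nn

model = nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.ones(4, 2), torch.zeros(4, 1)


def closure():
    # Clear old gradients, recompute the loss, and backpropagate.
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    return loss


# step() calls the closure and returns the loss it computed.
loss = optimizer.step(closure)
print(loss.item())
```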
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">133</span>        <span class="n">loss</span> <span class="o">=</span> <span class="kc">None</span>
<span class="lineno">134</span>        <span class="k">if</span> <span class="n">closure</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
<span class="lineno">135</span>            <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">enable_grad</span><span class="p">():</span>
<span class="lineno">136</span>                <span class="n">loss</span> <span class="o">=</span> <span class="n">closure</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-12'>
<div class='docs'>
<div class='section-link'>
<a href='#section-12'>#</a>
</div>
<p>Iterate through the parameter groups </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">139</span>        <span class="k">for</span> <span class="n">group</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">param_groups</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-13'>
<div class='docs'>
<div class='section-link'>
<a href='#section-13'>#</a>
</div>
<p>Iterate through the parameters in the parameter group </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">141</span>            <span class="k">for</span> <span class="n">param</span> <span class="ow">in</span> <span class="n">group</span><span class="p">[</span><span class="s1">&#39;params&#39;</span><span class="p">]:</span></pre></div>
</div>
</div>
<div class='section' id='section-14'>
<div class='docs'>
<div class='section-link'>
<a href='#section-14'>#</a>
</div>
<p>Skip if the parameter has no gradient </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">143</span>                <span class="k">if</span> <span class="n">param</span><span class="o">.</span><span class="n">grad</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
<span class="lineno">144</span>                    <span class="k">continue</span></pre></div>
</div>
</div>
<div class='section' id='section-15'>
<div class='docs'>
<div class='section-link'>
<a href='#section-15'>#</a>
</div>
<p>Get the gradient tensor </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">146</span>                <span class="n">grad</span> <span class="o">=</span> <span class="n">param</span><span class="o">.</span><span class="n">grad</span><span class="o">.</span><span class="n">data</span></pre></div>
</div>
</div>
<div class='section' id='section-16'>
<div class='docs'>
<div class='section-link'>
<a href='#section-16'>#</a>
</div>
<p>We don't handle sparse gradients </p>
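<p>One common source of sparse gradients is an embedding layer created with <code class="highlight">sparse=True</code>. The small sketch below, with hypothetical sizes, shows the <code class="highlight">is_sparse</code> flag that this check looks at.</p>

```python
import torch
from torch import nn

# An embedding created with sparse=True produces a sparse gradient for its weight.
embedding = nn.Embedding(10, 3, sparse=True)
embedding(torch.tensor([1, 2])).sum().backward()

is_sparse = embedding.weight.grad.is_sparse
print(is_sparse)
```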
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">148</span>                <span class="k">if</span> <span class="n">grad</span><span class="o">.</span><span class="n">is_sparse</span><span class="p">:</span>
<span class="lineno">149</span>                    <span class="k">raise</span> <span class="ne">RuntimeError</span><span class="p">(</span><span class="s1">&#39;GenericAdaptiveOptimizer does not support sparse gradients,&#39;</span>
<span class="lineno">150</span>                                       <span class="s1">&#39; please consider SparseAdam instead&#39;</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-17'>
<div class='docs'>
<div class='section-link'>
<a href='#section-17'>#</a>
</div>
<p>Get the state for the parameter </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">153</span>                <span class="n">state</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">state</span><span class="p">[</span><span class="n">param</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-18'>
<div class='docs'>
<div class='section-link'>
<a href='#section-18'>#</a>
</div>
<p>Initialize the state if state is uninitialized </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">156</span>                <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">state</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="lineno">157</span>                    <span class="bp">self</span><span class="o">.</span><span class="n">init_state</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">group</span><span class="p">,</span> <span class="n">param</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-19'>
<div class='docs'>
<div class='section-link'>
<a href='#section-19'>#</a>
</div>
<p>Take the optimization step on the parameter </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">160</span>                <span class="bp">self</span><span class="o">.</span><span class="n">step_param</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">group</span><span class="p">,</span> <span class="n">grad</span><span class="p">,</span> <span class="n">param</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-20'>
<div class='docs'>
<div class='section-link'>
<a href='#section-20'>#</a>
</div>
<p>Return the loss, calculated from closure </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">163</span>        <span class="k">return</span> <span class="n">loss</span></pre></div>
</div>
</div>
<div class='section' id='section-21'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-21'>#</a>
</div>
<h2>L2 Weight decay</h2>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">166</span><span class="k">class</span> <span class="nc">WeightDecay</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-22'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-22'>#</a>
</div>
<h3>Initialize weight decay</h3>
<ul><li><code class="highlight"><span></span><span class="n">weight_decay</span></code>
 is the decay coefficient </li>
<li><code class="highlight"><span></span><span class="n">weight_decouple</span></code>
 is a flag indicating whether to add the weight decay to the gradient or to decay the parameter directly. If added to the gradient, it goes through the normal optimizer update. </li>
<li><code class="highlight"><span></span><span class="n">absolute</span></code>
 is a flag indicating whether the weight decay coefficient is absolute. This applies when the decay is performed directly on the parameter. If this is false, the actual decay is <code class="highlight"><span></span><span class="n">weight_decay</span></code>
 * <code class="highlight"><span></span><span class="n">learning_rate</span></code>
. </li></ul>
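<p>The three cases described by these flags can be sketched with plain numbers. The values below are hypothetical, and the formulas are a reading of the descriptions above, not a copy of the implementation.</p>

```python
# Hypothetical values chosen only for illustration.
param, grad = 2.0, 0.5
lr, weight_decay = 0.1, 0.01

# Coupled (weight_decouple=False): the decay term joins the gradient and is
# then transformed by whatever update rule the optimizer applies.
coupled_grad = grad + weight_decay * param

# Decoupled, absolute coefficient: shrink the parameter by weight_decay itself.
decoupled_absolute = param * (1.0 - weight_decay)

# Decoupled, relative coefficient: the shrink factor is scaled by the learning rate.
decoupled_relative = param * (1.0 - lr * weight_decay)

print(coupled_grad, decoupled_absolute, decoupled_relative)
```

<p>With an adaptive optimizer the coupled form is rescaled by the second-moment estimate along with the rest of the gradient, whereas the decoupled forms shrink the parameter by a fixed factor regardless of gradient statistics.</p>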
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">171</span>    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">weight_decay</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.</span><span class="p">,</span> <span class="n">weight_decouple</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">True</span><span class="p">,</span> <span class="n">absolute</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">False</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-23'>
<div class='docs'>
<div class='section-link'>
<a href='#section-23'>#</a>
</div>
<p>Check hyper-parameters </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">184</span>        <span class="k">if</span> <span class="ow">not</span> <span class="mf">0.0</span> <span class="o">&lt;=</span> <span class="n">weight_decay</span><span class="p">:</span>
<span class="lineno">185</span>            <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="s2">&quot;Invalid weight_decay value: </span><span class="si">{</span><span class="n">weight_decay</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
<span class="lineno">186</span>
<span class="lineno">187</span>        <span class="bp">self</span><span class="o">.</span><span class="n">absolute</span> <span class="o">=</span> <span class="n">absolute</span>
<span class="lineno">188</span>        <span class="bp">self</span><span class="o">.</span><span class="n">weight_decouple</span> <span class="o">=</span> <span class="n">weight_decouple</span>
<span class="lineno">189</span>        <span class="bp">self</span><span class="o">.</span><span class="n">weight_decay</span> <span class="o">=</span> <span class="n">weight_decay</span></pre></div>
</div>
</div>
<div class='section' id='section-24'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-24'>#</a>
</div>
<p>Return defaults for parameter groups</p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">191</span>    <span class="k">def</span> <span class="nf">defaults</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-25'>
<div class='docs'>
<div class='section-link'>
<a href='#section-25'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">195</span>        <span class="k">return</span> <span class="nb">dict</span><span class="p">(</span><span class="n">weight_decay</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">weight_decay</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-26'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-26'>#</a>
</div>
<h3>Perform weight decay and return the gradient</h3>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">197</span> <span class="k">def</span> <span class="fm">__call__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">param</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">,</span> <span class="n">grad</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">,</span> <span class="n">group</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">any</span><span class="p">]):</span></pre></div>
</div>
</div>
<div class='section' id='section-27'>
<div class='docs'>
<div class='section-link'>
<a href='#section-27'>#</a>
</div>
<p>If we are applying the decay directly to the parameter (decoupled weight decay)</p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">203</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">weight_decouple</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-28'>
<div class='docs'>
<div class='section-link'>
<a href='#section-28'>#</a>
</div>
<p>If the weight decay coefficient is absolute, that is, not scaled by the learning rate</p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">205</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">absolute</span><span class="p">:</span>
<span class="lineno">206</span> <span class="n">param</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">mul_</span><span class="p">(</span><span class="mf">1.0</span> <span class="o">-</span> <span class="n">group</span><span class="p">[</span><span class="s1">&#39;weight_decay&#39;</span><span class="p">])</span></pre></div>
</div>
</div>
<div class='section' id='section-29'>
<div class='docs'>
<div class='section-link'>
<a href='#section-29'>#</a>
</div>
<p>Otherwise, scale the weight decay coefficient by the learning rate</p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">208</span> <span class="k">else</span><span class="p">:</span>
<span class="lineno">209</span> <span class="n">param</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">mul_</span><span class="p">(</span><span class="mf">1.0</span> <span class="o">-</span> <span class="n">group</span><span class="p">[</span><span class="s1">&#39;lr&#39;</span><span class="p">]</span> <span class="o">*</span> <span class="n">group</span><span class="p">[</span><span class="s1">&#39;weight_decay&#39;</span><span class="p">])</span></pre></div>
</div>
</div>
<div class='section' id='section-30'>
<div class='docs'>
<div class='section-link'>
<a href='#section-30'>#</a>
</div>
<p>Return the unmodified gradient</p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">211</span> <span class="k">return</span> <span class="n">grad</span>
<span class="lineno">212</span> <span class="k">else</span><span class="p">:</span>
<span class="lineno">213</span> <span class="k">if</span> <span class="n">group</span><span class="p">[</span><span class="s1">&#39;weight_decay&#39;</span><span class="p">]</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-31'>
<div class='docs'>
<div class='section-link'>
<a href='#section-31'>#</a>
</div>
<p>Add the weight decay to the gradient and return the modified gradient</p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">215</span> <span class="k">return</span> <span class="n">grad</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">param</span><span class="o">.</span><span class="n">data</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="n">group</span><span class="p">[</span><span class="s1">&#39;weight_decay&#39;</span><span class="p">])</span>
<span class="lineno">216</span> <span class="k">else</span><span class="p">:</span>
<span class="lineno">217</span> <span class="k">return</span> <span class="n">grad</span></pre></div>
</div>
</div>
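<div class='section'>
<div class='docs'>
<p>The three branches above can be checked by hand on a single scalar parameter. This is a minimal sketch that assumes plain Python floats in place of tensors. The function <code>apply_weight_decay</code> and its argument names are hypothetical stand-ins for <code>__call__</code>, and it returns the new parameter value instead of updating it in place.</p>
</div>
<div class='code'>
```python
def apply_weight_decay(param, grad, lr, weight_decay,
                       weight_decouple=True, absolute=False):
    """Return (new_param, new_grad) for one scalar parameter.

    Hypothetical scalar version of the branching shown above.
    """
    if weight_decouple:
        if absolute:
            # Shrink the parameter by the raw coefficient
            factor = 1.0 - weight_decay
        else:
            # Shrink the parameter by the coefficient scaled by the learning rate
            factor = 1.0 - lr * weight_decay
        # The gradient is left untouched in decoupled mode
        return param * factor, grad
    if weight_decay != 0:
        # L2-style decay: fold the decay term into the gradient
        return param, grad + weight_decay * param
    return param, grad


# Illustrative values only
decoupled = apply_weight_decay(2.0, 0.5, lr=0.1, weight_decay=0.01)
decoupled_absolute = apply_weight_decay(2.0, 0.5, lr=0.1, weight_decay=0.01,
                                        absolute=True)
l2_style = apply_weight_decay(2.0, 0.5, lr=0.1, weight_decay=0.01,
                              weight_decouple=False)
```
</div>
</div>
<div class='section'>
<div class='docs'>
<p>In the decoupled modes only the parameter changes, so an adaptive optimizer's moment estimates never see the decay term. In the L2-style mode the decay is added to the gradient and passes through whatever scaling the optimizer applies afterwards.</p>
</div>
</div>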
<div class='footer'>
<a href="https://papers.labml.ai">Trending Research Papers</a>
<a href="https://labml.ai">labml.ai</a>
</div>
</div>
<script src="../interactive.js?v=1"></script>
</body>
</html>