Files
Varuna Jayasiri ef7268e89c si
2023-02-27 14:18:36 +05:30

1851 lines
127 KiB
HTML
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html lang="si">
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<meta name="description" content="මෙය PyTorch හි ලියා ඇති ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය ක්රියාත්මක කිරීමයි."/>
<meta name="twitter:card" content="summary"/>
<meta name="twitter:image:src" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta name="twitter:title" content="ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය"/>
<meta name="twitter:description" content="මෙය PyTorch හි ලියා ඇති ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය ක්රියාත්මක කිරීමයි."/>
<meta name="twitter:site" content="@labmlai"/>
<meta name="twitter:creator" content="@labmlai"/>
<meta property="og:url" content="https://nn.labml.ai/scaling/zero3/index.html"/>
<meta property="og:title" content="ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය"/>
<meta property="og:image" content="https://avatars1.githubusercontent.com/u/64068543?s=400&amp;v=4"/>
<meta property="og:site_name" content="ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය"/>
<meta property="og:type" content="object"/>
<meta property="og:title" content="ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය"/>
<meta property="og:description" content="මෙය PyTorch හි ලියා ඇති ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය ක්රියාත්මක කිරීමයි."/>
<title>ශුන්ය-ඩීපී මතක ප්රශස්තිකරණය</title>
<link rel="shortcut icon" href="/icon.png"/>
<link rel="stylesheet" href="../../pylit.css?v=1">
<link rel="canonical" href="https://nn.labml.ai/scaling/zero3/index.html"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.13.18/dist/katex.min.css" integrity="sha384-zTROYFVGOfTw7JV7KUu8udsvW2fx4lWOsCEDqhBreBwlHI4ioVRtmIvEThzJHGET" crossorigin="anonymous">
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-4V3HC8HBLH"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag() {
dataLayer.push(arguments);
}
gtag('js', new Date());
gtag('config', 'G-4V3HC8HBLH');
</script>
</head>
<body>
<div id='container'>
<div id="background"></div>
<div class='section'>
<div class='docs'>
<p>
<a class="parent" href="/">home</a>
<a class="parent" href="../index.html">scaling</a>
<a class="parent" href="index.html">zero3</a>
</p>
<p>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations" target="_blank">
<img alt="Github"
src="https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social"
style="max-width:100%;"/></a>
<a href="https://twitter.com/labmlai" rel="nofollow" target="_blank">
<img alt="Twitter"
src="https://img.shields.io/twitter/follow/labmlai?style=social"
style="max-width:100%;"/></a>
</p>
<p>
<a href="https://github.com/labmlai/annotated_deep_learning_paper_implementations/tree/master/labml_nn/scaling/zero3/__init__.py" target="_blank">
View code on Github</a>
</p>
</div>
</div>
<div class='section' id='section-0'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-0'>#</a>
</div>
<h1>ශුන්ය-ඩීපීමතක ප්රශස්තිකරණය</h1>
<p>මෙයකඩදාසි වල හඳුන්වා දුන් ශුන්ය-ඩීපී ක්රියාත්මක කිරීමකි <a href="https://papers.labml.ai/paper/1910.02054">ශුන්ය: පුහුණුව සඳහා මතක ප්රශස්තිකරණය ට්රිලියනයක් පරාමිති ආකෘති</a>,</p>
<p>එයප්රශස්තිකරණ තත්වයේ, ශ්රේණියේ සහ පරාමිතීන් බහු උපාංග/නෝඩ් වලට තබා ගනී. එය මුල් ආකෘතියට මතක පරිභෝජනය අඩු කරයි, පරාමිති ගණන <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord coloredeq eqf" style=""><span class="mord" style="">Ψ</span></span></span></span></span></span> කොතැනද, කැබලි ගණන ද <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord coloredeq eqi" style=""><span class="mord mathnormal" style="margin-right:0.07153em">K</span></span></span></span></span></span> , <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:1.4608599999999998em;vertical-align:-0.4508599999999999em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.01em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight coloredeq eqg" style=""><span class="mord mtight" style=""><span class="mord mathnormal mtight" style="margin-right:0.10903em">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3487714285714287em;margin-left:-0.10903em;margin-right:0.07142857142857144em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight" style=""><span class="mord mathnormal mtight" style="">d</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15122857142857138em;"><span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.485em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mtight coloredeq eqe" style=""><span class="mord mtight" style="">2</span><span class="mbin mtight" style="">+</span><span class="mord mtight" style="">2</span></span><span class="mbin mtight">+</span><span class="mord mtight coloredeq eqi" style=""><span class="mord mathnormal mtight" style="margin-right:0.07153em">K</span></span><span class="mclose mtight">)</span><span class="mord mtight coloredeq eqf" style=""><span class="mord mtight" style="">Ψ</span></span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.4508599999999999em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span></span> <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.83333em;vertical-align:-0.15em;"></span><span class="mord coloredeq eqg" style=""><span class="mord" style=""><span class="mord mathnormal" style="margin-right:0.10903em">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.33610799999999996em;"><span style="top:-2.5500000000000003em;margin-left:-0.10903em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight" style=""><span class="mord mathnormal mtight" style="">d</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span></span></span> පරාමිතියකට ප්රශස්තිකරණ බයිට් ගණන. <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span><span class="mord coloredeq eqe" style=""><span class="mord" style="">2</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin" style="">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mord" style="">2</span></span></span></span></span></span> බිට් 16-නිරවද්යතාව උපකල්පනය කරන පරාමිතිය සහ ශ්රේණියේ මතකය; එනම් පරාමිතියට බයිට් 2 ක් සහ අනුක්රමික ය. <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord coloredeq eqi" style=""><span class="mord mathnormal" style="margin-right:0.07153em">K</span></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord">12</span></span></span></span></span> සඳහා ආදම් ප්රශස්තකරණය එය පරාමිතීන්ගේ පිටපතක් සහ fp32 හි පරාමිතියකට අවස්ථා දෙකක් පවත්වා ගෙන යන බැවිනි. </p>
<p>ශුන්ය-ඩීපීහි සන්නිවේදන පරිමාව වේ <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathcal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">3</span><span class="mord coloredeq eqf" style=""><span class="mord" style="">Ψ</span></span><span class="mclose">)</span></span></span></span></span>. සංසන්දනය සඳහා දත්ත-සමාන්තර පුහුණුව සන්නිවේදන පරිමාවක් <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathcal" style="margin-right:0.02778em;">O</span><span class="mopen">(</span><span class="mord">2</span><span class="mord coloredeq eqf" style=""><span class="mord" style="">Ψ</span></span><span class="mclose">)</span></span></span></span></span>ඇත. </p>
<p>මෙයනම් කර ඇතත් <code class="highlight"><span></span><span class="n">Zero3</span></code>
, අප ක්රියාත්මක කර ඇත්තේ එහි ශුන්ය-ඩීපී කොටස පමණක් මිස අවශේෂ මතක පරිභෝජනය ඉලක්ක කරන Zero-R මතක ප්රශස්තිකරණයන් නොවේ. පිටත ක්රියාත්මක කිරීම සඳහා පුහුණුව සඳහා සහාය වන්නේ පරාමිතීන්ගේ උප කුලකයක් පමණි. </p>
<p>මෙමක්රියාත්මක කිරීම <a href="https://fairscale.readthedocs.io/en/stable/api/nn/fsdp.html">Fairscale FSDP</a>විසින් දේවානුභාවයෙන්. </p>
<p><a href="finetune_neox.html">ZERO-DP මතක ප්රශස්තිකරණය භාවිතා කරමින් GPT Neox මනාව සකස් කිරීම සඳහා ස්ක්රිප්ට් එකක් මෙන්න</a> . </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">32</span><span></span><span class="kn">import</span> <span class="nn">functools</span>
<span class="lineno">33</span><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Optional</span><span class="p">,</span> <span class="n">Tuple</span>
<span class="lineno">34</span>
<span class="lineno">35</span><span class="kn">import</span> <span class="nn">torch</span>
<span class="lineno">36</span><span class="kn">import</span> <span class="nn">torch.distributed</span> <span class="k">as</span> <span class="nn">dist</span>
<span class="lineno">37</span><span class="kn">from</span> <span class="nn">torch</span> <span class="kn">import</span> <span class="n">nn</span></pre></div>
</div>
</div>
<div class='section' id='section-1'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-1'>#</a>
</div>
<h2>Zero3ස්ථරය</h2>
<p>ආකෘතියේසෑම ස්ථරයක්ම (හෝ අනුයාත ස්ථර කිහිපයක එකතුවක්) මෙම මොඩියුලය තුළ ඔතා තිබිය යුතුය. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">40</span><span class="k">class</span> <span class="nc">Zero3Layer</span><span class="p">(</span><span class="n">nn</span><span class="o">.</span><span class="n">Module</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-2'>
<div class='docs'>
<div class='section-link'>
<a href='#section-2'>#</a>
</div>
<p>සෑමකොටසක්ම පරාමිතීන් <code class="highlight"><span></span><span class="n">chunk</span></code>
ලැයිස්තුවේ තබා ගනී. පුහුණු කළ හැකි පරාමිතීන් සඳහා <code class="highlight"><span></span><span class="n">chunk</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span></code>
<code class="highlight"><span></span><span class="n">chunk</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span></code>
වන අතර ස්ථාවර පරාමිතීන් සඳහා වේ. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">49</span> <span class="n">chunk</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-3'>
<div class='docs'>
<div class='section-link'>
<a href='#section-3'>#</a>
</div>
<p><code class="highlight"><span></span><span class="n">chunk</span></code>
ලැයිස්තුවේ ඇති කුට්ටි වල ප්රමාණය මෙයයි. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">51</span> <span class="n">chunk_size</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-4'>
<div class='docs'>
<div class='section-link'>
<a href='#section-4'>#</a>
</div>
<p>පළමුකුට්ටිය පුහුණු කළ හැකි පරාමිතීන් සඳහා වේ. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">53</span> <span class="n">TRAINING_PARAMS_IDX</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
</div>
</div>
<div class='section' id='section-5'>
<div class='docs'>
<div class='section-link'>
<a href='#section-5'>#</a>
</div>
<p>පුහුණුකළ හැකි සහ ස්ථාවර පරාමිතීන් ලෙස ලැයිස්තු වලට බෙදී ඇති පරාමිති ලැයිස්තුව මෙයයි. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">56</span> <span class="n">param_refs</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">]]</span></pre></div>
</div>
</div>
<div class='section' id='section-6'>
<div class='docs'>
<div class='section-link'>
<a href='#section-6'>#</a>
</div>
<p>පරාමිතීන්පිහාටු කිරීමට CUDA ධාරාව </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">59</span> <span class="n">fetch_stream</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">Stream</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-7'>
<div class='docs'>
<div class='section-link'>
<a href='#section-7'>#</a>
</div>
<p>උපස්ථ/සමුච්චයඵලය අනුක්රමික කිරීමට CDA ඇළ </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">61</span> <span class="n">backup_stream</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">Stream</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-8'>
<div class='docs'>
<div class='section-link'>
<a href='#section-8'>#</a>
</div>
<p>මෙමස්ථරයට පෙර ස්ථර ලැයිස්තුව </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">63</span> <span class="n">prev_layer</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="s1">&#39;Zero3Layer&#39;</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-9'>
<div class='docs'>
<div class='section-link'>
<a href='#section-9'>#</a>
</div>
<p>මෙමස්ථරයට පසුව ස්ථර ලැයිස්තුව </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">65</span> <span class="n">next_layer</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="s1">&#39;Zero3Layer&#39;</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-10'>
<div class='docs'>
<div class='section-link'>
<a href='#section-10'>#</a>
</div>
<p>වත්මන්ස්ථරයේ පිහිටීම; ලඝු-සටහන් නිදොස්කරණය සඳහා මෙය භාවිතා කරයි </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">67</span> <span class="n">layer_idx</span><span class="p">:</span> <span class="nb">int</span></pre></div>
</div>
</div>
<div class='section' id='section-11'>
<div class='docs'>
<div class='section-link'>
<a href='#section-11'>#</a>
</div>
<p>පරාමිතීන්ලබා ගෙන තිබේද </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">70</span> <span class="n">is_fetched</span><span class="p">:</span> <span class="nb">bool</span></pre></div>
</div>
</div>
<div class='section' id='section-12'>
<div class='docs'>
<div class='section-link'>
<a href='#section-12'>#</a>
</div>
<p>ස්ථරයේඋපාංගය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">73</span> <span class="n">device</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span></pre></div>
</div>
</div>
<div class='section' id='section-13'>
<div class='docs'>
<div class='section-link'>
<a href='#section-13'>#</a>
</div>
<p>ස්ථරයේදත්ත වර්ගය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">75</span> <span class="n">dtype</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">dtype</span></pre></div>
</div>
</div>
<div class='section' id='section-14'>
<div class='docs'>
<div class='section-link'>
<a href='#section-14'>#</a>
</div>
<p>එවියයුතු මොඩියුලය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">77</span> <span class="n">module</span><span class="p">:</span> <span class="n">nn</span><span class="o">.</span><span class="n">Module</span></pre></div>
</div>
</div>
<div class='section' id='section-15'>
<div class='docs'>
<div class='section-link'>
<a href='#section-15'>#</a>
</div>
<p>නෝඩ්/උපාංගගණන දත්ත හරහා තියුණු වේ </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">79</span> <span class="n">world_size</span><span class="p">:</span> <span class="nb">int</span></pre></div>
</div>
</div>
<div class='section' id='section-16'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-16'>#</a>
</div>
<ul><li><code class="highlight"><span></span><span class="n">module</span></code>
ඔතා ගත යුතු මොඩියුලය. </li>
<li><code class="highlight"><span></span><span class="n">rank</span></code>
වත්මන් නෝඩයේ නිලය. </li>
<li><code class="highlight"><span></span><span class="n">world_size</span></code>
දත්ත හරහා තියුණු කර ඇති නෝඩ්/උපාංග ගණන. </li>
<li><code class="highlight"><span></span><span class="n">device</span></code>
ස්ථරයේ උපාංගය. </li>
<li><code class="highlight"><span></span><span class="n">dtype</span></code>
ස්ථරයේ දත්ත වර්ගය. </li></ul>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">81</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">module</span><span class="p">:</span> <span class="n">nn</span><span class="o">.</span><span class="n">Module</span><span class="p">,</span> <span class="n">rank</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">world_size</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">device</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">dtype</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-17'>
<div class='docs'>
<div class='section-link'>
<a href='#section-17'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">89</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-18'>
<div class='docs'>
<div class='section-link'>
<a href='#section-18'>#</a>
</div>
<p>ගුණාංගආරම්භ කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">92</span> <span class="bp">self</span><span class="o">.</span><span class="n">device</span> <span class="o">=</span> <span class="n">device</span>
<span class="lineno">93</span> <span class="bp">self</span><span class="o">.</span><span class="n">dtype</span> <span class="o">=</span> <span class="n">dtype</span>
<span class="lineno">94</span> <span class="bp">self</span><span class="o">.</span><span class="n">module</span> <span class="o">=</span> <span class="n">module</span>
<span class="lineno">95</span> <span class="bp">self</span><span class="o">.</span><span class="n">prev_layer</span> <span class="o">=</span> <span class="p">[]</span>
<span class="lineno">96</span> <span class="bp">self</span><span class="o">.</span><span class="n">next_layer</span> <span class="o">=</span> <span class="p">[]</span>
<span class="lineno">97</span> <span class="bp">self</span><span class="o">.</span><span class="n">is_fetched</span> <span class="o">=</span> <span class="kc">False</span>
<span class="lineno">98</span> <span class="bp">self</span><span class="o">.</span><span class="n">world_size</span> <span class="o">=</span> <span class="n">world_size</span>
<span class="lineno">99</span> <span class="bp">self</span><span class="o">.</span><span class="n">layer_idx</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>
<span class="lineno">100</span> <span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span> <span class="o">=</span> <span class="kc">None</span>
<span class="lineno">101</span> <span class="bp">self</span><span class="o">.</span><span class="n">backup_stream</span> <span class="o">=</span> <span class="kc">None</span>
<span class="lineno">102</span>
<span class="lineno">103</span> <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">():</span></pre></div>
</div>
</div>
<div class='section' id='section-19'>
<div class='docs'>
<div class='section-link'>
<a href='#section-19'>#</a>
</div>
<p>ස්ථරයේසියලු පරාමිතීන් එකතු කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">105</span> <span class="n">all_param_refs</span> <span class="o">=</span> <span class="p">[</span><span class="n">p</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">parameters</span><span class="p">()]</span></pre></div>
</div>
</div>
<div class='section' id='section-20'>
<div class='docs'>
<div class='section-link'>
<a href='#section-20'>#</a>
</div>
<p>පරාමිතීන්ගේහැඩය ගබඩා කර තබන්න, මන්ද ඒවා ප්රතිනිර්මාණය කිරීමට අපට පසුව අවශ්ය වේ </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">108</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">all_param_refs</span><span class="p">:</span>
<span class="lineno">109</span> <span class="n">p</span><span class="o">.</span><span class="n">_orig_shape</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">shape</span></pre></div>
</div>
</div>
<div class='section' id='section-21'>
<div class='docs'>
<div class='section-link'>
<a href='#section-21'>#</a>
</div>
<p>සියලුමපරාමිතීන් එකම වර්ගයේ තිබිය යුතුය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">112</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">all_param_refs</span><span class="p">:</span>
<span class="lineno">113</span> <span class="k">assert</span> <span class="n">p</span><span class="o">.</span><span class="n">dtype</span> <span class="o">==</span> <span class="n">dtype</span><span class="p">,</span> <span class="s2">&quot;All parameters should have same dtype&quot;</span></pre></div>
</div>
</div>
<div class='section' id='section-22'>
<div class='docs'>
<div class='section-link'>
<a href='#section-22'>#</a>
</div>
<p>පුහුණුකළ හැකි සහ ස්ථාවර ලෙස වෙනම පරාමිතීන් </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">116</span> <span class="bp">self</span><span class="o">.</span><span class="n">param_refs</span> <span class="o">=</span> <span class="p">[[</span><span class="n">p</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">all_param_refs</span> <span class="k">if</span> <span class="n">p</span><span class="o">.</span><span class="n">requires_grad</span><span class="p">],</span>
<span class="lineno">117</span> <span class="p">[</span><span class="n">p</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">all_param_refs</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">p</span><span class="o">.</span><span class="n">requires_grad</span><span class="p">]]</span>
<span class="lineno">118</span> <span class="k">del</span> <span class="n">all_param_refs</span></pre></div>
</div>
</div>
<div class='section' id='section-23'>
<div class='docs'>
<div class='section-link'>
<a href='#section-23'>#</a>
</div>
<p>මෙම <code class="highlight"><span></span><span class="n">rank</span> <span class="o">=</span> <span class="mi">0</span></code>
නෝඩය එක් එක් උපාංගය/නෝඩය ගබඩා කළ යුතු ප්රමාණය ගණනය කරනු ඇත, ඒ අනුව පරාමිතීන් බෙදා හරිනු ඇත. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">122</span> <span class="k">if</span> <span class="n">rank</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-24'>
<div class='docs'>
<div class='section-link'>
<a href='#section-24'>#</a>
</div>
<p>ඒකාබද්ධකිරීම සහ පෑඩ් පුහුණු කළ හැකි (<code class="highlight"><span></span><span class="n">merged_params</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span></code>
) සහ ස්ථාවර (<code class="highlight"><span></span><span class="n">merged_params</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span></code>
) පරාමිතීන් </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">124</span> <span class="n">merged_params</span> <span class="o">=</span> <span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">_merge_and_pad_params</span><span class="p">(</span><span class="n">ps</span><span class="p">)</span> <span class="k">for</span> <span class="n">ps</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">param_refs</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-25'>
<div class='docs'>
<div class='section-link'>
<a href='#section-25'>#</a>
</div>
<p>පුහුණුකළ හැකි සහ ස්ථාවර පරාම් ප්රමාණ ගණනය කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">126</span> <span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span> <span class="o">=</span> <span class="p">[(</span><span class="nb">len</span><span class="p">(</span><span class="n">p</span><span class="p">)</span> <span class="o">//</span> <span class="n">world_size</span> <span class="k">if</span> <span class="n">p</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="k">else</span> <span class="mi">0</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">merged_params</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-26'>
<div class='docs'>
<div class='section-link'>
<a href='#section-26'>#</a>
</div>
<p>ප්රමාණවිකාශනය කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">128</span> <span class="n">dist</span><span class="o">.</span><span class="n">broadcast</span><span class="p">(</span><span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">),</span> <span class="n">src</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="lineno">129</span> <span class="k">else</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-27'>
<div class='docs'>
<div class='section-link'>
<a href='#section-27'>#</a>
</div>
<p>ප්රමාණලබා ගැනීම සඳහා හිස් ටෙන්සරයක් සාදන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">131</span> <span class="n">chunk_size</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">tensor</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">],</span> <span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-28'>
<div class='docs'>
<div class='section-link'>
<a href='#section-28'>#</a>
</div>
<p>ප්රමාණලබා ගන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">133</span> <span class="n">dist</span><span class="o">.</span><span class="n">broadcast</span><span class="p">(</span><span class="n">chunk_size</span><span class="p">,</span> <span class="n">src</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
<span class="lineno">134</span> <span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span> <span class="o">=</span> <span class="n">chunk_size</span><span class="o">.</span><span class="n">tolist</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-29'>
<div class='docs'>
<div class='section-link'>
<a href='#section-29'>#</a>
</div>
<p>පුහුණුකළ හැකි (<code class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">chunk</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span></code>
) සහ ස්ථාවර (<code class="highlight"><span></span><span class="bp">self</span><span class="o">.</span><span class="n">chunk</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span></code>
) පරාමිතීන් සඳහා වත්මන් උපාංගයේ/නෝඩයේ ගබඩා කිරීම සඳහා පරාමිතීන් සාදන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">138</span> <span class="bp">self</span><span class="o">.</span><span class="n">chunk</span> <span class="o">=</span> <span class="p">[</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_empty</span><span class="p">((</span><span class="n">s</span><span class="p">,)),</span> <span class="n">requires_grad</span><span class="o">=</span><span class="n">i</span> <span class="o">==</span> <span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">)</span>
<span class="lineno">139</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">s</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">)]</span></pre></div>
</div>
</div>
<div class='section' id='section-30'>
<div class='docs'>
<div class='section-link'>
<a href='#section-30'>#</a>
</div>
<p>පුහුණුකළ හැකි සහ ස්ථාවර පරාමිතීන් ඒකාබද්ධ කිරීම සඳහා හිස් ටෙන්සරයක් </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">142</span> <span class="n">chunk</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_empty</span><span class="p">((</span><span class="nb">sum</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">),))</span>
<span class="lineno">143</span>
<span class="lineno">144</span> <span class="k">if</span> <span class="n">rank</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-31'>
<div class='docs'>
<div class='section-link'>
<a href='#section-31'>#</a>
</div>
<p>පුහුණුකළ හැකි සහ ස්ථාවර පරාම් දෙකම සංයුක්ත කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">146</span> <span class="n">all_params</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">cat</span><span class="p">([</span><span class="n">p</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="n">world_size</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">merged_params</span><span class="p">],</span> <span class="n">dim</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="lineno">147</span> <span class="k">del</span> <span class="n">merged_params</span></pre></div>
</div>
</div>
<div class='section' id='section-32'>
<div class='docs'>
<div class='section-link'>
<a href='#section-32'>#</a>
</div>
<p>සියලුමනෝඩ්/උපාංග වෙත ඒවා විසුරුවා හරින්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">150</span> <span class="n">dist</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">chunk</span><span class="p">,</span> <span class="nb">list</span><span class="p">(</span><span class="n">all_params</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">))))</span>
<span class="lineno">151</span> <span class="k">del</span> <span class="n">all_params</span>
<span class="lineno">152</span> <span class="k">else</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-33'>
<div class='docs'>
<div class='section-link'>
<a href='#section-33'>#</a>
</div>
<p>පරාමිතීන්ලබා ගන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">154</span> <span class="n">dist</span><span class="o">.</span><span class="n">scatter</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-34'>
<div class='docs'>
<div class='section-link'>
<a href='#section-34'>#</a>
</div>
<p>කුට්ටියදත්ත එකතු කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">157</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">chunk</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">)</span>
<span class="lineno">158</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">c</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">chunk</span><span class="p">):</span>
<span class="lineno">159</span> <span class="bp">self</span><span class="o">.</span><span class="n">chunk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">data</span><span class="p">[:]</span> <span class="o">=</span> <span class="n">c</span>
<span class="lineno">160</span> <span class="k">del</span> <span class="n">chunk</span></pre></div>
</div>
</div>
<div class='section' id='section-35'>
<div class='docs'>
<div class='section-link'>
<a href='#section-35'>#</a>
</div>
<p>සාමාන්යපරාමිතීන් පිරිසිදු කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">163</span> <span class="bp">self</span><span class="o">.</span><span class="n">_cleanup_params</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-36'>
<div class='docs'>
<div class='section-link'>
<a href='#section-36'>#</a>
</div>
<p>පසුගාමීකොක්කක් එක් කරන්න. මොඩියුලයට සාපේක්ෂව අනුක්රමික ගණනය කළ විට මෙය කැඳවනු ලැබේ. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">166</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backward_hook_ref</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">register_full_backward_hook</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_backward_hook</span><span class="p">)</span> <span class="c1"># type: ignore</span></pre></div>
</div>
</div>
<div class='section' id='section-37'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-37'>#</a>
</div>
<h4>සියලුපරාමිතීන් ඒකාබද්ධ කර එය බෙදිය හැකි වන පරිදි පෑඩ් කරන්න <code class="highlight"><span></span><span class="n">world_size</span></code>
. </h4>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">168</span> <span class="k">def</span> <span class="nf">_merge_and_pad_params</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">params</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-38'>
<div class='docs'>
<div class='section-link'>
<a href='#section-38'>#</a>
</div>
<p>මුළුපරාමිතීන් ගණන </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">173</span> <span class="n">size</span> <span class="o">=</span> <span class="nb">sum</span><span class="p">(</span><span class="n">p</span><span class="o">.</span><span class="n">shape</span><span class="o">.</span><span class="n">numel</span><span class="p">()</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">params</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-39'>
<div class='docs'>
<div class='section-link'>
<a href='#section-39'>#</a>
</div>
<p>එයබෙදිය නොහැකි නම් <code class="highlight"><span></span><span class="n">world_size</span></code>
, එය පෑඩ් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">176</span> <span class="k">if</span> <span class="n">size</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">world_size</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
<span class="lineno">177</span> <span class="n">padding_fixed</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">world_size</span> <span class="o">-</span> <span class="p">(</span><span class="n">size</span> <span class="o">%</span> <span class="bp">self</span><span class="o">.</span><span class="n">world_size</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-40'>
<div class='docs'>
<div class='section-link'>
<a href='#section-40'>#</a>
</div>
<p>එසේනොමැති නම්, පෑඩ් කිරීමට අවශ්ය නැත </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">179</span> <span class="k">else</span><span class="p">:</span>
<span class="lineno">180</span> <span class="n">padding_fixed</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
</div>
</div>
<div class='section' id='section-41'>
<div class='docs'>
<div class='section-link'>
<a href='#section-41'>#</a>
</div>
<p>හිස්පුරවන tensor සාදන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">182</span> <span class="n">padding</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_empty</span><span class="p">((</span><span class="n">padding_fixed</span><span class="p">,))</span></pre></div>
</div>
</div>
<div class='section' id='section-42'>
<div class='docs'>
<div class='section-link'>
<a href='#section-42'>#</a>
</div>
<p>සියලුමපරාමිතීන් සංයුක්ත කර එය පෑඩ් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">184</span> <span class="k">return</span> <span class="n">torch</span><span class="o">.</span><span class="n">cat</span><span class="p">([</span><span class="n">p</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">params</span><span class="p">]</span> <span class="o">+</span> <span class="p">[</span><span class="n">padding</span><span class="p">],</span> <span class="n">dim</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-43'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-43'>#</a>
</div>
<h3>පරාමිතීන්ගේපුහුණු කළ හැකි කූඩු/මඩු ලබා ගන්න. </h3>
<p>වත්මන්නෝඩයේ ප්රශස්තකරණය වෙත අප යන්නේ මෙයයි. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">186</span> <span class="k">def</span> <span class="nf">get_trainable_chunk</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">List</span><span class="p">[</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">]:</span></pre></div>
</div>
</div>
<div class='section' id='section-44'>
<div class='docs'>
<div class='section-link'>
<a href='#section-44'>#</a>
</div>
<p>පුහුණුකළ හැකි පරාමිතීන් නොමැති නම් ආපසු සහ හිස් ලැයිස්තුව </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">193</span> <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">])</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="lineno">194</span> <span class="k">return</span> <span class="p">[]</span></pre></div>
</div>
</div>
<div class='section' id='section-45'>
<div class='docs'>
<div class='section-link'>
<a href='#section-45'>#</a>
</div>
<p>පුහුණුකළ හැකි කුට්ටිය ලැයිස්තුවක් ලෙස ආපසු එවන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">197</span> <span class="k">return</span> <span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">]]</span></pre></div>
</div>
</div>
<div class='section' id='section-46'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-46'>#</a>
</div>
<h4>දීඇති හැඩයේ හිස් ටෙන්සරයක් සාදන්න. </h4>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">199</span> <span class="k">def</span> <span class="nf">_empty</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">shape</span><span class="p">:</span> <span class="n">Tuple</span><span class="p">[</span><span class="nb">int</span><span class="p">,</span> <span class="o">...</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-47'>
<div class='docs'>
<div class='section-link'>
<a href='#section-47'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">203</span> <span class="k">return</span> <span class="n">torch</span><span class="o">.</span><span class="n">empty</span><span class="p">(</span><span class="n">shape</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">dtype</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-48'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-48'>#</a>
</div>
<h4>පරාමිතිදත්ත පිරිසිදු කරන්න</h4>
<p>මෙයස්ථර පරාමිතීන් විසින් භාවිතා කරන සියලුම මතකය මුදා හරිනු ඇත. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">205</span> <span class="nd">@torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">()</span>
<span class="lineno">206</span> <span class="k">def</span> <span class="nf">_cleanup_params</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-49'>
<div class='docs'>
<div class='section-link'>
<a href='#section-49'>#</a>
</div>
<p>පරාමිතීන්ලබා ගත නොහැකි බව දැක්වීමට ධජය සකසන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">214</span> <span class="bp">self</span><span class="o">.</span><span class="n">is_fetched</span> <span class="o">=</span> <span class="kc">False</span></pre></div>
</div>
</div>
<div class='section' id='section-50'>
<div class='docs'>
<div class='section-link'>
<a href='#section-50'>#</a>
</div>
<p>සියලුපරාමිතීන් හරහා නැවත </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">217</span> <span class="k">for</span> <span class="n">ps</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">param_refs</span><span class="p">:</span>
<span class="lineno">218</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">ps</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-51'>
<div class='docs'>
<div class='section-link'>
<a href='#section-51'>#</a>
</div>
<p>නවමෙහෙයුම් වලට පෙර පරාමිතීන්ගේ මෙහෙයුම් සම්පූර්ණ වන තෙක් රැඳී සිටින්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">220</span> <span class="n">p</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">current_stream</span><span class="p">())</span></pre></div>
</div>
</div>
<div class='section' id='section-52'>
<div class='docs'>
<div class='section-link'>
<a href='#section-52'>#</a>
</div>
<p>පරාමිතියවෙනත් කිසිවක් සමඟ ගබඩා බෙදා නොගන්නා බවට වග බලා ගන්න පරීක්ෂා කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">222</span> <span class="k">assert</span> <span class="n">p</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">storage_offset</span><span class="p">()</span> <span class="o">==</span> <span class="mi">0</span><span class="p">,</span> <span class="s2">&quot;The tensor is not the sole occupant of the storage.&quot;</span></pre></div>
</div>
</div>
<div class='section' id='section-53'>
<div class='docs'>
<div class='section-link'>
<a href='#section-53'>#</a>
</div>
<p>ගබඩාවවෙනස් කරන්න <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord coloredeq eqh" style=""><span class="mord" style="">0</span></span></span></span></span></span>. මෙය පරාමිතිය භාවිතා කරන මතකය මුදා හරිනු ඇත. </p>
<p><strong>ස්වයංක්‍රීයප්‍රස්තාරය ඒ පිළිබඳව සඳහනක් තබා ඇති බැවින් සැකසුම මතකය මුදා <code class="highlight"><span></span><span class="n">p</span><span class="o">.</span><span class="n">data</span></code>
නොහරිනු ඇත. </strong> </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">226</span> <span class="n">p</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">storage</span><span class="p">()</span><span class="o">.</span><span class="n">resize_</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span> <span class="c1"># This is what actually clears the memory</span></pre></div>
</div>
</div>
<div class='section' id='section-54'>
<div class='docs'>
<div class='section-link'>
<a href='#section-54'>#</a>
</div>
<p>පරාමිතියටශ්රේණියේ දත්ත නොමැති බවට වග බලා ගන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">228</span> <span class="k">assert</span> <span class="n">p</span><span class="o">.</span><span class="n">grad</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">,</span> <span class="s1">&#39;Gradients should be None&#39;</span></pre></div>
</div>
</div>
<div class='section' id='section-55'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-55'>#</a>
</div>
<h3>සියලුකැබලි වලින් පරාමිතීන් ලබා ගන්න</h3>
<p>මෙයසියලු නෝඩ් වලින් සියලු පරාමිති දත්ත ලබා ගන්නා අතර එක් එක් නෝඩයේ පරාමිතීන් නැවත ගොඩනඟනු ඇත. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">230</span> <span class="nd">@torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">()</span>
<span class="lineno">231</span> <span class="k">def</span> <span class="nf">fetch_params</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-56'>
<div class='docs'>
<div class='section-link'>
<a href='#section-56'>#</a>
</div>
<p>මඟහැරීම දැනටමත් ලැබී ඇත </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">239</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">is_fetched</span><span class="p">:</span>
<span class="lineno">240</span> <span class="k">return</span></pre></div>
</div>
</div>
<div class='section' id='section-57'>
<div class='docs'>
<div class='section-link'>
<a href='#section-57'>#</a>
</div>
<p>ධජයසකසන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">243</span> <span class="bp">self</span><span class="o">.</span><span class="n">is_fetched</span> <span class="o">=</span> <span class="kc">True</span></pre></div>
</div>
</div>
<div class='section' id='section-58'>
<div class='docs'>
<div class='section-link'>
<a href='#section-58'>#</a>
</div>
ලබා<p>ගැනීමට හෝ බෙදා ගැනීමට කිසිවක් නොමැති නම් මඟ හරින්න. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">246</span> <span class="k">if</span> <span class="nb">sum</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="lineno">247</span> <span class="k">return</span></pre></div>
</div>
</div>
<div class='section' id='section-59'>
<div class='docs'>
<div class='section-link'>
<a href='#section-59'>#</a>
</div>
<p>සියලුකැබලි වලින් පරාමිතීන් ලබා <code class="highlight"><span></span><span class="n">fetch_stream</span></code>
ගැනීමට භාවිතා කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">250</span> <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-60'>
<div class='docs'>
<div class='section-link'>
<a href='#section-60'>#</a>
</div>
<p>පරාමිතීන්ලබා ගැනීම සඳහා හිස් ටෙන්සරයක් සාදන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">252</span> <span class="n">buffer</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_empty</span><span class="p">((</span><span class="bp">self</span><span class="o">.</span><span class="n">world_size</span> <span class="o">*</span> <span class="nb">sum</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">),))</span></pre></div>
</div>
</div>
<div class='section' id='section-61'>
<div class='docs'>
<div class='section-link'>
<a href='#section-61'>#</a>
</div>
<p>අඛණ්ඩබෆරය නෝඩ් ගණනට බෙදන්න. මෙම බෙදීම් `බෆර්' පිළිබඳ අදහස් වේ. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">254</span> <span class="n">buffers</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">buffer</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">)))</span></pre></div>
</div>
</div>
<div class='section' id='section-62'>
<div class='docs'>
<div class='section-link'>
<a href='#section-62'>#</a>
</div>
<p>පුහුණුකළ හැකි සහ ස්ථාවර කුට්ටි දෙකම සංයුක්ත කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">257</span> <span class="n">chunk</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">cat</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk</span><span class="p">,</span> <span class="n">dim</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-63'>
<div class='docs'>
<div class='section-link'>
<a href='#section-63'>#</a>
</div>
<p>සියලුමනෝඩ්/උපාංග වලින් පරාමිතීන් එක්රැස් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">260</span> <span class="n">dist</span><span class="o">.</span><span class="n">all_gather</span><span class="p">(</span><span class="n">buffers</span><span class="p">,</span> <span class="n">chunk</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-64'>
<div class='docs'>
<div class='section-link'>
<a href='#section-64'>#</a>
</div>
<p>රැස්කරන ලද පරාමිතීන් පුහුණු කළ හැකි සහ ස්ථාවර කුට්ටි වලට බෙදන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">263</span> <span class="n">params</span> <span class="o">=</span> <span class="n">buffer</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="nb">sum</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">))</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">,</span> <span class="n">dim</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-65'>
<div class='docs'>
<div class='section-link'>
<a href='#section-65'>#</a>
</div>
<p>එක්රැස්කිරීමේ මෙහෙයුම සම්පූර්ණ වන තෙක් රැඳී සිටින්න, පසුව අවරෝධක වෙත යොමු කිරීම් ඉවත් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">265</span> <span class="n">buffer</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span>
<span class="lineno">266</span> <span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="n">buffers</span><span class="p">:</span>
<span class="lineno">267</span> <span class="n">b</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span>
<span class="lineno">268</span> <span class="n">buffer</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span>
<span class="lineno">269</span> <span class="k">del</span> <span class="n">buffer</span>
<span class="lineno">270</span> <span class="k">del</span> <span class="n">buffers</span></pre></div>
</div>
</div>
<div class='section' id='section-66'>
<div class='docs'>
<div class='section-link'>
<a href='#section-66'>#</a>
</div>
<p>පුහුණුකළ හැකි සහ ස්ථාවර පරාමිතීන් අඛණ්ඩ ආතතීන්ට නැවත සකස් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">273</span> <span class="n">params</span> <span class="o">=</span> <span class="p">[</span><span class="n">p</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">params</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-67'>
<div class='docs'>
<div class='section-link'>
<a href='#section-67'>#</a>
</div>
<p>තනිපරාමිති ආතතීන් එකතු කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">276</span> <span class="k">for</span> <span class="n">cont</span><span class="p">,</span> <span class="n">ps</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">params</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">param_refs</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-68'>
<div class='docs'>
<div class='section-link'>
<a href='#section-68'>#</a>
</div>
<p>පරාමිතීන්නොමැති නම්, මඟ හරින්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">278</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">ps</span><span class="p">:</span>
<span class="lineno">279</span> <span class="k">continue</span></pre></div>
</div>
</div>
<div class='section' id='section-69'>
<div class='docs'>
<div class='section-link'>
<a href='#section-69'>#</a>
</div>
<p>අඛණ්ඩආතතියෙන් ඕෆ්සෙට් </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">282</span> <span class="n">offset</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
</div>
</div>
<div class='section' id='section-70'>
<div class='docs'>
<div class='section-link'>
<a href='#section-70'>#</a>
</div>
<p>ආදර්ශපරාමිතීන් හරහා නැවත නැවතත් සහ අඛණ්ඩ tensor සිට අගයන් පැවරීම </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">284</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">ps</span><span class="p">:</span></pre></div>
</div>
</div>
<div class='section' id='section-71'>
<div class='docs'>
<div class='section-link'>
<a href='#section-71'>#</a>
</div>
<p>මුල්පරාමිති හැඩය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">286</span> <span class="n">shape</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">_orig_shape</span> <span class="c1"># type: ignore[attr-defined]</span></pre></div>
</div>
</div>
<div class='section' id='section-72'>
<div class='docs'>
<div class='section-link'>
<a href='#section-72'>#</a>
</div>
<p>පරාමිතියේගබඩා ප්රමාණය වෙනස් කරන්න. අපි පරාමිතීන් පිරිසිදු කළ <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord coloredeq eqh" style=""><span class="mord" style="">0</span></span></span></span></span></span> විට මෙය සකස් කරන ලදී. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">288</span> <span class="n">p</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">storage</span><span class="p">()</span><span class="o">.</span><span class="n">resize_</span><span class="p">(</span><span class="n">shape</span><span class="o">.</span><span class="n">numel</span><span class="p">())</span></pre></div>
</div>
</div>
<div class='section' id='section-73'>
<div class='docs'>
<div class='section-link'>
<a href='#section-73'>#</a>
</div>
<p>අඛණ්ඩආතතියෙන් අගයන් පැවරීම </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">290</span> <span class="n">p</span><span class="o">.</span><span class="n">data</span><span class="p">[:]</span> <span class="o">=</span> <span class="n">cont</span><span class="p">[</span><span class="n">offset</span><span class="p">:</span> <span class="n">offset</span> <span class="o">+</span> <span class="n">shape</span><span class="o">.</span><span class="n">numel</span><span class="p">()]</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">shape</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-74'>
<div class='docs'>
<div class='section-link'>
<a href='#section-74'>#</a>
</div>
<p>වෙනත්මෙහෙයුම් සිදු කිරීමට පෙර මෙහෙයුම් සම්පූර්ණ වන තෙක් රැඳී සිටින්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">292</span> <span class="n">p</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-75'>
<div class='docs'>
<div class='section-link'>
<a href='#section-75'>#</a>
</div>
<p>ඕෆ්සෙට්යාවත්කාලීන කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">294</span> <span class="n">offset</span> <span class="o">+=</span> <span class="n">shape</span><span class="o">.</span><span class="n">numel</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-76'>
<div class='docs'>
<div class='section-link'>
<a href='#section-76'>#</a>
</div>
<p>වෙනත්මෙහෙයුම් සිදු කිරීමට පෙර මෙහෙයුම සම්පූර්ණ වන තෙක් රැඳී සිටින්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">297</span> <span class="n">cont</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-77'>
<div class='docs'>
<div class='section-link'>
<a href='#section-77'>#</a>
</div>
<p> </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">300</span> <span class="k">del</span> <span class="n">params</span></pre></div>
</div>
</div>
<div class='section' id='section-78'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-78'>#</a>
</div>
<h3>ඉදිරිසාමාර්ථය</h3>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">302</span> <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-79'>
<div class='docs'>
<div class='section-link'>
<a href='#section-79'>#</a>
</div>
<p>වත්මන්නෝඩයේ සියලු පරාමිතීන් ලබා ගන්න. මෙය පෙර ස්තරය මගින් කැඳවනු ලැබේ, එබැවින් මෙම ඇමතුම පරාමිතීන් ලබා ගත හැකි බවට වග බලා ගැනීම සඳහා පමණි. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">309</span> <span class="bp">self</span><span class="o">.</span><span class="n">fetch_params</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-80'>
<div class='docs'>
<div class='section-link'>
<a href='#section-80'>#</a>
</div>
<p>පරාමිතියසම්පූර්ණ කිරීම සඳහා රැඳී සිටින්න. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">312</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">current_stream</span><span class="p">()</span><span class="o">.</span><span class="n">wait_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-81'>
<div class='docs'>
<div class='section-link'>
<a href='#section-81'>#</a>
</div>
<p>ඉදිරියටයන ස්ථරවල පරාමිතීන් ලබා ගැනීම ආරම්භ කරන්න, එවිට වත්මන් ස්තරය එහි ගණනය කිරීම් සිදු කරන ඒවා ලබා ගනු ඇත. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">316</span> <span class="k">for</span> <span class="n">layer</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">next_layer</span><span class="p">:</span>
<span class="lineno">317</span> <span class="n">layer</span><span class="o">.</span><span class="n">fetch_params</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-82'>
<div class='docs'>
<div class='section-link'>
<a href='#section-82'>#</a>
</div>
<p>ඔටෝග්රාඩ්සක්රීය කර ඇත්නම් වත්මන් ස්ථරයේ පරාමිතීන් සඳහා පසුගාමී කොකු එකතු කරන්න. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">320</span> <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">is_grad_enabled</span><span class="p">():</span>
<span class="lineno">321</span> <span class="bp">self</span><span class="o">.</span><span class="n">_add_backward_hooks</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-83'>
<div class='docs'>
<div class='section-link'>
<a href='#section-83'>#</a>
</div>
<p>වත්මන්ස්ථරයේ ප්රතිදානයන් ගණනය කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">324</span> <span class="n">res</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">module</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-84'>
<div class='docs'>
<div class='section-link'>
<a href='#section-84'>#</a>
</div>
<p>ස්ථරයේපරාමිතීන් පිරිසිදු කරන්න. </p>
<p><em>ඔටෝග්රාඩ්සක්රීය කර ඇත්නම් පිරිසිදු කිරීම මඟ හරින්න, මෙය ජාලයේ අවසාන ස්තරය වේ, මන්ද පසුගාමී පාස් සඳහා අපට නැවත පරාමිතීන් ලබා ගැනීමට අවශ්ය වනු ඇත. </em> </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">330</span> <span class="k">if</span> <span class="ow">not</span> <span class="n">torch</span><span class="o">.</span><span class="n">is_grad_enabled</span><span class="p">()</span> <span class="ow">or</span> <span class="bp">self</span><span class="o">.</span><span class="n">next_layer</span><span class="p">:</span>
<span class="lineno">331</span> <span class="bp">self</span><span class="o">.</span><span class="n">_cleanup_params</span><span class="p">()</span>
<span class="lineno">332</span>
<span class="lineno">333</span> <span class="k">return</span> <span class="n">res</span></pre></div>
</div>
</div>
<div class='section' id='section-85'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-85'>#</a>
</div>
<h4>වත්මන්ස්ථරයේ පරාමිතීන් සඳහා පසුගාමී කොකු එකතු කරන්න. </h4>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">335</span> <span class="k">def</span> <span class="nf">_add_backward_hooks</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-86'>
<div class='docs'>
<div class='section-link'>
<a href='#section-86'>#</a>
</div>
<p>පසුගාමීකොකු ගණන එකතු කරන ලදි </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">341</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backward_hook_handles</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
</div>
</div>
<div class='section' id='section-87'>
<div class='docs'>
<div class='section-link'>
<a href='#section-87'>#</a>
</div>
<p>වත්මන්ස්ථරයේ පුහුණු කළ හැකි පරාමිතීන් හරහා ලූප් </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">344</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">param_refs</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">]:</span></pre></div>
</div>
</div>
<div class='section' id='section-88'>
<div class='docs'>
<div class='section-link'>
<a href='#section-88'>#</a>
</div>
<p>කොක්කක්දැනටමත් එකතු කර නොමැති බවට වග බලා ගන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">346</span> <span class="k">assert</span> <span class="ow">not</span> <span class="nb">hasattr</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="s2">&quot;_hook_handle&quot;</span><span class="p">),</span> <span class="s1">&#39;Parameter has already been hooked&#39;</span></pre></div>
</div>
</div>
<div class='section' id='section-89'>
<div class='docs'>
<div class='section-link'>
<a href='#section-89'>#</a>
</div>
<p>අපටබාධා කළ හැකි ස්වයංක්රීය පියවරක් නිර්මාණය <code class="highlight"><span></span><span class="n">expand_as</span></code>
කිරීමට භාවිතා කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">348</span> <span class="n">p_tmp</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">expand_as</span><span class="p">(</span><span class="n">p</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-90'>
<div class='docs'>
<div class='section-link'>
<a href='#section-90'>#</a>
</div>
<p>පසුගාමීකොක්ක එක් කිරීමට හසුරුව ලබා ගන්න. <a href="https://amsword.medium.com/understanding-pytorchs-autograd-with-grad-fn-and-next-functions-b2c4836daa00">මෙම බ්ලොග් අඩවිය ගැන සාකච්ඡා කරයි <code class="highlight"><span></span><span class="n">grad_acc</span></code>
</a>. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">351</span> <span class="n">grad_acc</span> <span class="o">=</span> <span class="n">p_tmp</span><span class="o">.</span><span class="n">grad_fn</span><span class="o">.</span><span class="n">next_functions</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span></pre></div>
</div>
</div>
<div class='section' id='section-91'>
<div class='docs'>
<div class='section-link'>
<a href='#section-91'>#</a>
</div>
<p>පසුගාමීකොක්ක එක් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">353</span> <span class="n">handle</span> <span class="o">=</span> <span class="n">grad_acc</span><span class="o">.</span><span class="n">register_hook</span><span class="p">(</span>
<span class="lineno">354</span> <span class="n">functools</span><span class="o">.</span><span class="n">partial</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">_post_backward_hook</span><span class="p">,</span> <span class="n">p</span><span class="p">))</span></pre></div>
</div>
</div>
<div class='section' id='section-92'>
<div class='docs'>
<div class='section-link'>
<a href='#section-92'>#</a>
</div>
<p>හසුරුවගැන සඳහනක් තබා ගන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">356</span> <span class="n">p</span><span class="o">.</span><span class="n">_hook_handle</span> <span class="o">=</span> <span class="n">handle</span></pre></div>
</div>
</div>
<div class='section' id='section-93'>
<div class='docs'>
<div class='section-link'>
<a href='#section-93'>#</a>
</div>
<p>එකතුකරන ලද කොකු ගණන වැඩි කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">358</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backward_hook_handles</span> <span class="o">+=</span> <span class="mi">1</span></pre></div>
</div>
</div>
<div class='section' id='section-94'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-94'>#</a>
</div>
<h4>පසුගාමීසිදුවීමක් හසුරුවන්න</h4>
<p>පරාමිතියපසුගාමී කොකු සහ මොඩියුලයේ පසුගාමී කොක්ක මගින් මෙය කැඳවනු ලැබේ. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">360</span> <span class="k">def</span> <span class="nf">_backward_event</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-95'>
<div class='docs'>
<div class='section-link'>
<a href='#section-95'>#</a>
</div>
<p>කොකුකවුන්ටරය අඩු කිරීම </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">368</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backward_hook_handles</span> <span class="o">-=</span> <span class="mi">1</span></pre></div>
</div>
</div>
<div class='section' id='section-96'>
<div class='docs'>
<div class='section-link'>
<a href='#section-96'>#</a>
</div>
<p>සියලුමකොකු (මොඩියුල කොක්ක ඇතුළුව) කැඳවා ඇත්නම්, අපට අනුක්රමික උපස්ථ කර පරාමිතීන් පිරිසිදු කළ හැකිය. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">372</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backward_hook_handles</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span><span class="p">:</span>
<span class="lineno">373</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backup_grads</span><span class="p">()</span>
<span class="lineno">374</span> <span class="bp">self</span><span class="o">.</span><span class="n">_cleanup_params</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-97'>
<div class='docs'>
<div class='section-link'>
<a href='#section-97'>#</a>
</div>
<p>පෙරස්ථරයේ පරාමිතීන් ලබා ගැනීම ආරම්භ කරන්න, මන්ද ඔටෝග්රාඩ් ඊළඟට එහි අනුක්රමික ක්රියාවලිය කරනු ඇත. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">377</span> <span class="k">for</span> <span class="n">layer</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">prev_layer</span><span class="p">:</span>
<span class="lineno">378</span> <span class="n">layer</span><span class="o">.</span><span class="n">fetch_params</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-98'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-98'>#</a>
</div>
<h4>පරාමිතියපසුගාමී කොක්ක</h4>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">380</span> <span class="k">def</span> <span class="nf">_post_backward_hook</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">p</span><span class="p">:</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-99'>
<div class='docs'>
<div class='section-link'>
<a href='#section-99'>#</a>
</div>
<p>පරාමිතියෙන්හසුරුව ඉවත් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">385</span> <span class="n">p</span><span class="o">.</span><span class="n">_hook_handle</span><span class="o">.</span><span class="n">remove</span><span class="p">()</span> <span class="c1"># type: ignore[attr-defined]</span>
<span class="lineno">386</span> <span class="nb">delattr</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="s2">&quot;_hook_handle&quot;</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-100'>
<div class='docs'>
<div class='section-link'>
<a href='#section-100'>#</a>
</div>
<p>පසුගාමීසිදුවීමක් හසුරුවන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">389</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backward_event</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-101'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-101'>#</a>
</div>
<h4>මොඩියුලයපසුගාමී කොක්ක</h4>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">391</span> <span class="k">def</span> <span class="nf">_backward_hook</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-102'>
<div class='docs'>
<div class='section-link'>
<a href='#section-102'>#</a>
</div>
<p>පසුගාමීසිදුවීමක් හසුරුවන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">396</span> <span class="bp">self</span><span class="o">.</span><span class="n">_backward_event</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-103'>
<div class='docs'>
<div class='section-link'>
<a href='#section-103'>#</a>
</div>
<p>පෙරස්ථරය පරිගණක අනුක්රමික ආරම්භ වනු ඇත. එය පරාම් ලබා ගැනීම අවසන් කර ඇති බවට අප වග බලා ගත යුතුය. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">399</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">current_stream</span><span class="p">()</span><span class="o">.</span><span class="n">wait_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-104'>
<div class='docs'>
<div class='section-link'>
<a href='#section-104'>#</a>
</div>
<p> </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">402</span> <span class="k">return</span> <span class="kc">None</span></pre></div>
</div>
</div>
<div class='section' id='section-105'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-105'>#</a>
</div>
<h3>වත්මන්ස්ථරයේ අනුක්රමික උපස්ථ කරන්න</h3>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">404</span> <span class="nd">@torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">()</span>
<span class="lineno">405</span> <span class="k">def</span> <span class="nf">_backup_grads</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-106'>
<div class='docs'>
<div class='section-link'>
<a href='#section-106'>#</a>
</div>
<p>පුහුණුකළ හැකි පරාමිතීන් නොමැති නම් මඟ හරින්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">410</span> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">]</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="lineno">411</span> <span class="k">return</span></pre></div>
</div>
</div>
<div class='section' id='section-107'>
<div class='docs'>
<div class='section-link'>
<a href='#section-107'>#</a>
</div>
<p>අනුක්රමිකඋපස්ථ කිරීමට උපස්ථ ධාරාව භාවිතා කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">414</span> <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">backup_stream</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-108'>
<div class='docs'>
<div class='section-link'>
<a href='#section-108'>#</a>
</div>
<p>අනුක්රමිකගබඩා කිරීමට බෆරය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">416</span> <span class="n">buffer</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_empty</span><span class="p">((</span><span class="bp">self</span><span class="o">.</span><span class="n">world_size</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">],))</span></pre></div>
</div>
</div>
<div class='section' id='section-109'>
<div class='docs'>
<div class='section-link'>
<a href='#section-109'>#</a>
</div>
<p>අඛණ්ඩබෆරය නෝඩ් ගණනකට බෙදන්න. මෙම බෙදීම් `බෆර්' පිළිබඳ අදහස් වේ. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">418</span> <span class="n">buffers</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">buffer</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">]))</span></pre></div>
</div>
</div>
<div class='section' id='section-110'>
<div class='docs'>
<div class='section-link'>
<a href='#section-110'>#</a>
</div>
<p>අඛණ්ඩබෆරයේ ඕෆ්සෙට් </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">421</span> <span class="n">offset</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
</div>
</div>
<div class='section' id='section-111'>
<div class='docs'>
<div class='section-link'>
<a href='#section-111'>#</a>
</div>
<p>පුහුණුකළ හැකි පරාමිතීන් හරහා නැවත ක්රියාත්මක කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">423</span> <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">param_refs</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">]:</span></pre></div>
</div>
</div>
<div class='section' id='section-112'>
<div class='docs'>
<div class='section-link'>
<a href='#section-112'>#</a>
</div>
<p>අනුක්රමිකඑකතු </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">425</span> <span class="n">shape</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">_orig_shape</span> <span class="c1"># type: ignore[attr-defined]</span>
<span class="lineno">426</span> <span class="n">buffer</span><span class="p">[</span><span class="n">offset</span><span class="p">:</span> <span class="n">offset</span> <span class="o">+</span> <span class="n">shape</span><span class="o">.</span><span class="n">numel</span><span class="p">()]</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">grad</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-113'>
<div class='docs'>
<div class='section-link'>
<a href='#section-113'>#</a>
</div>
<p>ඕෆ්සෙට්යාවත්කාලීන කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">428</span> <span class="n">offset</span> <span class="o">+=</span> <span class="n">shape</span><span class="o">.</span><span class="n">numel</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-114'>
<div class='docs'>
<div class='section-link'>
<a href='#section-114'>#</a>
</div>
<p>අනුක්රමිකපිරිසිදු </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">430</span> <span class="n">p</span><span class="o">.</span><span class="n">grad</span> <span class="o">=</span> <span class="kc">None</span></pre></div>
</div>
</div>
<div class='section' id='section-115'>
<div class='docs'>
<div class='section-link'>
<a href='#section-115'>#</a>
</div>
<p>වත්මන්ෂාර්ඩ් වල අනුක්රමික සමුච්චය කිරීම සඳහා හිස් ටෙන්සරය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">433</span> <span class="n">grad</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">_empty</span><span class="p">((</span><span class="bp">self</span><span class="o">.</span><span class="n">chunk_size</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">],))</span></pre></div>
</div>
</div>
<div class='section' id='section-116'>
<div class='docs'>
<div class='section-link'>
<a href='#section-116'>#</a>
</div>
<p>එක්එක් ශාකයේ අනුක්රමික සමුච්චය කරන්න. එය නෝඩ් හරහා බෆර විසිරී යන අතර සෑම නෝඩයක්ම එයට ලැබෙන ආතතීන් එකතු වේ (අඩු කරයි). </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">436</span> <span class="n">dist</span><span class="o">.</span><span class="n">reduce_scatter</span><span class="p">(</span><span class="n">grad</span><span class="p">,</span> <span class="n">buffers</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-117'>
<div class='docs'>
<div class='section-link'>
<a href='#section-117'>#</a>
</div>
<p>මෙහෙයුමසම්පූර්ණ වන තෙක් රැඳී සිටින්න, පසුව අවරෝධක වෙත යොමු කිරීම් ඉවත් කරන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">439</span> <span class="k">for</span> <span class="n">b</span> <span class="ow">in</span> <span class="n">buffers</span><span class="p">:</span>
<span class="lineno">440</span> <span class="n">b</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span>
<span class="lineno">441</span> <span class="n">buffer</span><span class="o">.</span><span class="n">record_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span><span class="p">)</span>
<span class="lineno">442</span> <span class="k">del</span> <span class="n">buffer</span>
<span class="lineno">443</span> <span class="k">del</span> <span class="n">buffers</span></pre></div>
</div>
</div>
<div class='section' id='section-118'>
<div class='docs'>
<div class='section-link'>
<a href='#section-118'>#</a>
</div>
<p>කුට්ටිඅනුක්රමික සකසන්න. ප්රශස්තකරණය දකින්නේ මෙයයි. </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">446</span> <span class="bp">self</span><span class="o">.</span><span class="n">chunk</span><span class="p">[</span><span class="bp">self</span><span class="o">.</span><span class="n">TRAINING_PARAMS_IDX</span><span class="p">]</span><span class="o">.</span><span class="n">grad</span> <span class="o">=</span> <span class="n">grad</span>
<span class="lineno">447</span> <span class="k">del</span> <span class="n">grad</span></pre></div>
</div>
</div>
<div class='section' id='section-119'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-119'>#</a>
</div>
<h2><code class="highlight"><span></span><span class="n">Zero3Layer</span></code>
ස්ථර සඳහා අනුක්රමික මොඩියුලය</h2>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">450</span><span class="k">class</span> <span class="nc">Zero3Sequential</span><span class="p">(</span><span class="n">nn</span><span class="o">.</span><span class="n">Module</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-120'>
<div class='docs doc-strings'>
<div class='section-link'>
<a href='#section-120'>#</a>
</div>
<ul><li><code class="highlight"><span></span><span class="n">modules</span></code>
<code class="highlight"><span></span><span class="n">Zero3Layer</span></code>
ස්ථර ලැයිස්තුව</li></ul>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">454</span> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">modules</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">Zero3Layer</span><span class="p">]):</span></pre></div>
</div>
</div>
<div class='section' id='section-121'>
<div class='docs'>
<div class='section-link'>
<a href='#section-121'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">458</span> <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-122'>
<div class='docs'>
<div class='section-link'>
<a href='#section-122'>#</a>
</div>
<p>පරාමිතීන්ලබා ගැනීමට CUDA ධාරාව </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">461</span> <span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">Stream</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-123'>
<div class='docs'>
<div class='section-link'>
<a href='#section-123'>#</a>
</div>
<p>උපස්ථකිරීමට CUDA ධාරාව (සමුච්චය) අනුක්රමික </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">463</span> <span class="bp">self</span><span class="o">.</span><span class="n">backup_stream</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">Stream</span><span class="p">()</span></pre></div>
</div>
</div>
<div class='section' id='section-124'>
<div class='docs'>
<div class='section-link'>
<a href='#section-124'>#</a>
</div>
<p>එක්එක් <code class="highlight"><span></span><span class="n">Zero3Layer</span></code>
ස්ථරයක් සඳහා ධාරාවන් සහ පෙර සහ ඉදිරියට යන ස්ථර සකසන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">466</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">modules</span><span class="p">)):</span></pre></div>
</div>
</div>
<div class='section' id='section-125'>
<div class='docs'>
<div class='section-link'>
<a href='#section-125'>#</a>
</div>
<p>ස්ථරදර්ශකය සකසන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">468</span> <span class="n">modules</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">layer_idx</span> <span class="o">=</span> <span class="n">i</span></pre></div>
</div>
</div>
<div class='section' id='section-126'>
<div class='docs'>
<div class='section-link'>
<a href='#section-126'>#</a>
</div>
<p>ධාරාවන්සකසන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">470</span> <span class="n">modules</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">fetch_stream</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">fetch_stream</span>
<span class="lineno">471</span> <span class="n">modules</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">backup_stream</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">backup_stream</span></pre></div>
</div>
</div>
<div class='section' id='section-127'>
<div class='docs'>
<div class='section-link'>
<a href='#section-127'>#</a>
</div>
<p>ඉදිරියටයන ස්ථර සකසන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">473</span> <span class="k">if</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span> <span class="o">&lt;</span> <span class="nb">len</span><span class="p">(</span><span class="n">modules</span><span class="p">):</span>
<span class="lineno">474</span> <span class="n">modules</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">next_layer</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">modules</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">])</span></pre></div>
</div>
</div>
<div class='section' id='section-128'>
<div class='docs'>
<div class='section-link'>
<a href='#section-128'>#</a>
</div>
<p>පෙරස්ථර සකසන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">476</span> <span class="k">if</span> <span class="n">i</span> <span class="o">-</span> <span class="mi">1</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">:</span>
<span class="lineno">477</span> <span class="n">modules</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">prev_layer</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">modules</span><span class="p">[</span><span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">])</span></pre></div>
</div>
</div>
<div class='section' id='section-129'>
<div class='docs'>
<div class='section-link'>
<a href='#section-129'>#</a>
</div>
<p>මොඩියුලගබඩා ලැයිස්තුව </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">480</span> <span class="bp">self</span><span class="o">.</span><span class="n">module_list</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">ModuleList</span><span class="p">(</span><span class="n">modules</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-130'>
<div class='docs'>
<div class='section-link'>
<a href='#section-130'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">482</span> <span class="k">def</span> <span class="nf">get_trainable_chunk</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-131'>
<div class='docs'>
<div class='section-link'>
<a href='#section-131'>#</a>
</div>
<p>එක්එක් ස්ථරයෙන් පුහුණු කළ හැකි කුට්ටි ලැයිස්තුව ආපසු ලබා දෙන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">484</span> <span class="k">return</span> <span class="nb">sum</span><span class="p">([</span><span class="n">m</span><span class="o">.</span><span class="n">get_trainable_chunk</span><span class="p">()</span> <span class="k">for</span> <span class="n">m</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">module_list</span><span class="p">],</span> <span class="p">[])</span></pre></div>
</div>
</div>
<div class='section' id='section-132'>
<div class='docs'>
<div class='section-link'>
<a href='#section-132'>#</a>
</div>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">486</span> <span class="k">def</span> <span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">x</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">):</span></pre></div>
</div>
</div>
<div class='section' id='section-133'>
<div class='docs'>
<div class='section-link'>
<a href='#section-133'>#</a>
</div>
<p>ආපසුදක්වා ඵලය අනුක්රමික සම්පූර්ණ බවට වග බලා ගන්න </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">488</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">current_stream</span><span class="p">()</span><span class="o">.</span><span class="n">wait_stream</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">backup_stream</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-134'>
<div class='docs'>
<div class='section-link'>
<a href='#section-134'>#</a>
</div>
<p>ඉදිරිසාමාර්ථය </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">491</span> <span class="k">for</span> <span class="n">m</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">module_list</span><span class="p">:</span>
<span class="lineno">492</span> <span class="n">x</span> <span class="o">=</span> <span class="n">m</span><span class="p">(</span><span class="n">x</span><span class="p">)</span></pre></div>
</div>
</div>
<div class='section' id='section-135'>
<div class='docs'>
<div class='section-link'>
<a href='#section-135'>#</a>
</div>
<p> </p>
</div>
<div class='code'>
<div class="highlight"><pre><span class="lineno">495</span> <span class="k">return</span> <span class="n">x</span></pre></div>
</div>
</div>
<div class='footer'>
<a href="https://papers.labml.ai">Trending Research Papers</a>
<a href="https://labml.ai">labml.ai</a>
</div>
</div>
<script src=../../interactive.js?v=1"></script>
<script>
function handleImages() {
var images = document.querySelectorAll('p>img')
for (var i = 0; i < images.length; ++i) {
handleImage(images[i])
}
}
function handleImage(img) {
img.parentElement.style.textAlign = 'center'
var modal = document.createElement('div')
modal.id = 'modal'
var modalContent = document.createElement('div')
modal.appendChild(modalContent)
var modalImage = document.createElement('img')
modalContent.appendChild(modalImage)
var span = document.createElement('span')
span.classList.add('close')
span.textContent = 'x'
modal.appendChild(span)
img.onclick = function () {
console.log('clicked')
document.body.appendChild(modal)
modalImage.src = img.src
}
span.onclick = function () {
document.body.removeChild(modal)
}
}
handleImages()
</script>
</body>
</html>