Varuna Jayasiri
2022-08-26 18:06:08 +05:30
parent 1dbc2cbc04
commit c0004c9e8e
164 changed files with 3787 additions and 3747 deletions

View File

@ -110,7 +110,7 @@
<div class='section-link'>
<a href='#section-2'>#</a>
</div>
<ul><li><code class="highlight"><span></span><span class="n">seq_len</span></code>
is the sequence length of generated math problems. We fill as many problems as possible up to this length </li>
<li><code class="highlight"><span></span><span class="n">max_digits</span></code>
is the maximum number of digits in the operand integers </li>
<li><code class="highlight"><span></span><span class="n">n_sequences</span></code>
is the number of sequences per epoch</li></ul>
</div>
@ -160,7 +160,7 @@
<div class='section-link'>
<a href='#section-6'>#</a>
</div>
<p> Generates an integer with <code class="highlight"><span></span><span class="n">n_digit</span></code>
number of digits</p>
</div>
@ -190,9 +190,9 @@
<div class='section-link'>
<a href='#section-8'>#</a>
</div>
<p> Generates the workings for <code class="highlight"><span></span><span class="n">x</span> <span class="o">+</span> <span class="n">y</span></code>
. For example, for <code class="highlight"><span></span><span class="mi">11</span><span class="o">+</span><span class="mi">29</span></code>
it generates <code class="highlight"><span></span><span class="mf">1e0</span><span class="o">+</span><span class="mf">9e0</span><span class="o">+</span><span class="mf">0e0</span><span class="o">=</span><span class="mf">10e0</span> <span class="mf">1e0</span><span class="o">+</span><span class="mf">2e0</span><span class="o">+</span><span class="mf">1e0</span><span class="o">=</span><span class="mf">4e0</span></code>
.</p>
</div>

View File

@ -95,8 +95,8 @@
<a href='#section-1'>#</a>
</div>
<h2>Configurations</h2>
<p>This extends the CIFAR 10 dataset configurations from <a href="https://github.com/labmlai/labml/tree/master/helpers"><code class="highlight"><span></span><span class="n">labml_helpers</span></code>
</a> and <a href="mnist.html"><code class="highlight"><span></span><span class="n">MNISTConfigs</span></code>
</a>.</p>
</div>
@ -270,7 +270,7 @@
<div class='section-link'>
<a href='#section-14'>#</a>
</div>
<p>5 <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span><span class="mord">2</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord">2</span></span></span></span></span> pooling layers will produce an output of size <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord">1</span></span></span></span></span>. The CIFAR 10 image size is <span ><span class="katex"><span aria-hidden="true" class="katex-html"><span class="base"><span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span><span class="mord">32</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:0.64444em;vertical-align:0em;"></span><span class="mord">32</span></span></span></span></span> </p>
</div>
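<p>The size arithmetic above can be checked with a short sketch. It assumes each pooling layer is non-overlapping (stride equal to the kernel size, no padding), so it halves the spatial size; the helper name <code>size_after_pooling</code> is hypothetical and not part of the documented code.</p>

```python
def size_after_pooling(size: int, n_layers: int, kernel: int = 2) -> int:
    """Spatial size after `n_layers` pooling layers.

    Assumption: stride == kernel and no padding, so each layer
    divides the height/width by `kernel`.
    """
    for _ in range(n_layers):
        size //= kernel
    return size


# Size of a 32-pixel side after 0, 1, ..., 5 pooling layers
sizes = [size_after_pooling(32, n) for n in range(6)]
print(sizes)  # [32, 16, 8, 4, 2, 1]
```

<p>Under that assumption, five layers reduce each 32-pixel side to a single pixel, since 32 = 2<sup>5</sup>.</p>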
<div class='code'>

View File

@ -828,7 +828,7 @@
</div>
<h3>Basic English tokenizer</h3>
<p>We use a character-level tokenizer in this experiment. You can switch to the basic English tokenizer by setting,</p>
<pre class="highlight lang-"><code><span></span><span class="s1">&#39;tokenizer&#39;</span><span class="p">:</span> <span class="s1">&#39;basic_english&#39;</span><span class="p">,</span></code></pre>
<p>in the configurations dictionary when starting the experiment.</p>
</div>
@ -984,7 +984,7 @@
<a href='#section-72'>#</a>
</div>
<h3>Transpose batch</h3>
<p><code class="highlight"><span></span><span class="n">DataLoader</span></code>
collects the batches on the first dimension. We need to transpose them to be sequence-first.</p>
</div>
@ -1008,7 +1008,7 @@
<div class='section-link'>
<a href='#section-74'>#</a>
</div>
<p>Stack the batch along the second dimension <code class="highlight"><span></span><span class="n">dim</span><span class="o">=</span><span class="mi">1</span></code>
</p>
</div>
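<p>As a rough illustration of the transpose, here is a plain-Python stand-in that uses nested lists instead of tensors. It assumes each sample is a hypothetical <code>(source, target)</code> pair of equal-length token lists; the documented code stacks tensors along <code>dim=1</code>, which this sketch only imitates.</p>

```python
def transpose_batch(batch):
    """Turn a batch-first list of (source, target) samples into
    sequence-first nested lists of shape [seq_len][batch_size].

    Assumption: every sequence in the batch has the same length.
    """
    # Separate the sources from the targets
    sources, targets = zip(*batch)
    # zip(*...) groups the tokens at the same position across samples,
    # which mirrors stacking along the second dimension
    src = [list(step) for step in zip(*sources)]
    tgt = [list(step) for step in zip(*targets)]
    return src, tgt


batch = [([1, 2, 3], [2, 3, 4]),
         ([5, 6, 7], [6, 7, 8])]
src, tgt = transpose_batch(batch)
print(src)  # [[1, 5], [2, 6], [3, 7]]
print(tgt)  # [[2, 6], [3, 7], [4, 8]]
```

<p>Each inner list now holds one time step across the whole batch, which is the sequence-first layout.</p>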

View File

@ -584,7 +584,7 @@
</div>
<h3>Basic English tokenizer</h3>
<p>We use a character-level tokenizer in this experiment. You can switch to the basic English tokenizer by setting,</p>
<pre class="highlight lang-"><code><span></span><span class="s1">&#39;tokenizer&#39;</span><span class="p">:</span> <span class="s1">&#39;basic_english&#39;</span><span class="p">,</span></code></pre>
<p>in the configurations dictionary when starting the experiment.</p>
</div>
@ -693,17 +693,17 @@
<div class='section-link'>
<a href='#section-49'>#</a>
</div>
<ul><li><code class="highlight"><span></span><span class="n">tokenizer</span></code>
is the tokenizer function </li>
<li><code class="highlight"><span></span><span class="n">vocab</span></code>
is the vocabulary </li>
<li><code class="highlight"><span></span><span class="n">seq_len</span></code>
is the length of the sequence </li>
<li><code class="highlight"><span></span><span class="n">padding_token</span></code>
is the token used for padding when the <code class="highlight"><span></span><span class="n">seq_len</span></code>
is larger than the text length </li>
<li><code class="highlight"><span></span><span class="n">classifier_token</span></code>
is the <code class="highlight"><span></span><span class="p">[</span><span class="n">CLS</span><span class="p">]</span></code>
token which we set at the end of the input</li></ul>
</div>
@ -731,8 +731,8 @@
<div class='section-link'>
<a href='#section-51'>#</a>
</div>
<ul><li><code class="highlight"><span></span><span class="n">batch</span></code>
is the batch of data collected by the <code class="highlight"><span></span><span class="n">DataLoader</span></code>
</li></ul>
</div>
@ -745,7 +745,7 @@
<div class='section-link'>
<a href='#section-52'>#</a>
</div>
<p>Input data tensor, initialized with <code class="highlight"><span></span><span class="n">padding_token</span></code>
</p>
</div>
@ -806,7 +806,7 @@
<div class='section-link'>
<a href='#section-57'>#</a>
</div>
<p>Truncate up to <code class="highlight"><span></span><span class="n">seq_len</span></code>
</p>
</div>
@ -831,7 +831,7 @@
<div class='section-link'>
<a href='#section-59'>#</a>
</div>
<p>Set the final token in the sequence to <code class="highlight"><span></span><span class="p">[</span><span class="n">CLS</span><span class="p">]</span></code>
</p>
</div>
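<p>The padding, truncation, and <code>[CLS]</code> steps described above can be sketched for a single text in plain Python. This is a minimal illustration under stated assumptions, not the documented implementation: the function name <code>encode</code>, the toy vocabulary, and the token ids <code>PAD</code> and <code>CLS</code> are all hypothetical, and a list stands in for the tensor.</p>

```python
from typing import Callable, Dict, List


def encode(text: str,
           tokenizer: Callable[[str], List[str]],
           vocab: Dict[str, int],
           seq_len: int,
           padding_token: int,
           classifier_token: int) -> List[int]:
    """Encode one text into a fixed-length list of token ids."""
    # Start with a sequence filled with the padding token
    data = [padding_token] * seq_len
    # Tokenize and map the tokens to ids
    ids = [vocab[token] for token in tokenizer(text)]
    # Truncate up to `seq_len`
    ids = ids[:seq_len]
    # Copy the ids into the start of the padded sequence
    data[:len(ids)] = ids
    # Set the final token in the sequence to [CLS]
    data[-1] = classifier_token
    return data


# Toy character-level setup (hypothetical values)
vocab = {ch: i for i, ch in enumerate("abc")}
PAD, CLS = 3, 4

short = encode("ab", list, vocab, 5, PAD, CLS)
long = encode("abcabcabc", list, vocab, 5, PAD, CLS)
print(short)  # [0, 1, 3, 3, 4]
print(long)   # [0, 1, 2, 0, 4]
```

<p>Note that the last position is always overwritten, so a text of length <code>seq_len</code> or more loses the token that would have been in that position.</p>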
@ -857,10 +857,10 @@
<a href='#section-61'>#</a>
</div>
<h3>AG News dataset</h3>
<p>This loads the AG News dataset and sets the values for <code class="highlight"><span></span><span class="n">n_classes</span></code>
, <code class="highlight"><span></span><span class="n">vocab</span></code>
, <code class="highlight"><span></span><span class="n">train_loader</span></code>
, and <code class="highlight"><span></span><span class="n">valid_loader</span></code>
.</p>
</div>
@ -1002,10 +1002,10 @@
<div class='section-link'>
<a href='#section-72'>#</a>
</div>
<p>Return <code class="highlight"><span></span><span class="n">n_classes</span></code>
, <code class="highlight"><span></span><span class="n">vocab</span></code>
, <code class="highlight"><span></span><span class="n">train_loader</span></code>
, and <code class="highlight"><span></span><span class="n">valid_loader</span></code>
</p>
</div>