Mirror of https://github.com/labmlai/annotated_deep_learning_paper_implementations.git
Synced 2025-08-14 01:13:00 +08:00

Commit: 📚 group norm improvements
@@ -74,9 +74,9 @@
 </div>
 <h1>Group Normalization</h1>
 <p>This is a <a href="https://pytorch.org">PyTorch</a> implementation of
-the paper <a href="https://arxiv.org/abs/1803.08494">Group Normalization</a>.</p>
-<p><a href="../batch_norm/index.html">Batch Normalization</a> works well for sufficiently large batch sizes,
-but does not perform well for small batch sizes, because it normalizes across the batch.
+the <a href="https://arxiv.org/abs/1803.08494">Group Normalization</a> paper.</p>
+<p><a href="../batch_norm/index.html">Batch Normalization</a> works well for large enough batch sizes
+but not well for small batch sizes, because it normalizes over the batch.
 Training large models with large batch sizes is not possible due to the memory capacity of the
 devices.</p>
 <p>This paper introduces Group Normalization, which normalizes a set of features together as a group.
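A minimal sketch of the operation this documentation describes, assuming a [batch, channels, height, width] input and plain PyTorch; it is not the repository's own implementation, just an illustration of normalizing a set of channels together as a group:

    import torch

    def group_norm(x: torch.Tensor, groups: int, eps: float = 1e-5) -> torch.Tensor:
        n, c, h, w = x.shape
        assert c % groups == 0, "channels must divide evenly into groups"
        # Gather each group's features so they are normalized together
        x = x.view(n, groups, -1)
        mean = x.mean(dim=-1, keepdim=True)
        var = x.var(dim=-1, unbiased=False, keepdim=True)
        x = (x - mean) / torch.sqrt(var + eps)
        return x.view(n, c, h, w)

    x = torch.randn(2, 6, 4, 4)
    # Should match PyTorch's built-in functional group norm
    print(torch.allclose(group_norm(x, groups=3),
                         torch.nn.functional.group_norm(x, 3), atol=1e-5))

Note that the statistics are computed per sample and per group, so the batch dimension never enters the calculation.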
@@ -104,7 +104,7 @@ $\mu_i$ and $\sigma_i$ are mean and standard deviation.</p>
 </p>
 <p>$\mathcal{S}_i$ is the set of indexes across which the mean and standard deviation
 are calculated for index $i$.
-$m$ is the size of the set $\mathcal{S}_i$ which is same for all $i$.</p>
+$m$ is the size of the set $\mathcal{S}_i$ which is the same for all $i$.</p>
 <p>The definition of $\mathcal{S}_i$ is different for
 <a href="../batch_norm/index.html">Batch normalization</a>,
 <a href="../layer_norm/index.html">Layer normalization</a>, and
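A small numeric check of the $\mathcal{S}_i$ and $m$ definitions above, assuming group normalization with $C = 6$ channels, $G = 3$ groups, and a 4×4 feature map, so $m = (C/G) \cdot H \cdot W = 2 \cdot 16 = 32$:

    import torch

    n, c, h, w, g = 1, 6, 4, 4, 3
    x = torch.randn(n, c, h, w)
    group = x[:, 0:c // g]               # the group containing channel 0
    print(group.numel())                 # m = (C/G) * H * W = 32
    # Normalize channel 0 by its group's statistics (eps matches the default 1e-5)
    var = group.var(unbiased=False)
    manual = (x[0, 0] - group.mean()) / torch.sqrt(var + 1e-5)
    auto = torch.nn.functional.group_norm(x, g)[0, 0]
    print(torch.allclose(manual, auto, atol=1e-6))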
@@ -8,10 +8,10 @@ summary: >
 # Group Normalization
 
 This is a [PyTorch](https://pytorch.org) implementation of
-the paper [Group Normalization](https://arxiv.org/abs/1803.08494).
+the [Group Normalization](https://arxiv.org/abs/1803.08494) paper.
 
-[Batch Normalization](../batch_norm/index.html) works well for sufficiently large batch sizes,
-but does not perform well for small batch sizes, because it normalizes across the batch.
+[Batch Normalization](../batch_norm/index.html) works well for large enough batch sizes
+but not well for small batch sizes, because it normalizes over the batch.
 Training large models with large batch sizes is not possible due to the memory capacity of the
 devices.
 
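The small-batch argument can be seen directly: batch normalization's output for a sample depends on the rest of the batch, while group normalization's does not. A sketch using the standard torch.nn layers (not this repository's classes):

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    bn = nn.BatchNorm2d(6)
    gn = nn.GroupNorm(num_groups=3, num_channels=6)
    x = torch.randn(8, 6, 4, 4)

    # Same first sample, different batch composition
    full, single = x, x[:1]
    print(torch.allclose(gn(full)[0], gn(single)[0], atol=1e-6))  # True
    print(torch.allclose(bn(full)[0], bn(single)[0], atol=1e-6))  # False in training mode

With tiny batches, batch norm's per-batch statistics become noisy estimates, which is what degrades its performance; group norm's statistics are unaffected by batch size.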
@@ -42,7 +42,7 @@ $\mu_i$ and $\sigma_i$ are mean and standard deviation.
 
 $\mathcal{S}_i$ is the set of indexes across which the mean and standard deviation
 are calculated for index $i$.
-$m$ is the size of the set $\mathcal{S}_i$ which is same for all $i$.
+$m$ is the size of the set $\mathcal{S}_i$ which is the same for all $i$.
 
 The definition of $\mathcal{S}_i$ is different for
 [Batch normalization](../batch_norm/index.html),
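The differing definitions of $\mathcal{S}_i$ amount to taking statistics over different axes of a $[N, C, H, W]$ tensor. A sketch, assuming the conventions of the Group Normalization paper (layer norm here normalizes over all of $C, H, W$ per sample):

    import torch

    x = torch.randn(8, 6, 4, 4)
    g = 3

    bn_mean = x.mean(dim=(0, 2, 3), keepdim=True)          # batch norm: over N, H, W per channel
    ln_mean = x.mean(dim=(1, 2, 3), keepdim=True)          # layer norm: over C, H, W per sample
    gn_mean = x.view(8, g, -1).mean(dim=-1, keepdim=True)  # group norm: over (C/G)*H*W per group
    print(bn_mean.shape, ln_mean.shape, gn_mean.shape)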
@@ -1,10 +1,10 @@
 # [Group Normalization](https://nn.labml.ai/normalization/group_norm/index.html)
 
 This is a [PyTorch](https://pytorch.org) implementation of
-the paper [Group Normalization](https://arxiv.org/abs/1803.08494).
+the [Group Normalization](https://arxiv.org/abs/1803.08494) paper.
 
-[Batch Normalization](https://nn.labml.ai/normalization/batch_norm/index.html) works well for sufficiently large batch sizes,
-but does not perform well for small batch sizes, because it normalizes across the batch.
+[Batch Normalization](https://nn.labml.ai/normalization/batch_norm/index.html) works well for large enough batch sizes
+but not well for small batch sizes, because it normalizes over the batch.
 Training large models with large batch sizes is not possible due to the memory capacity of the
 devices.
 
@@ -35,7 +35,7 @@ $\mu_i$ and $\sigma_i$ are mean and standard deviation.
 
 $\mathcal{S}_i$ is the set of indexes across which the mean and standard deviation
 are calculated for index $i$.
-$m$ is the size of the set $\mathcal{S}_i$ which is same for all $i$.
+$m$ is the size of the set $\mathcal{S}_i$ which is the same for all $i$.
 
 The definition of $\mathcal{S}_i$ is different for
 [Batch normalization](https://nn.labml.ai/normalization/batch_norm/index.html),
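Finally, a usage sketch for the readme text above: swapping batch normalization for group normalization in a convolutional block. torch.nn.GroupNorm is used here purely for illustration (the repository implements its own module under normalization/group_norm); G = 32 is the paper's default group count:

    import torch
    import torch.nn as nn

    block = nn.Sequential(
        nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),
        nn.GroupNorm(num_groups=32, num_channels=64),  # instead of nn.BatchNorm2d(64)
        nn.ReLU(),
    )
    y = block(torch.randn(1, 3, 32, 32))  # works fine with batch size 1
    print(y.shape)                        # torch.Size([1, 64, 32, 32])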