✍️ typos

Author: Varuna Jayasiri
Date:   2021-02-12 14:55:40 +05:30
parent 948f473ee6
commit 56a460a243
7 changed files with 67 additions and 68 deletions


@@ -16,13 +16,13 @@ This is a [PyTorch](https://pytorch.org) implementation of
* Tricky for RNNs. Do you need different normalizations for each step?
* Doesn't work with small batch sizes;
large NLP models are usually trained with small batch sizes.
-* Need to compute means and variances across devices in distributed training
+* Need to compute means and variances across devices in distributed training.
## Layer Normalization
Layer normalization is a simpler normalization method that works
on a wider range of settings.
-Layer normalization transformers the inputs to have zero mean and unit variance
+Layer normalization transforms the inputs to have zero mean and unit variance
across the features.
*Note that batch normalization fixes the zero mean and unit variance for each element.*
Layer normalization does it for each batch across all elements.


@@ -9,13 +9,13 @@ This is a [PyTorch](https://pytorch.org) implementation of
* Tricky for RNNs. Do you need different normalizations for each step?
* Doesn't work with small batch sizes;
large NLP models are usually trained with small batch sizes.
-* Need to compute means and variances across devices in distributed training
+* Need to compute means and variances across devices in distributed training.
## Layer Normalization
Layer normalization is a simpler normalization method that works
on a wider range of settings.
-Layer normalization transformers the inputs to have zero mean and unit variance
+Layer normalization transforms the inputs to have zero mean and unit variance
across the features.
*Note that batch normalization fixes the zero mean and unit variance for each element.*
Layer normalization does it for each batch across all elements.
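
For reference, the normalization described in the corrected text can be sketched in a few lines of PyTorch. This is an illustrative sketch only, not code from this commit or the repository's implementation; the `layer_norm` helper name, the feature size, and the comparison against `torch.nn.LayerNorm` are assumptions made for the example.

```python
import torch

def layer_norm(x: torch.Tensor, gain: torch.Tensor, bias: torch.Tensor,
               eps: float = 1e-5) -> torch.Tensor:
    # Mean and variance are computed over the last (feature) dimension,
    # independently for every element in the batch / sequence.
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    # Learnable per-feature gain and bias (here set to 1 and 0 for the example).
    return gain * x_hat + bias

# Example: a batch of 4 sequences, 10 steps, 512 features.
x = torch.randn(4, 10, 512)
gain, bias = torch.ones(512), torch.zeros(512)
out = layer_norm(x, gain, bias)

# Should agree with PyTorch's built-in LayerNorm (no affine parameters)
# up to numerical precision.
ref = torch.nn.LayerNorm(512, elementwise_affine=False)(x)
assert torch.allclose(out, ref, atol=1e-6)
```

Because the statistics are taken per element rather than per batch, this works with any batch size and needs no cross-device synchronization, which is the contrast the README draws against batch normalization.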