internal covariate shift

Varuna Jayasiri
2021-02-15 21:44:58 +05:30
parent fd33edc34f
commit 2e545435e8
2 changed files with 2 additions and 2 deletions


@ -82,7 +82,7 @@ network parameters during training.
For example, let’s say there are two layers $l_1$ and $l_2$.
During the beginning of the training $l_1$ outputs (inputs to $l_2$)
could be in distribution $\mathcal{N}(0.5, 1)$.
-Then, after some training steps, it could move to $\mathcal{N}(0.5, 1)$.
+Then, after some training steps, it could move to $\mathcal{N}(0.6, 1.5)$.
This is <em>internal covariate shift</em>.</p>
<p>Internal covariate shift will adversely affect training speed because the later layers
($l_2$ in the above example) have to adapt to this shifted distribution.</p>
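
The corrected example can be checked numerically. The sketch below is illustrative only and is not part of the commit: it assumes the second parameter of $\mathcal{N}(\cdot, \cdot)$ is a variance, and `sample_layer_output` is a hypothetical stand-in for the outputs of $l_1$, not a function from the repository.

```python
import math
import random
import statistics


def sample_layer_output(mean, variance, n, rng):
    """Draw n samples from N(mean, variance).

    Hypothetical stand-in for the outputs of l_1 (the inputs to l_2).
    Assumes the second parameter of N(., .) is a variance, not a std.
    """
    std = math.sqrt(variance)
    return [rng.gauss(mean, std) for _ in range(n)]


rng = random.Random(0)  # fixed seed so repeated runs give the same samples

# Distribution of l_1 outputs at the beginning of training.
early = sample_layer_output(0.5, 1.0, 10_000, rng)
# Distribution after some training steps, as in the corrected line.
later = sample_layer_output(0.6, 1.5, 10_000, rng)

for name, xs in (("early", early), ("later", later)):
    mean = statistics.fmean(xs)
    var = statistics.pvariance(xs)
    print(f"{name}: mean={mean:.2f} variance={var:.2f}")
```

With the pre-fix text, both distributions were $\mathcal{N}(0.5, 1)$, so the two printed lines would describe the same distribution and the example would show no shift at all.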


@ -18,7 +18,7 @@ network parameters during training.
For example, let's say there are two layers $l_1$ and $l_2$.
During the beginning of the training $l_1$ outputs (inputs to $l_2$)
could be in distribution $\mathcal{N}(0.5, 1)$.
-Then, after some training steps, it could move to $\mathcal{N}(0.5, 1)$.
+Then, after some training steps, it could move to $\mathcal{N}(0.6, 1.5)$.
This is *internal covariate shift*.
Internal covariate shift will adversely affect training speed because the later layers