diff --git a/docs/capsule_networks/index.html b/docs/capsule_networks/index.html
index 03e752f0..84cb2c05 100644
--- a/docs/capsule_networks/index.html
+++ b/docs/capsule_networks/index.html
@@ -77,7 +77,7 @@

Capsule network is a neural network architecture that embeds features as capsules and routes them with a voting mechanism to next layer of capsules.

Unlike in other implementations of models, we’ve included a sample, because
-it is difficult to understand some of the concepts with just the modules.
+it is difficult to understand some concepts with just the modules.
This is the annotated code for a model that uses capsules to classify MNIST dataset

This file holds the implementations of the core modules of Capsule Networks.

I used jindongwang/Pytorch-CapsuleNet to clarify some
diff --git a/docs/capsule_networks/readme.html b/docs/capsule_networks/readme.html
new file mode 100644
index 00000000..7cb46a86
--- /dev/null
+++ b/docs/capsule_networks/readme.html
@@ -0,0 +1,126 @@
+ Capsule Networks


Capsule Networks

+

This is a PyTorch implementation/tutorial of
+Dynamic Routing Between Capsules.

+

Capsule network is a neural network architecture that embeds features
+as capsules and routes them with a voting mechanism to next layer of capsules.

+

Unlike in other implementations of models, we’ve included a sample, because
+it is difficult to understand some concepts with just the modules.
+This is the annotated code for a model that uses capsules to classify MNIST dataset

+

This file holds the implementations of the core modules of Capsule Networks.

+

I used jindongwang/Pytorch-CapsuleNet to clarify some
+confusions I had with the paper.

+

Here’s a notebook for training a Capsule Network on MNIST dataset.

+

Open In Colab
+View Run

+
+
+ +
+
+
\ No newline at end of file
diff --git a/docs/rl/ppo/gae.html b/docs/rl/ppo/gae.html
index b806d367..4732b4d6 100644
--- a/docs/rl/ppo/gae.html
+++ b/docs/rl/ppo/gae.html
@@ -123,7 +123,7 @@
\hat{A_t^{(\infty)}} &= r_t + \gamma r_{t+1} +\gamma^2 r_{t+2} + ... - V(s)
\end{align}

-$\hat{A_t^{(1)}}$ is high bias, low variance whilst
+$\hat{A_t^{(1)}}$ is high bias, low variance, whilst
$\hat{A_t^{(\infty)}}$ is unbiased, high variance.

We take a weighted average of $\hat{A_t^{(k)}}$ to balance bias and variance.
This is called Generalized Advantage Estimation.
diff --git a/docs/rl/ppo/index.html b/docs/rl/ppo/index.html
index 85f78aeb..19aaf7c2 100644
--- a/docs/rl/ppo/index.html
+++ b/docs/rl/ppo/index.html
@@ -76,9 +76,9 @@

This is a PyTorch implementation of Proximal Policy Optimization - PPO.

PPO is a policy gradient method for reinforcement learning.
-Simple policy gradient methods one do a single gradient update per sample (or a set of samples).
-Doing multiple gradient steps for a singe sample causes problems
-because the policy deviates too much producing a bad policy.
+Simple policy gradient methods do a single gradient update per sample (or a set of samples).
+Doing multiple gradient steps for a single sample causes problems
+because the policy deviates too much, producing a bad policy.
PPO lets us do multiple gradient updates per sample by trying to keep the policy close to the policy that was used to sample data.
It does so by clipping gradient flow if the updated policy
@@ -172,7 +172,7 @@ J(\pi_\theta) - J(\pi_{\theta_{OLD}})

Then we assume $d^\pi_\theta(s)$ and $d^\pi_{\theta_{OLD}}(s)$ are similar.
The error we introduce to $J(\pi_\theta) - J(\pi_{\theta_{OLD}})$
- by this assumtion is bound by the KL divergence between
+ by this assumption is bound by the KL divergence between
$\pi_\theta$ and $\pi_{\theta_{OLD}}$.
Constrained Policy Optimization shows the proof of this. I haven’t read it.

diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 0448a6f9..76bfc8d3 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -659,14 +659,14 @@
https://nn.labml.ai/rl/ppo/index.html
- 2021-02-23T16:30:00+00:00
+ 2021-03-05T16:30:00+00:00
1.00
https://nn.labml.ai/rl/ppo/gae.html
- 2021-01-30T16:30:00+00:00
+ 2021-03-05T16:30:00+00:00
1.00
diff --git a/labml_nn/capsule_networks/__init__.py b/labml_nn/capsule_networks/__init__.py
index 867824a0..40b70fa9 100644
--- a/labml_nn/capsule_networks/__init__.py
+++ b/labml_nn/capsule_networks/__init__.py
@@ -16,7 +16,7 @@ Capsule network is a neural network architecture that embeds features
as capsules and routes them with a voting mechanism to next layer of capsules.
Unlike in other implementations of models, we've included a sample, because
-it is difficult to understand some of the concepts with just the modules.
+it is difficult to understand some concepts with just the modules.
[This is the annotated code for a model that uses capsules to classify MNIST dataset](mnist.html)
This file holds the implementations of the core modules of Capsule Networks.
diff --git a/labml_nn/capsule_networks/readme.md b/labml_nn/capsule_networks/readme.md
new file mode 100644
index 00000000..f144f985
--- /dev/null
+++ b/labml_nn/capsule_networks/readme.md
@@ -0,0 +1,21 @@
+# [Capsule Networks](https://nn.labml.ai/capsule_networks/index.html)
+
+This is a [PyTorch](https://pytorch.org) implementation/tutorial of
+[Dynamic Routing Between Capsules](https://arxiv.org/abs/1710.09829).
+
+Capsule network is a neural network architecture that embeds features
+as capsules and routes them with a voting mechanism to next layer of capsules.
+
+Unlike in other implementations of models, we've included a sample, because
+it is difficult to understand some concepts with just the modules.
+[This is the annotated code for a model that uses capsules to classify MNIST dataset](mnist.html)
+
+This file holds the implementations of the core modules of Capsule Networks.
+
+I used [jindongwang/Pytorch-CapsuleNet](https://github.com/jindongwang/Pytorch-CapsuleNet) to clarify some
+confusions I had with the paper.
+
+Here's a notebook for training a Capsule Network on MNIST dataset.
+
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/nn/blob/master/labml_nn/capsule_networks/mnist.ipynb)
+[![View Run](https://img.shields.io/badge/labml-experiment-brightgreen)](https://app.labml.ai/run/e7c08e08586711ebb3e30242ac1c0002)