{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "AYV_dMVDxyc2", "pycharm": { "name": "#%% md\n" } }, "source": [ "[![GitHub](https://img.shields.io/github/stars/labmlai/annotated_deep_learning_paper_implementations?style=social)](https://github.com/labmlai/annotated_deep_learning_paper_implementations)\n", "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/labmlai/annotated_deep_learning_paper_implementations/blob/master/labml_nn/activations/fta/experiment.ipynb)\n", "\n", "## [Fuzzy Tiling Activations](https://nn.labml.ai/activations/fta/index.html)\n", "\n", "Here we train a transformer that uses [Fuzzy Tiling Activation](https://nn.labml.ai/activations/fta/index.html) in the\n", "[Feed-Forward Network](https://nn.labml.ai/transformers/feed_forward.html).\n", "We use it in a language model and train it on the Tiny Shakespeare dataset\n", "for demonstration.\n", "However, this is probably not the ideal task for FTA; we\n", "believe FTA is better suited to modeling data with continuous variables." 
] }, { "cell_type": "markdown", "metadata": { "id": "AahG_i2y5tY9", "pycharm": { "name": "#%% md\n" } }, "source": [ "### Install the packages" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ZCzmCrAIVg0L", "outputId": "cf107fb2-4d50-4c67-af34-367624553421", "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "!pip install labml-nn --quiet" ] }, { "cell_type": "markdown", "metadata": { "id": "SE2VUQ6L5zxI", "pycharm": { "name": "#%% md\n" } }, "source": [ "### Imports" ] }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "import torch\n", "import torch.nn as nn\n", "\n", "from labml import experiment\n", "from labml.configs import option\n", "from labml_nn.activations.fta.experiment import Configs" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Create an experiment" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "experiment.create(name=\"fta\", writers={'screen'})" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "### Configurations" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "conf = Configs()" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "source": [ "Set the experiment configurations and pass a dictionary of values to override the defaults" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%% md\n" } } }, { "cell_type": "code", "execution_count": null, "outputs": [], "source": [ "experiment.configs(conf, {\n", " 'tokenizer': 'character',\n", " 'prompt_separator': '',\n", " 'prompt': 'It is ',\n", " 'text': 'tiny_shakespeare',\n", "\n", " 
'seq_len': 256,\n", " 'epochs': 32,\n", " 'batch_size': 16,\n", " 'inner_iterations': 10,\n", "\n", " 'optimizer.optimizer': 'Adam',\n", " 'optimizer.learning_rate': 3e-4,\n", "})" ], "metadata": { "collapsed": false, "pycharm": { "name": "#%%\n" } } }, { "cell_type": "markdown", "metadata": { "id": "EvI7MtgJ61w5", "pycharm": { "name": "#%% md\n" } }, "source": [ "Register the PyTorch model so it can be saved and loaded" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 255 }, "id": "GDlt7dp-5ALt", "outputId": "e7548e8f-c541-4618-dc5a-1597cae42003", "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "experiment.add_pytorch_models({'model': conf.model})" ] }, { "cell_type": "markdown", "metadata": { "id": "KJZRf8527GxL", "pycharm": { "name": "#%% md\n" } }, "source": [ "### Start the experiment and run the training loop" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 1000 }, "id": "aIAWo7Fw5DR8", "outputId": "db979785-bfe3-4eda-d3eb-8ccbe61053e5", "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# Start the experiment\n", "with experiment.start():\n", " conf.run()" ] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [], "name": "FTA", "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" } }, "nbformat": 4, "nbformat_minor": 4 }