Commit Graph

  • 4752644737 all comments master Varuna Jayasiri 2025-08-01 15:50:27 +05:30
  • 3dd36b80b3 all comments Varuna Jayasiri 2025-08-01 14:14:00 +05:30
  • 73b9892be6 all comments Varuna Jayasiri 2025-08-01 13:56:39 +05:30
  • eb5c004fac sm scale log2 Varuna Jayasiri 2025-08-01 13:26:48 +05:30
  • 5a8182d21b backward pass formulas Varuna Jayasiri 2025-08-01 13:24:57 +05:30
  • a9b5c923eb backward pass formulas Varuna Jayasiri 2025-07-31 17:15:26 +05:30
  • 0ae6e6ae2a flash comments Varuna Jayasiri 2025-07-31 14:49:37 +05:30
  • 1bc2a69803 flash comments Varuna Jayasiri 2025-07-31 09:53:14 +05:30
  • c4d2e8cd22 docs Varuna Jayasiri 2025-07-31 08:48:07 +05:30
  • 00f8714843 Merge remote-tracking branch 'origin/master' Varuna Jayasiri 2025-07-30 14:22:38 +05:30
  • d7d63e1f83 triton flash wip Varuna Jayasiri 2025-07-30 14:20:28 +05:30
  • 0fcb29d8b1 Update readme.md Varuna Jayasiri 2025-07-20 10:40:58 +05:30
  • ebb94842db cleanup save checkpoint Varuna Jayasiri 2025-07-20 09:13:11 +05:30
  • 5eecda7e28 cleanup log activations Varuna Jayasiri 2025-07-20 09:10:05 +05:30
  • a713c92b82 cleanup hook model outputs Varuna Jayasiri 2025-07-20 09:02:34 +05:30
  • 5bdedcffec remove labml_helpers dep Varuna Jayasiri 2025-07-20 08:56:03 +05:30
  • b1ba92c166 Merge pull request #280 from kommentlezz/patch-1 Varuna Jayasiri 2025-07-18 10:46:35 +05:30
  • 47d4231a73 seq length in rope Varuna Jayasiri 2025-07-18 10:43:38 +05:30
  • f6d77c36b2 cleanup some unused imports Varuna Jayasiri 2025-07-18 10:40:22 +05:30
  • 1b702523b9 remove labml_helpers dependency: replace Module with nn.Module Varuna Jayasiri 2025-07-18 10:32:36 +05:30
  • dd1d51ae82 Fix typo: language #280 Pavel Dmitriev 2024-12-24 16:57:38 +05:00
  • 90e21b5a36 Merge pull request #273 from lakshith-403/LoRA Varuna Jayasiri 2024-08-24 14:48:59 +05:30
  • 309a071b8f add labml_nn install command #273 lakshith 2024-08-24 14:47:21 +05:30
  • f57372033f rename configs to trainer lakshith 2024-08-24 14:45:51 +05:30
  • 6df1d798c0 version Varuna Jayasiri 2024-08-24 14:34:44 +05:30
  • 49ea8f06cb paperswithcode.com list Varuna Jayasiri 2024-08-24 10:56:04 +05:30
  • 3e6a4eca80 LoRA Chinese docs Varuna Jayasiri 2024-08-24 10:52:30 +05:30
  • 5731bff586 LoRA docs Varuna Jayasiri 2024-08-24 10:50:02 +05:30
  • 789c31a669 Merge pull request #272 from lakshith-403/LoRA Varuna Jayasiri 2024-08-24 10:43:19 +05:30
  • 9485eec3c4 fix mapping to match renamed layers #272 lakshith 2024-08-23 22:26:22 +05:30
  • 2fb0e22eb1 typo fix lakshith 2024-08-23 22:26:08 +05:30
  • 24bd64af7c check loaded weights lakshith 2024-08-21 09:54:53 +05:30
  • 9e1b35716d LoRA GPT2 n_heads fix and notes Varuna Jayasiri 2024-08-18 17:04:58 +05:30
  • b260349c68 LoRA GPT2 n_heads fix and notes Varuna Jayasiri 2024-08-18 16:34:13 +05:30
  • 012fc7f0f0 LoRA GPT2 n_heads fix and notes Varuna Jayasiri 2024-08-18 16:25:21 +05:30
  • d5768ba423 LoRA typo fix Varuna Jayasiri 2024-08-18 15:07:56 +05:30
  • 9dd97ff11a LoRA transpose Varuna Jayasiri 2024-08-18 14:37:14 +05:30
  • ce21dcf76c LoRA experiment Varuna Jayasiri 2024-08-18 14:26:33 +05:30
  • cf755ec9e2 Merge pull request #271 from lakshith-403/LoRA Varuna Jayasiri 2024-08-18 14:16:39 +05:30
  • 3349afdcf5 simplify loop def in training #271 lakshith 2024-08-18 01:06:06 +05:30
  • 863772e04a rename layers lakshith 2024-08-18 01:04:04 +05:30
  • f3465ac926 Chineese translation Varuna Jayasiri 2024-08-16 16:35:25 +05:30
  • edf875aa70 LoRA experiment notes Varuna Jayasiri 2024-08-16 16:25:19 +05:30
  • d69f1c1058 LoRA experiment notes Varuna Jayasiri 2024-08-16 16:14:52 +05:30
  • 64959fdff9 Merge pull request #267 from pengchzn/master Varuna Jayasiri 2024-08-16 15:45:01 +05:30
  • 5d384d6be7 Merge pull request #268 from lakshith-403/LoRA Varuna Jayasiri 2024-08-07 09:51:17 +05:30
  • 61d32f4696 create LoRA experiment #268 lakshith 2024-08-07 09:49:35 +05:30
  • 4aa1bdb810 Merge remote-tracking branch 'origin/master' #267 pengchzn 2024-08-06 16:21:51 +08:00
  • dc26e6c06d Fix Chinese typo pengchzn 2024-08-06 16:12:04 +08:00
  • d4af40b595 LoRA notes Varuna Jayasiri 2024-08-03 16:59:15 +05:30
  • eb9337e949 Clean up LoRA Varuna Jayasiri 2024-08-02 15:33:45 +05:30
  • dc4762161d Clean up LoRA Varuna Jayasiri 2024-08-02 15:32:02 +05:30
  • 957ade6d67 Merge pull request #266 from lakshith-403/LoRA Varuna Jayasiri 2024-07-31 21:06:28 +05:30
  • bc32b507ea clear notebook outputs #266 lakshith 2024-07-31 20:39:46 +05:30
  • 77d00f089b Add LoRA to GPT2 lakshith 2024-07-31 18:29:24 +05:30
  • 0f2a9be6d2 training loop lakshith 2024-07-29 23:01:06 +05:30
  • 23b7e2ee8e create experiment notebook and refactoring lakshith 2024-07-29 19:40:39 +05:30
  • c82529ce67 move LoRA to labml.nn lakshith 2024-07-29 11:17:38 +05:30
  • 8e756f292b lora layers Varuna Jayasiri 2024-07-28 11:22:27 +05:30
  • d1e8daa121 replace convo1D layers with linear lakshith 2024-07-28 08:51:03 +05:30
  • 50c3cc4eab keep only required configs lakshith 2024-07-27 22:01:21 +05:30
  • 106e72605d remove droput layers lakshith 2024-07-27 21:30:15 +05:30
  • b3aedf3093 remove gelu custom impl and use pytorch impl lakshith 2024-07-27 21:28:07 +05:30
  • cbc38bb26b GPT 2 implementation lakshith 2024-07-26 09:41:13 +05:30
  • 89a3ae8882 Merge pull request #264 from Seas0/patch-1 Varuna Jayasiri 2024-07-16 11:11:46 +05:30
  • 66e92edb04 Fix typo in Wasserstein GAN #264 Seas0 2024-07-15 13:06:40 +08:00
  • 7d7863c080 Fix Chinese typo pengchzn 2024-07-09 15:31:57 +08:00
  • f6e913eb09 transformer mha chinese translation Varuna Jayasiri 2024-06-27 19:35:37 +05:30
  • d3f0bd305a Merge pull request #259 from pengchzn/master (Transformer MHA Chinese Translation) Varuna Jayasiri 2024-06-27 19:28:45 +05:30
  • e03dbc17b6 Refine Chinese translation #259 pengchzn 2024-06-26 19:03:38 +08:00
  • 1446bb124a Refine Chinese translation pengchzn 2024-06-25 21:49:35 +08:00
  • 730046c9c1 Merge branch 'labmlai:master' into master Peng Chen 2024-06-25 21:49:12 +08:00
  • 391fa39167 cleanup notebooks Varuna Jayasiri 2024-06-24 16:17:09 +05:30
  • 26e64a8827 zh Varuna Jayasiri 2024-06-24 15:59:56 +05:30
  • 20494ae94c fix gae formula Varuna Jayasiri 2024-06-24 15:58:03 +05:30
  • a78ca14532 refine translation of /__init__.zh.json pengchzn 2024-06-23 11:42:50 +08:00
  • 4699c514f5 refine translation of /transformers/__init__.zh.json pengchzn 2024-06-23 11:42:41 +08:00
  • d858f2eec0 remove tranding papers link Varuna Jayasiri 2024-06-21 19:35:22 +05:30
  • 0bb4be3ff9 zh translation Varuna Jayasiri 2024-06-21 19:28:14 +05:30
  • a631e73b42 Merge pull request #258 from pengchzn/master Varuna Jayasiri 2024-06-21 19:20:22 +05:30
  • 7ad78f40a0 Merge branch 'master' into master #258 Varuna Jayasiri 2024-06-21 19:19:11 +05:30
  • bf8a491250 chineese translation Varuna Jayasiri 2024-06-21 19:09:13 +05:30
  • f00ba4a70f paper url fix Varuna Jayasiri 2024-06-21 19:01:16 +05:30
  • df09205605 Refine Chinese translation pengchzn 2024-06-21 13:52:15 +08:00
  • 09d09379c2 fix value pe double rotation Varuna Jayasiri 2024-06-20 12:53:09 +05:30
  • 2236f6383c fix rope test code Varuna Jayasiri 2024-06-20 12:49:27 +05:30
  • cf565bcc1d cleanup Varuna Jayasiri 2024-06-18 11:09:02 +05:30
  • 999f2036a5 RWKV docs Varuna Jayasiri 2024-03-17 17:47:51 +05:30
  • 418d1ec44a RWKV docs Varuna Jayasiri 2024-03-17 17:47:39 +05:30
  • df9e1af615 RWKV docs Varuna Jayasiri 2024-03-17 17:45:08 +05:30
  • 7db6e92376 RWKV (#222) Jacob Hatef 2024-03-17 08:06:15 -04:00
  • 285cb3735b uodate docs Varuna Jayasiri 2024-03-02 14:33:53 +05:30
  • 5ec0f70855 Fix formula typo in Relative MHA (#242) f-hy 2024-03-02 16:49:06 +08:00
  • fea91b9699 Cleanup group norm Cifar experiment (#240) f-hy 2024-03-02 16:47:39 +08:00
  • 84ad3f9783 Update unet.py (#239) fix typo Saqib Ameen 2024-03-02 01:43:34 -07:00
  • a0679ecd90 title Varuna Jayasiri 2024-01-12 13:21:54 +05:30
  • 84fab839c2 fix typo chineese translation Varuna Jayasiri 2024-01-12 13:19:14 +05:30
  • 45dc127061 Merge pull request #235 from qiangxinglin/master Varuna Jayasiri 2024-01-12 13:18:50 +05:30
  • 81cf808d05 rope typo Varuna Jayasiri 2024-01-12 13:17:39 +05:30
  • 083988f411 Merge pull request #232 from Etienne248/patch-1 Varuna Jayasiri 2024-01-12 13:17:19 +05:30