850 Commits

SHA1 Message Date
4752644737 all comments 2025-08-01 15:50:27 +05:30
3dd36b80b3 all comments 2025-08-01 14:14:00 +05:30
73b9892be6 all comments 2025-08-01 13:56:39 +05:30
eb5c004fac sm scale log2 2025-08-01 13:26:48 +05:30
5a8182d21b backward pass formulas 2025-08-01 13:24:57 +05:30
a9b5c923eb backward pass formulas 2025-07-31 17:15:26 +05:30
0ae6e6ae2a flash comments 2025-07-31 14:49:37 +05:30
1bc2a69803 flash comments 2025-07-31 09:53:14 +05:30
c4d2e8cd22 docs 2025-07-31 08:48:07 +05:30
00f8714843 Merge remote-tracking branch 'origin/master' 2025-07-30 14:22:38 +05:30
d7d63e1f83 triton flash wip 2025-07-30 14:20:28 +05:30
0fcb29d8b1 Update readme.md 2025-07-20 10:40:58 +05:30
ebb94842db cleanup save checkpoint 2025-07-20 09:13:11 +05:30
5eecda7e28 cleanup log activations 2025-07-20 09:10:05 +05:30
a713c92b82 cleanup hook model outputs 2025-07-20 09:02:34 +05:30
5bdedcffec remove labml_helpers dep 2025-07-20 08:56:03 +05:30
b1ba92c166 Merge pull request #280 from kommentlezz/patch-1
Fix typo: language
2025-07-18 10:46:35 +05:30
47d4231a73 seq length in rope 2025-07-18 10:43:38 +05:30
f6d77c36b2 cleanup some unused imports 2025-07-18 10:40:22 +05:30
1b702523b9 remove labml_helpers dependency: replace Module with nn.Module 2025-07-18 10:32:36 +05:30
dd1d51ae82 Fix typo: language 2024-12-24 16:57:38 +05:00
90e21b5a36 Merge pull request #273 from lakshith-403/LoRA
LoRA rename configs to trainer
2024-08-24 14:48:59 +05:30
309a071b8f add labml_nn install command 2024-08-24 14:47:21 +05:30
f57372033f rename configs to trainer 2024-08-24 14:45:51 +05:30
6df1d798c0 version 2024-08-24 14:34:44 +05:30
49ea8f06cb paperswithcode.com list 2024-08-24 10:56:04 +05:30
3e6a4eca80 LoRA Chinese docs 2024-08-24 10:52:30 +05:30
5731bff586 LoRA docs 2024-08-24 10:50:02 +05:30
789c31a669 Merge pull request #272 from lakshith-403/LoRA
LoRA minor fixes
2024-08-24 10:43:19 +05:30
9485eec3c4 fix mapping to match renamed layers 2024-08-23 22:26:22 +05:30
2fb0e22eb1 typo fix 2024-08-23 22:26:08 +05:30
24bd64af7c check loaded weights 2024-08-21 09:54:53 +05:30
9e1b35716d LoRA GPT2 n_heads fix and notes 2024-08-18 17:04:58 +05:30
b260349c68 LoRA GPT2 n_heads fix and notes 2024-08-18 16:34:13 +05:30
012fc7f0f0 LoRA GPT2 n_heads fix and notes 2024-08-18 16:25:21 +05:30
d5768ba423 LoRA typo fix 2024-08-18 15:07:56 +05:30
9dd97ff11a LoRA transpose 2024-08-18 14:37:14 +05:30
ce21dcf76c LoRA experiment 2024-08-18 14:26:33 +05:30
cf755ec9e2 Merge pull request #271 from lakshith-403/LoRA
LoRA minor updates
2024-08-18 14:16:39 +05:30
3349afdcf5 simplify loop def in training 2024-08-18 01:06:06 +05:30
863772e04a rename layers 2024-08-18 01:04:04 +05:30
f3465ac926 Chineese translation 2024-08-16 16:35:25 +05:30
edf875aa70 LoRA experiment notes 2024-08-16 16:25:19 +05:30
d69f1c1058 LoRA experiment notes 2024-08-16 16:14:52 +05:30
64959fdff9 Merge pull request #267 from pengchzn/master
Refine and fix Chinese typo
2024-08-16 15:45:01 +05:30
5d384d6be7 Merge pull request #268 from lakshith-403/LoRA
LoRA experiment
2024-08-07 09:51:17 +05:30
61d32f4696 create LoRA experiment
- remove global configs
- Do weight loading inside experiment
- remove train and transform notebooks
2024-08-07 09:49:35 +05:30
4aa1bdb810 Merge remote-tracking branch 'origin/master'
# Conflicts:
#	translate_cache/transformers/feed_forward.zh.json
2024-08-06 16:21:51 +08:00
dc26e6c06d Fix Chinese typo 2024-08-06 16:12:04 +08:00
d4af40b595 LoRA notes 2024-08-03 16:59:15 +05:30