|
ebb94842db
|
cleanup save checkpoint
|
2025-07-20 09:13:11 +05:30 |
|
|
5eecda7e28
|
cleanup log activations
|
2025-07-20 09:10:05 +05:30 |
|
|
a713c92b82
|
cleanup hook model outputs
|
2025-07-20 09:02:34 +05:30 |
|
|
5bdedcffec
|
remove labml_helpers dep
|
2025-07-20 08:56:03 +05:30 |
|
|
b1ba92c166
|
Merge pull request #280 from kommentlezz/patch-1
Fix typo: language
|
2025-07-18 10:46:35 +05:30 |
|
|
47d4231a73
|
seq length in rope
|
2025-07-18 10:43:38 +05:30 |
|
|
f6d77c36b2
|
cleanup some unused imports
|
2025-07-18 10:40:22 +05:30 |
|
|
1b702523b9
|
remove labml_helpers dependency: replace Module with nn.Module
|
2025-07-18 10:32:36 +05:30 |
|
|
dd1d51ae82
|
Fix typo: language
|
2024-12-24 16:57:38 +05:00 |
|
|
90e21b5a36
|
Merge pull request #273 from lakshith-403/LoRA
LoRA rename configs to trainer
|
2024-08-24 14:48:59 +05:30 |
|
|
309a071b8f
|
add labml_nn install command
|
2024-08-24 14:47:21 +05:30 |
|
|
f57372033f
|
rename configs to trainer
|
2024-08-24 14:45:51 +05:30 |
|
|
6df1d798c0
|
version
|
2024-08-24 14:34:44 +05:30 |
|
|
49ea8f06cb
|
paperswithcode.com list
|
2024-08-24 10:56:04 +05:30 |
|
|
3e6a4eca80
|
LoRA Chinese docs
|
2024-08-24 10:52:30 +05:30 |
|
|
5731bff586
|
LoRA docs
|
2024-08-24 10:50:02 +05:30 |
|
|
789c31a669
|
Merge pull request #272 from lakshith-403/LoRA
LoRA minor fixes
|
2024-08-24 10:43:19 +05:30 |
|
|
9485eec3c4
|
fix mapping to match renamed layers
|
2024-08-23 22:26:22 +05:30 |
|
|
2fb0e22eb1
|
typo fix
|
2024-08-23 22:26:08 +05:30 |
|
|
24bd64af7c
|
check loaded weights
|
2024-08-21 09:54:53 +05:30 |
|
|
9e1b35716d
|
LoRA GPT2 n_heads fix and notes
|
2024-08-18 17:04:58 +05:30 |
|
|
b260349c68
|
LoRA GPT2 n_heads fix and notes
|
2024-08-18 16:34:13 +05:30 |
|
|
012fc7f0f0
|
LoRA GPT2 n_heads fix and notes
|
2024-08-18 16:25:21 +05:30 |
|
|
d5768ba423
|
LoRA typo fix
|
2024-08-18 15:07:56 +05:30 |
|
|
9dd97ff11a
|
LoRA transpose
|
2024-08-18 14:37:14 +05:30 |
|
|
ce21dcf76c
|
LoRA experiment
|
2024-08-18 14:26:33 +05:30 |
|
|
cf755ec9e2
|
Merge pull request #271 from lakshith-403/LoRA
LoRA minor updates
|
2024-08-18 14:16:39 +05:30 |
|
|
3349afdcf5
|
simplify loop def in training
|
2024-08-18 01:06:06 +05:30 |
|
|
863772e04a
|
rename layers
|
2024-08-18 01:04:04 +05:30 |
|
|
f3465ac926
|
Chinese translation
|
2024-08-16 16:35:25 +05:30 |
|
|
edf875aa70
|
LoRA experiment notes
|
2024-08-16 16:25:19 +05:30 |
|
|
d69f1c1058
|
LoRA experiment notes
|
2024-08-16 16:14:52 +05:30 |
|
|
64959fdff9
|
Merge pull request #267 from pengchzn/master
Refine and fix Chinese typo
|
2024-08-16 15:45:01 +05:30 |
|
|
5d384d6be7
|
Merge pull request #268 from lakshith-403/LoRA
LoRA experiment
|
2024-08-07 09:51:17 +05:30 |
|
|
61d32f4696
|
create LoRA experiment
- remove global configs
- Do weight loading inside experiment
- remove train and transform notebooks
|
2024-08-07 09:49:35 +05:30 |
|
|
4aa1bdb810
|
Merge remote-tracking branch 'origin/master'
# Conflicts:
# translate_cache/transformers/feed_forward.zh.json
|
2024-08-06 16:21:51 +08:00 |
|
|
dc26e6c06d
|
Fix Chinese typo
|
2024-08-06 16:12:04 +08:00 |
|
|
d4af40b595
|
LoRA notes
|
2024-08-03 16:59:15 +05:30 |
|
|
eb9337e949
|
Clean up LoRA
|
2024-08-02 15:33:45 +05:30 |
|
|
dc4762161d
|
Clean up LoRA
|
2024-08-02 15:32:02 +05:30 |
|
|
957ade6d67
|
Merge pull request #266 from lakshith-403/LoRA
|
2024-07-31 21:06:28 +05:30 |
|
|
bc32b507ea
|
clear notebook outputs
|
2024-07-31 20:39:46 +05:30 |
|
|
77d00f089b
|
Add LoRA to GPT2
|
2024-07-31 18:29:24 +05:30 |
|
|
0f2a9be6d2
|
training loop
|
2024-07-29 23:01:06 +05:30 |
|
|
23b7e2ee8e
|
create experiment notebook and refactoring
|
2024-07-29 19:41:24 +05:30 |
|
|
c82529ce67
|
move LoRA to labml.nn
|
2024-07-29 11:17:38 +05:30 |
|
|
8e756f292b
|
lora layers
|
2024-07-28 11:22:27 +05:30 |
|
|
d1e8daa121
|
replace Conv1D layers with linear
|
2024-07-28 08:51:03 +05:30 |
|
|
50c3cc4eab
|
keep only required configs
|
2024-07-27 22:01:21 +05:30 |
|
|
106e72605d
|
remove dropout layers
|
2024-07-27 21:30:15 +05:30 |
|