9262c57f18
flash attention
2025-08-08 19:57:57 +05:30
3dd36b80b3
all comments
2025-08-01 14:14:00 +05:30
73b9892be6
all comments
2025-08-01 13:56:39 +05:30
eb5c004fac
sm scale log2
2025-08-01 13:26:48 +05:30
5a8182d21b
backward pass formulas
2025-08-01 13:24:57 +05:30
a9b5c923eb
backward pass formulas
2025-07-31 17:15:26 +05:30
0ae6e6ae2a
flash comments
2025-07-31 14:49:37 +05:30
1bc2a69803
flash comments
2025-07-31 09:53:14 +05:30
d7d63e1f83
triton flash wip
2025-07-30 14:20:28 +05:30
ebb94842db
cleanup save checkpoint
2025-07-20 09:13:11 +05:30
5eecda7e28
cleanup log activations
2025-07-20 09:10:05 +05:30
a713c92b82
cleanup hook model outputs
2025-07-20 09:02:34 +05:30
5bdedcffec
remove labml_helpers dep
2025-07-20 08:56:03 +05:30
b1ba92c166
Merge pull request #280 from kommentlezz/patch-1
Fix typo: language
2025-07-18 10:46:35 +05:30
47d4231a73
seq length in rope
2025-07-18 10:43:38 +05:30
f6d77c36b2
cleanup some unused imports
2025-07-18 10:40:22 +05:30
1b702523b9
remove labml_helpers dependency: replace Module with nn.Module
2025-07-18 10:32:36 +05:30
dd1d51ae82
Fix typo: language
2024-12-24 16:57:38 +05:00
309a071b8f
add labml_nn install command
2024-08-24 14:47:21 +05:30
f57372033f
rename configs to trainer
2024-08-24 14:45:51 +05:30
5731bff586
LoRA docs
2024-08-24 10:50:02 +05:30
9485eec3c4
fix mapping to match renamed layers
2024-08-23 22:26:22 +05:30
2fb0e22eb1
typo fix
2024-08-23 22:26:08 +05:30
24bd64af7c
check loaded weights
2024-08-21 09:54:53 +05:30
b260349c68
LoRA GPT2 n_heads fix and notes
2024-08-18 16:34:13 +05:30
012fc7f0f0
LoRA GPT2 n_heads fix and notes
2024-08-18 16:25:21 +05:30
d5768ba423
LoRA typo fix
2024-08-18 15:07:56 +05:30
9dd97ff11a
LoRA transpose
2024-08-18 14:37:14 +05:30
ce21dcf76c
LoRA experiment
2024-08-18 14:26:33 +05:30
3349afdcf5
simplify loop def in training
2024-08-18 01:06:06 +05:30
863772e04a
rename layers
2024-08-18 01:04:04 +05:30
d69f1c1058
LoRA experiment notes
2024-08-16 16:14:52 +05:30
61d32f4696
create LoRA experiment
- remove global configs
- Do weight loading inside experiment
- remove train and transform notebooks
2024-08-07 09:49:35 +05:30
d4af40b595
LoRA notes
2024-08-03 16:59:15 +05:30
eb9337e949
Clean up LoRA
2024-08-02 15:33:45 +05:30
dc4762161d
Clean up LoRA
2024-08-02 15:32:02 +05:30
bc32b507ea
clear notebook outputs
2024-07-31 20:39:46 +05:30
77d00f089b
Add LoRA to GPT2
2024-07-31 18:29:24 +05:30
0f2a9be6d2
training loop
2024-07-29 23:01:06 +05:30
23b7e2ee8e
create experiment notebook and refactoring
2024-07-29 19:41:24 +05:30
c82529ce67
move LoRA to labml.nn
2024-07-29 11:17:38 +05:30
66e92edb04
Fix typo in Wasserstein GAN
2024-07-15 13:06:40 +08:00
391fa39167
cleanup notebooks
2024-06-24 16:17:09 +05:30
20494ae94c
fix gae formula
2024-06-24 15:58:03 +05:30
09d09379c2
fix value pe double rotation
2024-06-20 12:53:09 +05:30
2236f6383c
fix rope test code
2024-06-20 12:49:27 +05:30
cf565bcc1d
cleanup
2024-06-18 11:09:02 +05:30
418d1ec44a
RWKV docs
2024-03-17 17:47:39 +05:30
df9e1af615
RWKV docs
2024-03-17 17:45:08 +05:30
7db6e92376
RWKV (#222)
* rwkv-init
* annotations
* Re-added docs
* make dir if not exist
* Add RWKV paper and update doc index
* add train loop
* experiment
---------
Co-authored-by: Jacob Hatef <hatef.4@buckeyemail.osu.edu>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-03-17 17:36:15 +05:30