9262c57f18
flash attention
2025-08-08 19:57:57 +05:30
3dd36b80b3
all comments
2025-08-01 14:14:00 +05:30
73b9892be6
all comments
2025-08-01 13:56:39 +05:30
eb5c004fac
sm scale log2
2025-08-01 13:26:48 +05:30
5a8182d21b
backward pass formulas
2025-08-01 13:24:57 +05:30
a9b5c923eb
backward pass formulas
2025-07-31 17:15:26 +05:30
0ae6e6ae2a
flash comments
2025-07-31 14:49:37 +05:30
1bc2a69803
flash comments
2025-07-31 09:53:14 +05:30
d7d63e1f83
triton flash wip
2025-07-30 14:20:28 +05:30
ebb94842db
cleanup save checkpoint
2025-07-20 09:13:11 +05:30
5eecda7e28
cleanup log activations
2025-07-20 09:10:05 +05:30
a713c92b82
cleanup hook model outputs
2025-07-20 09:02:34 +05:30
5bdedcffec
remove labml_helpers dep
2025-07-20 08:56:03 +05:30
47d4231a73
seq length in rope
2025-07-18 10:43:38 +05:30
f6d77c36b2
cleanup some unused imports
2025-07-18 10:40:22 +05:30
1b702523b9
remove labml_helpers dependency: replace Module with nn.Module
2025-07-18 10:32:36 +05:30
dc4762161d
Clean up LoRA
2024-08-02 15:32:02 +05:30
bc32b507ea
clear notebook outputs
2024-07-31 20:39:46 +05:30
77d00f089b
Add LoRA to GPT2
2024-07-31 18:29:24 +05:30
0f2a9be6d2
training loop
2024-07-29 23:01:06 +05:30
23b7e2ee8e
create experiment notebook and refactoring
2024-07-29 19:41:24 +05:30
c82529ce67
move LoRA to labml.nn
2024-07-29 11:17:38 +05:30
391fa39167
cleanup notebooks
2024-06-24 16:17:09 +05:30
09d09379c2
fix value pe double rotation
2024-06-20 12:53:09 +05:30
2236f6383c
fix rope test code
2024-06-20 12:49:27 +05:30
cf565bcc1d
cleanup
2024-06-18 11:09:02 +05:30
5ec0f70855
Fix formula typo in Relative MHA (#242)
${(\textcolor{lightgreen}{\mathbf{A + C}})}_{i,j} = Q_i^\top K_j + \textcolor{orange}{v^\top} K_j$
2024-03-02 14:19:06 +05:30
bc5565b84c
Fix a typo in the formula of RoPE
2023-12-08 15:50:21 +01:00
36a374ed76
Merge pull request #226 from MrYxJ/patch-1
Fix a typo in the formula of ALiBi.
2023-11-17 17:42:17 +00:00
830161b299
Update __init__.py
This formula is wrong, there is one symbol '-' missing in front of the 1, which will affect people's understanding when reading. What is expressed here is that the position of the ith token is increasing from -(i-1) to 0, so it should be -1.
2023-11-14 00:30:26 +08:00
4d922e838f
Add backticks to mask'shape
2023-11-10 19:51:37 +08:00
f42c0e9cf4
right shift example comment fix
2023-11-07 09:28:22 +00:00
ffafaf1df7
fix: fix cls_token bug in vit.
2023-11-06 11:28:45 +08:00
9a42ac2697
arxiv.org links
2023-10-24 14:42:32 +01:00
8db330dd22
sophia-g docs
2023-07-14 21:25:08 +05:30
7c02294e7c
sophia exp
2023-07-14 16:44:45 +05:30
2eccd8bec6
Add the missing negative sign in the formula.
2023-06-28 06:52:45 +08:00
c5685c9ffe
remove app.labml.ai links
2023-04-02 12:10:18 +05:30
97e53c0f3d
fix glu variants links
2023-04-02 12:00:23 +05:30
3ec5fa9f3d
fix typo mha
2022-12-24 12:53:36 +00:00
594b525e9d
rm comet
2022-09-07 09:31:38 +05:30
0bfb210671
fix vit pe
2022-08-10 15:56:16 +05:30
72669d0526
ALiBi (#134)
2022-07-17 09:28:32 +05:30
b6bef1d2fe
cleanup
2022-07-02 14:31:16 +05:30
ab4264cbda
comet links fix
2022-07-02 14:25:27 +05:30
ee5a34aa59
experiment links transformer
2022-06-28 19:02:20 +05:30
e09ee89f36
Transformer experiment logs (#130)
2022-06-27 14:11:44 +05:30
0ce65adf9e
RoPER (#126)
2022-06-03 21:29:41 +05:30
6a41c82b30
FTA (#115)
2022-05-23 22:26:39 +05:30
0ea2853d28
transformer colab with comet.ml
2022-05-05 16:33:37 +01:00