64959fdff9
Merge pull request #267 from pengchzn/master
...
Refine and fix Chinese typo
2024-08-16 15:45:01 +05:30
5d384d6be7
Merge pull request #268 from lakshith-403/LoRA
...
LoRA experiment
2024-08-07 09:51:17 +05:30
61d32f4696
create LoRA experiment
...
- remove global configs
- Do weight loading inside experiment
- remove train and transform notebooks
2024-08-07 09:49:35 +05:30
4aa1bdb810
Merge remote-tracking branch 'origin/master'
...
# Conflicts:
# translate_cache/transformers/feed_forward.zh.json
2024-08-06 16:21:51 +08:00
dc26e6c06d
Fix Chinese typo
2024-08-06 16:12:04 +08:00
d4af40b595
LoRA notes
2024-08-03 16:59:15 +05:30
eb9337e949
Clean up LoRA
2024-08-02 15:33:45 +05:30
dc4762161d
Clean up LoRA
2024-08-02 15:32:02 +05:30
957ade6d67
Merge pull request #266 from lakshith-403/LoRA
2024-07-31 21:06:28 +05:30
bc32b507ea
clear notebook outputs
2024-07-31 20:39:46 +05:30
77d00f089b
Add LoRA to GPT2
2024-07-31 18:29:24 +05:30
0f2a9be6d2
training loop
2024-07-29 23:01:06 +05:30
23b7e2ee8e
create experiment notebook and refactoring
2024-07-29 19:41:24 +05:30
c82529ce67
move LoRA to labml.nn
2024-07-29 11:17:38 +05:30
8e756f292b
lora layers
2024-07-28 11:22:27 +05:30
d1e8daa121
replace Conv1D layers with linear
2024-07-28 08:51:03 +05:30
50c3cc4eab
keep only required configs
2024-07-27 22:01:21 +05:30
106e72605d
remove dropout layers
2024-07-27 21:30:15 +05:30
b3aedf3093
remove gelu custom impl and use pytorch impl
2024-07-27 21:28:07 +05:30
cbc38bb26b
GPT 2 implementation
2024-07-26 09:41:13 +05:30
89a3ae8882
Merge pull request #264 from Seas0/patch-1
2024-07-16 11:11:46 +05:30
66e92edb04
Fix typo in Wasserstein GAN
2024-07-15 13:06:40 +08:00
7d7863c080
Fix Chinese typo
2024-07-09 15:31:57 +08:00
f6e913eb09
transformer MHA Chinese translation
2024-06-27 19:35:37 +05:30
d3f0bd305a
Merge pull request #259 from pengchzn/master (Transformer MHA Chinese Translation)
...
Refine Chinese translation
2024-06-27 19:28:45 +05:30
e03dbc17b6
Refine Chinese translation
2024-06-26 19:03:38 +08:00
1446bb124a
Refine Chinese translation
2024-06-25 21:49:51 +08:00
730046c9c1
Merge branch 'labmlai:master' into master
2024-06-25 21:49:12 +08:00
391fa39167
cleanup notebooks
2024-06-24 16:17:09 +05:30
26e64a8827
zh
2024-06-24 15:59:56 +05:30
20494ae94c
fix gae formula
2024-06-24 15:58:03 +05:30
a78ca14532
refine translation of /__init__.zh.json
2024-06-23 11:42:50 +08:00
4699c514f5
refine translation of /transformers/__init__.zh.json
2024-06-23 11:42:41 +08:00
d858f2eec0
remove trending papers link
2024-06-21 19:35:22 +05:30
0bb4be3ff9
zh translation
2024-06-21 19:28:14 +05:30
a631e73b42
Merge pull request #258 from pengchzn/master
...
Refine Chinese translation
2024-06-21 19:20:22 +05:30
7ad78f40a0
Merge branch 'master' into master
2024-06-21 19:19:11 +05:30
bf8a491250
Chinese translation
2024-06-21 19:09:13 +05:30
f00ba4a70f
paper url fix
2024-06-21 19:01:16 +05:30
df09205605
Refine Chinese translation
2024-06-21 13:52:15 +08:00
09d09379c2
fix value pe double rotation
2024-06-20 12:53:09 +05:30
2236f6383c
fix rope test code
2024-06-20 12:49:27 +05:30
cf565bcc1d
cleanup
2024-06-18 11:09:02 +05:30
999f2036a5
RWKV docs
2024-03-17 17:47:51 +05:30
418d1ec44a
RWKV docs
2024-03-17 17:47:39 +05:30
df9e1af615
RWKV docs
2024-03-17 17:45:08 +05:30
7db6e92376
RWKV ( #222 )
...
* rwkv-init
* annotations
* Re-added docs
* make dir if not exist
* Add RWKV paper and update doc index
* add train loop
* experiment
---------
Co-authored-by: Jacob Hatef <hatef.4@buckeyemail.buckeyemail.osu.edu>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
2024-03-17 17:36:15 +05:30
285cb3735b
update docs
2024-03-02 14:33:53 +05:30
5ec0f70855
Fix formula typo in Relative MHA ( #242 )
...
${(\textcolor{lightgreen}{\mathbf{A + C}})}_{i,j} = Q_i^\top K_j + \textcolor{orange}{v^\top} K_j$
2024-03-02 14:19:06 +05:30
fea91b9699
Cleanup group norm Cifar experiment ( #240 )
...
The group normalization CIFAR experiment was not written in the same format as the earlier experiments and was longer than necessary. With these modifications, the code follows the standard format and is shorter.
2024-03-02 14:17:39 +05:30