mirror of
https://github.com/labmlai/annotated_deep_learning_paper_implementations.git
synced 2025-08-06 15:22:21 +08:00
zh
@ -1,10 +1,10 @@
{
"<h1>Generalized Advantage Estimation (GAE)</h1>\n<p>This is a <a href=\"https://pytorch.org\">PyTorch</a> implementation of paper <a href=\"https://arxiv.org/abs/1506.02438\">Generalized Advantage Estimation</a>.</p>\n<p>You can find an experiment that uses it <a href=\"experiment.html\">here</a>.</p>\n": "<h1>\u5e7f\u4e49\u4f18\u52bf\u4f30\u8ba1 (GAE)</h1>\n<p>\u8fd9\u662f\u8bba\u6587<a href=\"https://arxiv.org/abs/1506.02438\">\u5e7f\u4e49\u4f18\u52bf\u4f30\u8ba1</a>\u7684 <a href=\"https://pytorch.org\">PyTorch</a> \u5b9e\u73b0\u3002</p>\n<p>\u4f60\u53ef\u4ee5<a href=\"experiment.html\">\u5728\u8fd9\u91cc</a>\u627e\u5230\u4e00\u4e2a\u4f7f\u7528\u5b83\u7684\u5b9e\u9a8c\u3002</p>\n",
"<h3>Calculate advantages</h3>\n<span translate=no>_^_0_^_</span><p><span translate=no>_^_1_^_</span> is high bias, low variance, whilst <span translate=no>_^_2_^_</span> is unbiased, high variance.</p>\n<p>We take a weighted average of <span translate=no>_^_3_^_</span> to balance bias and variance. This is called Generalized Advantage Estimation. <span translate=no>_^_4_^_</span> We set <span translate=no>_^_5_^_</span>, this gives clean calculation for <span translate=no>_^_6_^_</span></p>\n<span translate=no>_^_7_^_</span>": "<h3>\u8ba1\u7b97\u4f18\u52bf</h3>\n<span translate=no>_^_0_^_</span><p><span translate=no>_^_1_^_</span>\u662f\u9ad8\u504f\u5dee\uff0c\u4f4e\u65b9\u5dee\uff0c\u800c<span translate=no>_^_2_^_</span>\u65e0\u504f\u5dee\uff0c\u9ad8\u65b9\u5dee\u3002</p>\n<p>\u6211\u4eec\u91c7\u7528\u52a0\u6743\u5e73\u5747\u503c<span translate=no>_^_3_^_</span>\u6765\u5e73\u8861\u504f\u5dee\u548c\u65b9\u5dee\u3002\u8fd9\u79f0\u4e3a\u5e7f\u4e49\u4f18\u52bf\u4f30\u8ba1\u3002<span translate=no>_^_4_^_</span>\u6211\u4eec\u8bbe\u7f6e<span translate=no>_^_5_^_</span>\uff0c\u8fd9\u7ed9\u51fa\u4e86\u5e72\u51c0\u7684\u8ba1\u7b97<span translate=no>_^_6_^_</span></p>\n<span translate=no>_^_7_^_</span>",
"<p> </p>\n": "<p> </p>\n",
"<p><span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span></p>\n",
"<p>advantages table </p>\n": "<p>\u4f18\u52bf\u8868</p>\n",
"<p>mask if episode completed after step <span translate=no>_^_0_^_</span> </p>\n": "<p>\u5982\u679c\u5267\u96c6\u5728\u6b65\u9aa4\u4e4b\u540e\u5b8c\u6210\uff0c\u8bf7\u63a9\u76d6<span translate=no>_^_0_^_</span></p>\n",
"<p>note that we are collecting in reverse order. <em>My initial code was appending to a list and I forgot to reverse it later. It took me around 4 to 5 hours to find the bug. The performance of the model was improving slightly during initial runs, probably because the samples are similar.</em> </p>\n": "<p>\u8bf7\u6ce8\u610f\uff0c\u6211\u4eec\u6b63\u5728\u6309\u76f8\u53cd\u7684\u987a\u5e8f\u6536\u96c6\u3002<em>\u6211\u6700\u521d\u7684\u4ee3\u7801\u88ab\u8ffd\u52a0\u5230\u4e00\u4e2a\u5217\u8868\u4e2d\uff0c\u540e\u6765\u6211\u5fd8\u8bb0\u53cd\u8f6c\u5b83\u4e86\u3002\u6211\u82b1\u4e86\u5927\u7ea6 4 \u5230 5 \u4e2a\u5c0f\u65f6\u624d\u53d1\u73b0 bug\u3002\u5728\u521d\u59cb\u8fd0\u884c\u671f\u95f4\uff0c\u8be5\u6a21\u578b\u7684\u6027\u80fd\u7565\u6709\u6539\u5584\uff0c\u8fd9\u53ef\u80fd\u662f\u56e0\u4e3a\u6837\u672c\u76f8\u4f3c\u3002</em></p>\n",
"A PyTorch implementation/tutorial of Generalized Advantage Estimation (GAE).": "\u5e7f\u4e49\u4f18\u52bf\u4f30\u8ba1\uff08GAE\uff09\u7684 PyTorch \u5b9e\u73b0/\u6559\u7a0b\u3002",
"Generalized Advantage Estimation (GAE)": "\u5e7f\u4e49\u4f18\u52bf\u4f30\u8ba1 (GAE)"
}
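
The English source strings above describe an advantages table that is filled in reverse order, with a mask applied when an episode completes after step t. As a rough illustration only, that computation might be sketched as follows. This is not the repository's PyTorch code: it uses plain Python lists, and the function name `gae_advantages`, its parameters, and the default values of `gamma` and `lam` are assumptions made for this example.

```python
def gae_advantages(rewards, values, last_value, done, gamma=0.99, lam=0.95):
    """Hypothetical helper: GAE advantages for a single trajectory.

    rewards[t], values[t], done[t] describe step t; last_value is the value
    estimate for the state reached after the final step. All names and
    defaults here are illustrative assumptions.
    """
    # advantages table, indexed by time step
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value

    # Walk backwards and write by index, so no reversal is needed afterwards
    # (appending to a list here would leave the results in reverse order).
    for t in reversed(range(len(rewards))):
        # mask if episode completed after step t
        mask = 0.0 if done[t] else 1.0
        # one-step TD residual
        delta = rewards[t] + gamma * next_value * mask - values[t]
        # exponentially weighted sum of later residuals
        running = delta + gamma * lam * mask * running
        advantages[t] = running
        next_value = values[t]
    return advantages


# With gamma = lam = 1 and zero value estimates, each advantage is simply
# the sum of the remaining rewards.
print(gae_advantages([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], 0.0,
                     [False, False, False], gamma=1.0, lam=1.0))
# → [3.0, 2.0, 1.0]
```

Writing into a preallocated table by index avoids the forgotten-reversal bug mentioned in the source note.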