mirror of
https://github.com/labmlai/annotated_deep_learning_paper_implementations.git
synced 2025-08-16 10:51:23 +08:00
paper url fix
@@ -1,5 +1,5 @@
{
"<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://papers.labml.ai/paper/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here's an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</h1>\n<p>\u3053\u308c\u306f\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306e\u5b9f\u88c5\u3067\u3001\u8ad6\u6587\u300c<a href=\"https://papers.labml.ai/paper/1904.09751\">\u795e\u7d4c\u30c6\u30ad\u30b9\u30c8\u5909\u6027\u306e\u5947\u5999\u306a\u4e8b\u4f8b</a>\u300d\u3067\u7d39\u4ecb\u3055\u308c\u3066\u3044\u307e\u3059\u3002</p>\n<p>\u3053\u306e\u8ad6\u6587\u3067\u306f\u3001\u30d3\u30fc\u30e0\u30b5\u30fc\u30c1\u3001<a href=\"temperature.html\">\u30d4\u30e5\u30a2\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3001\u6e29\u5ea6\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a><a href=\"temperature.html\">\u3001<a 
href=\"top_k.html\">TOP-K\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306a\u3069\u306e\u4ed6\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u65b9\u6cd5\u306e\u554f\u984c\u306b\u3064\u3044\u3066\u8aac\u660e\u3057\u307e\u3059</a></a>\u3002\u3053\u306e\u8ad6\u6587\u3067\u306f\u3001\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306e\u30a2\u30a4\u30c7\u30a2\u3092\u7d39\u4ecb\u3057\u3066\u3044\u307e\u3059\u3002\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306f\u3001\u30c6\u30ad\u30b9\u30c8\u751f\u6210\u306b\u304a\u3044\u3066\u4ed6\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u65b9\u6cd5\u3088\u308a\u3082\u5b9f\u8cea\u7684\u306b\u512a\u308c\u3066\u3044\u307e\u3059</p>\u3002\n<p>Nucleus \u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3067\u306f\u3001\u6700\u521d\u306b\u30dc\u30ad\u30e3\u30d6\u30e9\u30ea\u306e\u30b5\u30d6\u30bb\u30c3\u30c8\u3092\u9078\u629e\u3057\u307e\u3059\u3002\u3053\u3053\u3067<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u306f\u6b21\u306e\u3088\u3046\u306a\u30c8\u30fc\u30af\u30f3\u306e\u6700\u5c0f\u30bb\u30c3\u30c8\u3092\u9078\u629e\u3057\u307e\u3059\u3002</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u3064\u307e\u308a\u3001\u78ba\u7387\u306e\u5408\u8a08\u304c\u305d\u308c\u3088\u308a\u5c0f\u3055\u304f\u306a\u308b\u307e\u3067\u3001\u6700\u3082\u53ef\u80fd\u6027\u306e\u9ad8\u3044\u30c8\u30fc\u30af\u30f3\u3092\u9078\u629e\u3057\u307e\u3059\u3002<span translate=no>_^_3_^_</span></p>\n<p>\u6b21\u306b\u3001\u9078\u629e\u3057\u305f\u30c8\u30fc\u30af\u30f3\u304b\u3089\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3057\u307e\u3059\u3002</p>\n<p>\u3053\u308c\u306f\u3001<a href=\"experiment.html\">\u3053\u308c\u3089\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u3092\u4f7f\u7528\u3057\u305f\u5b9f\u9a13\u3067\u3059</a>\u3002</p>\n",
"<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://arxiv.org/abs/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here's an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</h1>\n<p>\u3053\u308c\u306f\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306e\u5b9f\u88c5\u3067\u3001\u8ad6\u6587\u300c<a href=\"https://arxiv.org/abs/1904.09751\">\u795e\u7d4c\u30c6\u30ad\u30b9\u30c8\u5909\u6027\u306e\u5947\u5999\u306a\u4e8b\u4f8b</a>\u300d\u3067\u7d39\u4ecb\u3055\u308c\u3066\u3044\u307e\u3059\u3002</p>\n<p>\u3053\u306e\u8ad6\u6587\u3067\u306f\u3001\u30d3\u30fc\u30e0\u30b5\u30fc\u30c1\u3001<a href=\"temperature.html\">\u30d4\u30e5\u30a2\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3001\u6e29\u5ea6\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</a><a href=\"temperature.html\">\u3001<a 
href=\"top_k.html\">TOP-K\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306a\u3069\u306e\u4ed6\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u65b9\u6cd5\u306e\u554f\u984c\u306b\u3064\u3044\u3066\u8aac\u660e\u3057\u307e\u3059</a></a>\u3002\u3053\u306e\u8ad6\u6587\u3067\u306f\u3001\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306e\u30a2\u30a4\u30c7\u30a2\u3092\u7d39\u4ecb\u3057\u3066\u3044\u307e\u3059\u3002\u6838\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306f\u3001\u30c6\u30ad\u30b9\u30c8\u751f\u6210\u306b\u304a\u3044\u3066\u4ed6\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u65b9\u6cd5\u3088\u308a\u3082\u5b9f\u8cea\u7684\u306b\u512a\u308c\u3066\u3044\u307e\u3059</p>\u3002\n<p>Nucleus \u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3067\u306f\u3001\u6700\u521d\u306b\u30dc\u30ad\u30e3\u30d6\u30e9\u30ea\u306e\u30b5\u30d6\u30bb\u30c3\u30c8\u3092\u9078\u629e\u3057\u307e\u3059\u3002\u3053\u3053\u3067<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u306f\u6b21\u306e\u3088\u3046\u306a\u30c8\u30fc\u30af\u30f3\u306e\u6700\u5c0f\u30bb\u30c3\u30c8\u3092\u9078\u629e\u3057\u307e\u3059\u3002</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u3064\u307e\u308a\u3001\u78ba\u7387\u306e\u5408\u8a08\u304c\u305d\u308c\u3088\u308a\u5c0f\u3055\u304f\u306a\u308b\u307e\u3067\u3001\u6700\u3082\u53ef\u80fd\u6027\u306e\u9ad8\u3044\u30c8\u30fc\u30af\u30f3\u3092\u9078\u629e\u3057\u307e\u3059\u3002<span translate=no>_^_3_^_</span></p>\n<p>\u6b21\u306b\u3001\u9078\u629e\u3057\u305f\u30c8\u30fc\u30af\u30f3\u304b\u3089\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u3057\u307e\u3059\u3002</p>\n<p>\u3053\u308c\u306f\u3001<a href=\"experiment.html\">\u3053\u308c\u3089\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u624b\u6cd5\u3092\u4f7f\u7528\u3057\u305f\u5b9f\u9a13\u3067\u3059</a>\u3002</p>\n",
"<h2>Nucleus Sampler</h2>\n": "<h2>\u6838\u30b5\u30f3\u30d7\u30e9\u30fc</h2>\n",
"<p> </p>\n": "<p></p>\n",
"<p> Sample from logits with Nucleus Sampling</p>\n": "<p>Nucleus \u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u306b\u3088\u308b\u30ed\u30b8\u30c3\u30c8\u304b\u3089\u306e\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0</p>\n",
@@ -1,5 +1,5 @@
{
"<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://papers.labml.ai/paper/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here's an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8</h1>\n<p>\u0db8\u0dd9\u0dba\u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8\u0dca \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf\u0dad\u0dca\u0db8\u0d9a \u0d9a\u0dd2\u0dbb\u0dd3\u0db8\u0d9a\u0dca \u0dc0\u0db1 \u0d85\u0dad\u0dbb \u0d91\u0dba \u0d9a\u0da9\u0daf\u0dcf\u0dc3\u0dd2 \u0dc0\u0dbd\u0dd2\u0db1\u0dca \u0dc4\u0db3\u0dd4\u0db1\u0dca\u0dc0\u0dcf <a href=\"https://papers.labml.ai/paper/1904.09751\">\u0daf\u0dd3 \u0d87\u0dad \u0dc3\u0dca\u0db1\u0dcf\u0dba\u0dd4 \u0db4\u0dd9\u0dc5 \u0db4\u0dbb\u0dd2\u0dc4\u0dcf\u0db1\u0dd2\u0dba \u0db4\u0dd2\u0dc5\u0dd2\u0db6\u0db3 \u0d9a\u0dd4\u0dad\u0dd4\u0dc4\u0dbd\u0dba</a>. 
</p>\n<p>\u0d9a\u0daf\u0db8\u0dca\u0db6\u0dc3\u0dd9\u0dc0\u0dd3\u0db8, <a href=\"temperature.html\">\u0db4\u0dd2\u0dbb\u0dd2\u0dc3\u0dd2\u0daf\u0dd4 \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8, \u0d8b\u0dc2\u0dca\u0dab\u0dad\u0dca\u0dc0 \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8 \u0dc3\u0dc4 <a href=\"top_k.html\">\u0d89\u0dc4\u0dc5 \u0d9a\u0dda</a>\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8</a><a href=\"temperature.html\">\u0dc0\u0dd0\u0db1\u0dd2 \u0dc0\u0dd9\u0db1\u0dad\u0dca \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2</a>\u0d9a\u0dca\u0dbb\u0db8\u0dc0\u0dbd \u0d87\u0dad\u0dd2 \u0d9c\u0dd0\u0da7\u0dc5\u0dd4 \u0db4\u0dd2\u0dc5\u0dd2\u0db6\u0db3\u0dc0 \u0d9a\u0da9\u0daf\u0dcf\u0dc3\u0dd2 \u0dc3\u0dcf\u0d9a\u0da0\u0dca\u0da1\u0dcf \u0d9a\u0dbb\u0dba\u0dd2. \u0db8\u0dd9\u0db8 \u0db4\u0dad\u0dca\u0dbb\u0dd2\u0d9a\u0dcf\u0dc0 \u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8\u0dca \u0db4\u0dd2\u0dc5\u0dd2\u0db6\u0db3 \u0d85\u0daf\u0dc4\u0dc3 \u0dc4\u0db3\u0dd4\u0db1\u0dca\u0dc0\u0dcf \u0daf\u0dd9\u0dba\u0dd2, \u0d91\u0dba \u0db4\u0dd9\u0dc5 \u0d8b\u0dad\u0dca\u0db4\u0dcf\u0daf\u0db1\u0dba \u0dc3\u0db3\u0dc4\u0dcf \u0dc0\u0dd9\u0db1\u0dad\u0dca \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2 \u0d9a\u0dca\u0dbb\u0db8\u0dc0\u0dbd\u0da7 \u0dc0\u0da9\u0dcf \u0db4\u0dca\u0dbb\u0dcf\u0dba\u0ddd\u0d9c\u0dd2\u0d9a\u0dc0 \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf \u0d9a\u0dbb\u0dba\u0dd2. 
</p>\n<p>\u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8 \u0db4\u0dc5\u0db8\u0dd4\u0dc0 \u0dc0\u0da0\u0db1 \u0db8\u0dcf\u0dbd\u0dcf\u0dc0\u0dda \u0d8b\u0db4 \u0d9a\u0dd4\u0dbd\u0d9a\u0dba\u0d9a\u0dca \u0dad\u0ddd\u0dbb\u0dcf \u0d9c\u0db1\u0dd3 <span translate=no>_^_0_^_</span>, \u0d91\u0dc4\u0dd2\u0daf\u0dd3 <span translate=no>_^_1_^_</span> \u0d9a\u0dd4\u0da9\u0dcf\u0db8 \u0da7\u0ddd\u0d9a\u0db1 \u0d9a\u0da7\u0dca\u0da7\u0dbd\u0dba\u0d9a\u0dca</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u0d91\u0db1\u0db8\u0dca, \u0d92\u0dc0\u0dcf\u0dba\u0dda \u0dc3\u0db8\u0dca\u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf\u0dc0\u0db1\u0dca\u0d9c\u0dda \u0d91\u0d9a\u0dad\u0dd4\u0dc0 \u0d85\u0da9\u0dd4 \u0dc0\u0db1 \u0dad\u0dd9\u0d9a\u0dca \u0d85\u0db4\u0dd2 \u0d89\u0dc4\u0dc5\u0db8 \u0dc3\u0db8\u0dca\u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf\u0dc0 \u0dc3\u0dc4\u0dd2\u0dad \u0da7\u0ddd\u0d9a\u0db1 \u0dad\u0ddd\u0dbb\u0dcf \u0d9c\u0db1\u0dd2\u0db8\u0dd4 <span translate=no>_^_3_^_</span>. </p>\n<p>\u0d89\u0db1\u0dca\u0db4\u0dc3\u0dd4\u0d85\u0db4\u0dd2 \u0dad\u0ddd\u0dbb\u0dcf\u0d9c\u0dad\u0dca \u0da7\u0ddd\u0d9a\u0db1 \u0dc0\u0dbd\u0dd2\u0db1\u0dca \u0dc3\u0dcf\u0db8\u0dca\u0db4\u0dbd \u0dbd\u0db6\u0dcf \u0d9c\u0db1\u0dd2\u0db8\u0dd4. </p>\n<p>\u0db8\u0dd9\u0db1\u0dca\u0db1\u0db8\u0dd9\u0db8 \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2 \u0dc1\u0dd2\u0dbd\u0dca\u0db4\u0dd3\u0dba \u0d9a\u0dca\u0dbb\u0db8 \u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf \u0d9a\u0dbb\u0db1 <a href=\"experiment.html\">\u0d85\u0dad\u0dca\u0dc4\u0daf\u0dcf \u0db6\u0dd0\u0dbd\u0dd3\u0db8\u0d9a\u0dca</a> . </p>\n",
"<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://arxiv.org/abs/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here's an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8</h1>\n<p>\u0db8\u0dd9\u0dba\u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8\u0dca \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf\u0dad\u0dca\u0db8\u0d9a \u0d9a\u0dd2\u0dbb\u0dd3\u0db8\u0d9a\u0dca \u0dc0\u0db1 \u0d85\u0dad\u0dbb \u0d91\u0dba \u0d9a\u0da9\u0daf\u0dcf\u0dc3\u0dd2 \u0dc0\u0dbd\u0dd2\u0db1\u0dca \u0dc4\u0db3\u0dd4\u0db1\u0dca\u0dc0\u0dcf <a href=\"https://arxiv.org/abs/1904.09751\">\u0daf\u0dd3 \u0d87\u0dad \u0dc3\u0dca\u0db1\u0dcf\u0dba\u0dd4 \u0db4\u0dd9\u0dc5 \u0db4\u0dbb\u0dd2\u0dc4\u0dcf\u0db1\u0dd2\u0dba \u0db4\u0dd2\u0dc5\u0dd2\u0db6\u0db3 \u0d9a\u0dd4\u0dad\u0dd4\u0dc4\u0dbd\u0dba</a>. 
</p>\n<p>\u0d9a\u0daf\u0db8\u0dca\u0db6\u0dc3\u0dd9\u0dc0\u0dd3\u0db8, <a href=\"temperature.html\">\u0db4\u0dd2\u0dbb\u0dd2\u0dc3\u0dd2\u0daf\u0dd4 \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8, \u0d8b\u0dc2\u0dca\u0dab\u0dad\u0dca\u0dc0 \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8 \u0dc3\u0dc4 <a href=\"top_k.html\">\u0d89\u0dc4\u0dc5 \u0d9a\u0dda</a>\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8</a><a href=\"temperature.html\">\u0dc0\u0dd0\u0db1\u0dd2 \u0dc0\u0dd9\u0db1\u0dad\u0dca \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2</a>\u0d9a\u0dca\u0dbb\u0db8\u0dc0\u0dbd \u0d87\u0dad\u0dd2 \u0d9c\u0dd0\u0da7\u0dc5\u0dd4 \u0db4\u0dd2\u0dc5\u0dd2\u0db6\u0db3\u0dc0 \u0d9a\u0da9\u0daf\u0dcf\u0dc3\u0dd2 \u0dc3\u0dcf\u0d9a\u0da0\u0dca\u0da1\u0dcf \u0d9a\u0dbb\u0dba\u0dd2. \u0db8\u0dd9\u0db8 \u0db4\u0dad\u0dca\u0dbb\u0dd2\u0d9a\u0dcf\u0dc0 \u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8\u0dca \u0db4\u0dd2\u0dc5\u0dd2\u0db6\u0db3 \u0d85\u0daf\u0dc4\u0dc3 \u0dc4\u0db3\u0dd4\u0db1\u0dca\u0dc0\u0dcf \u0daf\u0dd9\u0dba\u0dd2, \u0d91\u0dba \u0db4\u0dd9\u0dc5 \u0d8b\u0dad\u0dca\u0db4\u0dcf\u0daf\u0db1\u0dba \u0dc3\u0db3\u0dc4\u0dcf \u0dc0\u0dd9\u0db1\u0dad\u0dca \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2 \u0d9a\u0dca\u0dbb\u0db8\u0dc0\u0dbd\u0da7 \u0dc0\u0da9\u0dcf \u0db4\u0dca\u0dbb\u0dcf\u0dba\u0ddd\u0d9c\u0dd2\u0d9a\u0dc0 \u0d9a\u0dca\u0dbb\u0dd2\u0dba\u0dcf \u0d9a\u0dbb\u0dba\u0dd2. 
</p>\n<p>\u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8 \u0db4\u0dc5\u0db8\u0dd4\u0dc0 \u0dc0\u0da0\u0db1 \u0db8\u0dcf\u0dbd\u0dcf\u0dc0\u0dda \u0d8b\u0db4 \u0d9a\u0dd4\u0dbd\u0d9a\u0dba\u0d9a\u0dca \u0dad\u0ddd\u0dbb\u0dcf \u0d9c\u0db1\u0dd3 <span translate=no>_^_0_^_</span>, \u0d91\u0dc4\u0dd2\u0daf\u0dd3 <span translate=no>_^_1_^_</span> \u0d9a\u0dd4\u0da9\u0dcf\u0db8 \u0da7\u0ddd\u0d9a\u0db1 \u0d9a\u0da7\u0dca\u0da7\u0dbd\u0dba\u0d9a\u0dca</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u0d91\u0db1\u0db8\u0dca, \u0d92\u0dc0\u0dcf\u0dba\u0dda \u0dc3\u0db8\u0dca\u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf\u0dc0\u0db1\u0dca\u0d9c\u0dda \u0d91\u0d9a\u0dad\u0dd4\u0dc0 \u0d85\u0da9\u0dd4 \u0dc0\u0db1 \u0dad\u0dd9\u0d9a\u0dca \u0d85\u0db4\u0dd2 \u0d89\u0dc4\u0dc5\u0db8 \u0dc3\u0db8\u0dca\u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf\u0dc0 \u0dc3\u0dc4\u0dd2\u0dad \u0da7\u0ddd\u0d9a\u0db1 \u0dad\u0ddd\u0dbb\u0dcf \u0d9c\u0db1\u0dd2\u0db8\u0dd4 <span translate=no>_^_3_^_</span>. </p>\n<p>\u0d89\u0db1\u0dca\u0db4\u0dc3\u0dd4\u0d85\u0db4\u0dd2 \u0dad\u0ddd\u0dbb\u0dcf\u0d9c\u0dad\u0dca \u0da7\u0ddd\u0d9a\u0db1 \u0dc0\u0dbd\u0dd2\u0db1\u0dca \u0dc3\u0dcf\u0db8\u0dca\u0db4\u0dbd \u0dbd\u0db6\u0dcf \u0d9c\u0db1\u0dd2\u0db8\u0dd4. </p>\n<p>\u0db8\u0dd9\u0db1\u0dca\u0db1\u0db8\u0dd9\u0db8 \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2 \u0dc1\u0dd2\u0dbd\u0dca\u0db4\u0dd3\u0dba \u0d9a\u0dca\u0dbb\u0db8 \u0db7\u0dcf\u0dc0\u0dd2\u0dad\u0dcf \u0d9a\u0dbb\u0db1 <a href=\"experiment.html\">\u0d85\u0dad\u0dca\u0dc4\u0daf\u0dcf \u0db6\u0dd0\u0dbd\u0dd3\u0db8\u0d9a\u0dca</a> . </p>\n",
"<h2>Nucleus Sampler</h2>\n": "<h2>\u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2</h2>\n",
"<p> </p>\n": "<p> </p>\n",
"<p> Sample from logits with Nucleus Sampling</p>\n": "<p> \u0db1\u0dca\u0dba\u0dc2\u0dca\u0da7\u0dd2\u0d9a\u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd3\u0db8 \u0dc3\u0db8\u0d9f \u0db4\u0dd2\u0dc0\u0dd2\u0dc3\u0dd4\u0db8\u0dca \u0dc0\u0dbd\u0dd2\u0db1\u0dca \u0db1\u0dd2\u0dba\u0dd0\u0daf\u0dd2\u0dba</p>\n",
@@ -1,5 +1,5 @@
{
"<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://papers.labml.ai/paper/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here's an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u539f\u5b50\u6838\u91c7\u6837</h1>\n<p>\u8fd9\u662f\u539f\u5b50\u6838\u91c7\u6837\u7684\u4e00\u79cd\u5b9e\u73b0\uff0c\u5728\u8bba\u6587<a href=\"https://papers.labml.ai/paper/1904.09751\">\u300a\u795e\u7ecf\u6587\u672c\u53d8\u6027\u7684\u597d\u5947\u6848\u4f8b\u300b</a>\u4e2d\u8fdb\u884c\u4e86\u4ecb\u7ecd\u3002</p>\n<p>\u672c\u6587\u8ba8\u8bba\u4e86\u5176\u4ed6\u91c7\u6837\u65b9\u6cd5\uff08\u4f8b\u5982\u5149\u675f\u641c\u7d22\u3001<a href=\"temperature.html\">\u7eaf\u91c7\u6837\u3001<a href=\"temperature.html\">\u6e29\u5ea6</a>\u91c7\u6837</a>\u548cT <a href=\"top_k.html\">op-K\u91c7\u6837</a>\uff09\u5b58\u5728\u7684\u95ee\u9898\u3002\u672c\u6587\u4ecb\u7ecd\u4e86\u539f\u5b50\u6838\u91c7\u6837\u7684\u6982\u5ff5\uff0c\u5728\u6587\u672c\u751f\u6210\u65b9\u9762\uff0c\u6838\u91c7\u6837\u7684\u6548\u679c\u5b9e\u9645\u4e0a\u6bd4\u5176\u4ed6\u91c7\u6837\u65b9\u6cd5\u8981\u597d\u3002</p>\n<p>Nucleus 
\u91c7\u6837\u9996\u5148\u9009\u62e9\u8bcd\u6c47\u7684\u4e00\u4e2a\u5b50\u96c6<span translate=no>_^_0_^_</span>\uff0c\u5176\u4e2d<span translate=no>_^_1_^_</span>\u662f\u6700\u5c0f\u7684\u4ee4\u724c\u96c6\u5408</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u4e5f\u5c31\u662f\u8bf4\uff0c\u6211\u4eec\u9009\u62e9\u53ef\u80fd\u6027\u6700\u9ad8\u7684\u4ee3\u5e01\uff0c\u76f4\u5230\u5b83\u4eec\u7684\u6982\u7387\u603b\u548c\u5c0f\u4e8e\u8be5\u503c\u4e3a\u6b62<span translate=no>_^_3_^_</span>\u3002</p>\n<p>\u7136\u540e\u6211\u4eec\u4ece\u9009\u5b9a\u7684\u4ee4\u724c\u4e2d\u62bd\u6837\u3002</p>\n<p>\u8fd9\u662f\u4e00\u4e2a\u4f7f\u7528\u8fd9\u4e9b\u91c7\u6837\u6280\u672f\u7684<a href=\"experiment.html\">\u5b9e\u9a8c</a>\u3002</p>\n",
"<h1>Nucleus Sampling</h1>\n<p>This is an implementation of nucleus sampling, introduced in the paper <a href=\"https://arxiv.org/abs/1904.09751\">The Curious Case of Neural Text Degeneration</a>.</p>\n<p>The paper discusses the problems with other sampling methods such as Beam Search, <a href=\"temperature.html\">Pure sampling</a>, <a href=\"temperature.html\">Temperature sampling</a>, and <a href=\"top_k.html\">Top-k sampling</a>. The paper introduces the idea of nucleus sampling, which practically performs better than other sampling methods for text generation.</p>\n<p>Nucleus sampling first picks a subset of the vocabulary <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is smallest set of tokens such that</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>That is, we pick the highest probable tokens until the sum of their probabilities is less that <span translate=no>_^_3_^_</span>.</p>\n<p>Then we sample from the selected tokens.</p>\n<p>Here's an <a href=\"experiment.html\">experiment</a> that uses these sampling techniques.</p>\n": "<h1>\u539f\u5b50\u6838\u91c7\u6837</h1>\n<p>\u8fd9\u662f\u539f\u5b50\u6838\u91c7\u6837\u7684\u4e00\u79cd\u5b9e\u73b0\uff0c\u5728\u8bba\u6587<a href=\"https://arxiv.org/abs/1904.09751\">\u300a\u795e\u7ecf\u6587\u672c\u53d8\u6027\u7684\u597d\u5947\u6848\u4f8b\u300b</a>\u4e2d\u8fdb\u884c\u4e86\u4ecb\u7ecd\u3002</p>\n<p>\u672c\u6587\u8ba8\u8bba\u4e86\u5176\u4ed6\u91c7\u6837\u65b9\u6cd5\uff08\u4f8b\u5982\u5149\u675f\u641c\u7d22\u3001<a href=\"temperature.html\">\u7eaf\u91c7\u6837\u3001<a href=\"temperature.html\">\u6e29\u5ea6</a>\u91c7\u6837</a>\u548cT <a href=\"top_k.html\">op-K\u91c7\u6837</a>\uff09\u5b58\u5728\u7684\u95ee\u9898\u3002\u672c\u6587\u4ecb\u7ecd\u4e86\u539f\u5b50\u6838\u91c7\u6837\u7684\u6982\u5ff5\uff0c\u5728\u6587\u672c\u751f\u6210\u65b9\u9762\uff0c\u6838\u91c7\u6837\u7684\u6548\u679c\u5b9e\u9645\u4e0a\u6bd4\u5176\u4ed6\u91c7\u6837\u65b9\u6cd5\u8981\u597d\u3002</p>\n<p>Nucleus 
\u91c7\u6837\u9996\u5148\u9009\u62e9\u8bcd\u6c47\u7684\u4e00\u4e2a\u5b50\u96c6<span translate=no>_^_0_^_</span>\uff0c\u5176\u4e2d<span translate=no>_^_1_^_</span>\u662f\u6700\u5c0f\u7684\u4ee4\u724c\u96c6\u5408</p>\n<p><span translate=no>_^_2_^_</span></p>\n<p>\u4e5f\u5c31\u662f\u8bf4\uff0c\u6211\u4eec\u9009\u62e9\u53ef\u80fd\u6027\u6700\u9ad8\u7684\u4ee3\u5e01\uff0c\u76f4\u5230\u5b83\u4eec\u7684\u6982\u7387\u603b\u548c\u5c0f\u4e8e\u8be5\u503c\u4e3a\u6b62<span translate=no>_^_3_^_</span>\u3002</p>\n<p>\u7136\u540e\u6211\u4eec\u4ece\u9009\u5b9a\u7684\u4ee4\u724c\u4e2d\u62bd\u6837\u3002</p>\n<p>\u8fd9\u662f\u4e00\u4e2a\u4f7f\u7528\u8fd9\u4e9b\u91c7\u6837\u6280\u672f\u7684<a href=\"experiment.html\">\u5b9e\u9a8c</a>\u3002</p>\n",
"<h2>Nucleus Sampler</h2>\n": "<h2>Nucleus \u91c7\u6837\u5668</h2>\n",
"<p> </p>\n": "<p></p>\n",
"<p> Sample from logits with Nucleus Sampling</p>\n": "<p>\u4f7f\u7528 Nucleus \u91c7\u6837\u4ece logits \u4e2d\u63d0\u53d6\u6837\u672c</p>\n",
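
For readers of this diff, the procedure described in the English source strings (pick the most probable tokens, then sample among them) might be sketched as follows. This is a minimal pure-Python illustration and not the repository's implementation. The function names `softmax`, `nucleus_indices`, and `nucleus_sample` are hypothetical. Because the set-defining formula appears only as a placeholder (`_^_2_^_`) in the strings, the sketch assumes the condition from the cited paper: the nucleus is the smallest set of tokens whose probabilities sum to at least `p`.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_indices(probs, p):
    """Indices of the smallest set of tokens whose total probability is >= p.

    Assumption: the threshold test is `>=`, following the cited paper.
    """
    # Visit tokens from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return chosen


def nucleus_sample(logits, p, rng=random):
    """Sample one token index from the nucleus of the softmax distribution."""
    probs = softmax(logits)
    chosen = nucleus_indices(probs, p)
    # random.choices treats weights as relative, so no renormalization is needed.
    weights = [probs[i] for i in chosen]
    return rng.choices(chosen, weights=weights, k=1)[0]
```

With probabilities `[0.5, 0.3, 0.15, 0.05]` and `p = 0.7`, the nucleus holds the first two tokens, so only those two can ever be drawn.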