paper url fix

Varuna Jayasiri
2024-06-21 19:01:16 +05:30
parent 09d09379c2
commit f00ba4a70f
318 changed files with 378 additions and 378 deletions
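The change itself is mechanical: every `https://papers.labml.ai/paper/<id>` link is pointed at the corresponding `https://arxiv.org/abs/<id>` page, one removed and one added line per file. A minimal sketch of how such a sweep could be scripted, assuming a simple regex rewrite over the repository's text files (the `fix_paper_urls` helper and the suffix filter are illustrative assumptions, not the tooling actually used):

```python
import re
from pathlib import Path

# papers.labml.ai paper IDs are arXiv IDs, so the links map one-to-one.
PATTERN = re.compile(r"https://papers\.labml\.ai/paper/(\d{4}\.\d{4,5})")


def fix_paper_urls(root: str) -> None:
    """Rewrite papers.labml.ai paper links to arxiv.org across a source tree."""
    for path in Path(root).rglob("*"):
        # Hypothetical filter: only the text formats this repository diffs.
        if not path.is_file() or path.suffix not in {".py", ".html", ".json", ".md"}:
            continue
        text = path.read_text(encoding="utf-8")
        fixed = PATTERN.sub(r"https://arxiv.org/abs/\1", text)
        if fixed != text:
            path.write_text(fixed, encoding="utf-8")


if __name__ == "__main__":
    fix_paper_urls(".")
```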


@@ -1,5 +1,5 @@
{
"<h1>k-Nearest Neighbor Language Models</h1>\n<p>This is a <a href=\"https://pytorch.org\">PyTorch</a> implementation of the paper <a href=\"https://papers.labml.ai/paper/1911.00172\">Generalization through Memorization: Nearest Neighbor Language Models</a>. It uses k-nearest neighbors to improve perplexity of autoregressive transformer models.</p>\n<p>An autoregressive language model estimates <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is the token at step <span translate=no>_^_2_^_</span> and <span translate=no>_^_3_^_</span> is the context, <span translate=no>_^_4_^_</span>.</p>\n<p>This paper, improves <span translate=no>_^_5_^_</span> using a k-nearest neighbor search on key-value pairs <span translate=no>_^_6_^_</span>, with search key <span translate=no>_^_7_^_</span>. Here <span translate=no>_^_8_^_</span> is an embedding of the context <span translate=no>_^_9_^_</span>. The paper (and this implementation) uses the <strong>input to the feed-forward layer of the final layer of the transformer</strong> as <span translate=no>_^_10_^_</span>.</p>\n<p>We use <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> to index <span translate=no>_^_11_^_</span>.</p>\n<h3>Implementation</h3>\n<p>So to run <span translate=no>_^_12_^_</span>NN-LM we need to:</p>\n<ul><li><a href=\"train_model.html\">Train a transformer model</a> </li>\n<li><a href=\"build_index.html\">Build an index</a> of <span translate=no>_^_13_^_</span> </li>\n<li><a href=\"eval_knn.html\">Evaluate kNN-ML</a> using <span translate=no>_^_14_^_</span>NN seach on <span translate=no>_^_15_^_</span> with <span translate=no>_^_16_^_</span></li></ul>\n<p>This experiment uses a small dataset so that we can run this without using up a few hundred giga-bytes of disk space for the index.</p>\n<p>The official implementation of <span translate=no>_^_17_^_</span>NN-LM can be found <a href=\"https://github.com/urvashik/knnlm\">here</a>.</p>\n": "<h1>K \u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</h1>\n<p>\u8fd9\u662f\u8bba\u6587\u300a<a href=\"https://papers.labml.ai/paper/1911.00172\">\u901a\u8fc7\u8bb0\u5fc6\u63a8\u5e7f\uff1a\u6700\u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</a>\u300b\u7684 <a href=\"https://pytorch.org\">PyTorch</a> \u5b9e\u73b0\u3002\u5b83\u4f7f\u7528 k \u6700\u8fd1\u90bb\u6765\u6539\u5584\u81ea\u56de\u5f52\u53d8\u538b\u5668\u6a21\u578b\u7684\u56f0\u60d1\u5ea6\u3002</p>\n<p>\u81ea\u56de\u5f52\u8bed\u8a00\u6a21\u578b\u4f30\u8ba1<span translate=no>_^_0_^_</span>\uff0c\u6b65\u9aa4\u4e2d\u7684\u6807\u8bb0\u5728\u54ea\u91cc<span translate=no>_^_1_^_</span><span translate=no>_^_2_^_</span>\uff0c<span translate=no>_^_3_^_</span>\u662f\u4e0a\u4e0b\u6587\uff0c<span translate=no>_^_4_^_</span>\u3002</p>\n<p>\u672c\u6587\u6539\u8fdb\u4e86<span translate=no>_^_5_^_</span>\u4f7f\u7528\u5e26\u641c\u7d22\u952e\u7684\u952e\u503c\u5bf9<span translate=no>_^_6_^_</span>\u4f7f\u7528 k \u6700\u8fd1\u90bb\u641c\u7d22\u7684\u529f\u80fd<span translate=no>_^_7_^_</span>\u3002<span translate=no>_^_8_^_</span>\u8fd9\u662f\u4e0a\u4e0b\u6587\u7684\u5d4c\u5165<span translate=no>_^_9_^_</span>\u3002\u672c\u6587\uff08\u4ee5\u53ca\u672c\u5b9e\u73b0\uff09\u4f7f\u7528<strong>\u53d8\u538b\u5668\u6700\u540e\u4e00\u5c42\u524d\u9988\u5c42\u7684\u8f93\u5165</strong>\u4f5c\u4e3a<span translate=no>_^_10_^_</span>\u3002</p>\n<p>\u6211\u4eec\u4f7f\u7528 <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> \u8fdb\u884c\u7d22\u5f15<span 
translate=no>_^_11_^_</span>\u3002</p>\n<h3>\u5b9e\u65bd</h3>\n<p>\u56e0\u6b64\uff0c\u8981\u8fd0\u884c<span translate=no>_^_12_^_</span> NN-LM\uff0c\u6211\u4eec\u9700\u8981\uff1a</p>\n<ul><li><a href=\"train_model.html\">\u8bad\u7ec3\u53d8\u538b\u5668\u6a21\u578b</a></li>\n<li><a href=\"build_index.html\">\u5efa\u7acb\u7d22\u5f15</a><span translate=no>_^_13_^_</span></li>\n<li>\u4f7f\u7528 <a href=\"eval_knn.html\">NN \u641c\u7d22\u6765\u8bc4\u4f30 k<span translate=no>_^_14_^_</span> nn-ML</a><span translate=no>_^_15_^_</span><span translate=no>_^_16_^_</span></li></ul>\n<p>\u8fd9\u4e2a\u5b9e\u9a8c\u4f7f\u7528\u4e86\u4e00\u4e2a\u5c0f\u6570\u636e\u96c6\uff0c\u8fd9\u6837\u6211\u4eec\u5c31\u53ef\u4ee5\u5728\u4e0d\u5360\u7528\u51e0\u767e\u5343\u5146\u5b57\u8282\u7684\u7d22\u5f15\u78c1\u76d8\u7a7a\u95f4\u7684\u60c5\u51b5\u4e0b\u8fd0\u884c\u5b83\u3002</p>\n<p><span translate=no>_^_17_^_</span>NN-LM \u7684\u5b98\u65b9\u5b9e\u73b0\u53ef\u4ee5<a href=\"https://github.com/urvashik/knnlm\">\u5728\u8fd9\u91cc</a>\u627e\u5230\u3002</p>\n",
"<h1>k-Nearest Neighbor Language Models</h1>\n<p>This is a <a href=\"https://pytorch.org\">PyTorch</a> implementation of the paper <a href=\"https://arxiv.org/abs/1911.00172\">Generalization through Memorization: Nearest Neighbor Language Models</a>. It uses k-nearest neighbors to improve perplexity of autoregressive transformer models.</p>\n<p>An autoregressive language model estimates <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is the token at step <span translate=no>_^_2_^_</span> and <span translate=no>_^_3_^_</span> is the context, <span translate=no>_^_4_^_</span>.</p>\n<p>This paper, improves <span translate=no>_^_5_^_</span> using a k-nearest neighbor search on key-value pairs <span translate=no>_^_6_^_</span>, with search key <span translate=no>_^_7_^_</span>. Here <span translate=no>_^_8_^_</span> is an embedding of the context <span translate=no>_^_9_^_</span>. The paper (and this implementation) uses the <strong>input to the feed-forward layer of the final layer of the transformer</strong> as <span translate=no>_^_10_^_</span>.</p>\n<p>We use <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> to index <span translate=no>_^_11_^_</span>.</p>\n<h3>Implementation</h3>\n<p>So to run <span translate=no>_^_12_^_</span>NN-LM we need to:</p>\n<ul><li><a href=\"train_model.html\">Train a transformer model</a> </li>\n<li><a href=\"build_index.html\">Build an index</a> of <span translate=no>_^_13_^_</span> </li>\n<li><a href=\"eval_knn.html\">Evaluate kNN-ML</a> using <span translate=no>_^_14_^_</span>NN seach on <span translate=no>_^_15_^_</span> with <span translate=no>_^_16_^_</span></li></ul>\n<p>This experiment uses a small dataset so that we can run this without using up a few hundred giga-bytes of disk space for the index.</p>\n<p>The official implementation of <span translate=no>_^_17_^_</span>NN-LM can be found <a href=\"https://github.com/urvashik/knnlm\">here</a>.</p>\n": "<h1>K \u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</h1>\n<p>\u8fd9\u662f\u8bba\u6587\u300a<a href=\"https://arxiv.org/abs/1911.00172\">\u901a\u8fc7\u8bb0\u5fc6\u63a8\u5e7f\uff1a\u6700\u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</a>\u300b\u7684 <a href=\"https://pytorch.org\">PyTorch</a> \u5b9e\u73b0\u3002\u5b83\u4f7f\u7528 k \u6700\u8fd1\u90bb\u6765\u6539\u5584\u81ea\u56de\u5f52\u53d8\u538b\u5668\u6a21\u578b\u7684\u56f0\u60d1\u5ea6\u3002</p>\n<p>\u81ea\u56de\u5f52\u8bed\u8a00\u6a21\u578b\u4f30\u8ba1<span translate=no>_^_0_^_</span>\uff0c\u6b65\u9aa4\u4e2d\u7684\u6807\u8bb0\u5728\u54ea\u91cc<span translate=no>_^_1_^_</span><span translate=no>_^_2_^_</span>\uff0c<span translate=no>_^_3_^_</span>\u662f\u4e0a\u4e0b\u6587\uff0c<span translate=no>_^_4_^_</span>\u3002</p>\n<p>\u672c\u6587\u6539\u8fdb\u4e86<span translate=no>_^_5_^_</span>\u4f7f\u7528\u5e26\u641c\u7d22\u952e\u7684\u952e\u503c\u5bf9<span translate=no>_^_6_^_</span>\u4f7f\u7528 k \u6700\u8fd1\u90bb\u641c\u7d22\u7684\u529f\u80fd<span translate=no>_^_7_^_</span>\u3002<span translate=no>_^_8_^_</span>\u8fd9\u662f\u4e0a\u4e0b\u6587\u7684\u5d4c\u5165<span translate=no>_^_9_^_</span>\u3002\u672c\u6587\uff08\u4ee5\u53ca\u672c\u5b9e\u73b0\uff09\u4f7f\u7528<strong>\u53d8\u538b\u5668\u6700\u540e\u4e00\u5c42\u524d\u9988\u5c42\u7684\u8f93\u5165</strong>\u4f5c\u4e3a<span translate=no>_^_10_^_</span>\u3002</p>\n<p>\u6211\u4eec\u4f7f\u7528 <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> \u8fdb\u884c\u7d22\u5f15<span translate=no>_^_11_^_</span>\u3002</p>\n<h3>\u5b9e\u65bd</h3>\n<p>\u56e0\u6b64\uff0c\u8981\u8fd0\u884c<span 
translate=no>_^_12_^_</span> NN-LM\uff0c\u6211\u4eec\u9700\u8981\uff1a</p>\n<ul><li><a href=\"train_model.html\">\u8bad\u7ec3\u53d8\u538b\u5668\u6a21\u578b</a></li>\n<li><a href=\"build_index.html\">\u5efa\u7acb\u7d22\u5f15</a><span translate=no>_^_13_^_</span></li>\n<li>\u4f7f\u7528 <a href=\"eval_knn.html\">NN \u641c\u7d22\u6765\u8bc4\u4f30 k<span translate=no>_^_14_^_</span> nn-ML</a><span translate=no>_^_15_^_</span><span translate=no>_^_16_^_</span></li></ul>\n<p>\u8fd9\u4e2a\u5b9e\u9a8c\u4f7f\u7528\u4e86\u4e00\u4e2a\u5c0f\u6570\u636e\u96c6\uff0c\u8fd9\u6837\u6211\u4eec\u5c31\u53ef\u4ee5\u5728\u4e0d\u5360\u7528\u51e0\u767e\u5343\u5146\u5b57\u8282\u7684\u7d22\u5f15\u78c1\u76d8\u7a7a\u95f4\u7684\u60c5\u51b5\u4e0b\u8fd0\u884c\u5b83\u3002</p>\n<p><span translate=no>_^_17_^_</span>NN-LM \u7684\u5b98\u65b9\u5b9e\u73b0\u53ef\u4ee5<a href=\"https://github.com/urvashik/knnlm\">\u5728\u8fd9\u91cc</a>\u627e\u5230\u3002</p>\n",
"This is a simple PyTorch implementation/tutorial of the paper Generalization through Memorization: Nearest Neighbor Language Models using FAISS. It runs a kNN model on the final transformer layer embeddings to improve the loss of transformer based language models. It's also great for domain adaptation without pre-training.": "\u8fd9\u662f\u8bba\u6587\u300a\u8bb0\u5fc6\u6cdb\u5316\uff1a\u4f7f\u7528FAISS\u7684\u6700\u8fd1\u90bb\u8bed\u8a00\u6a21\u578b\u300b\u7684\u7b80\u5355PyTorch\u5b9e\u73b0/\u6559\u7a0b\u3002\u5b83\u5728\u6700\u7ec8\u7684\u53d8\u538b\u5668\u5c42\u5d4c\u5165\u4e0a\u8fd0\u884ckNN\u6a21\u578b\uff0c\u4ee5\u6539\u5584\u57fa\u4e8e\u53d8\u538b\u5668\u7684\u8bed\u8a00\u6a21\u578b\u7684\u635f\u8017\u3002\u5b83\u4e5f\u975e\u5e38\u9002\u5408\u65e0\u9700\u9884\u5148\u8bad\u7ec3\u7684\u9886\u57df\u9002\u5e94\u3002",
"k-Nearest Neighbor Language Models": "K \u8fd1\u90bb\u8bed\u8a00\u6a21\u578b"
}
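For reference, the page being retranslated describes the kNN-LM mechanism itself: embed every training context as f(c_i), index those embeddings with FAISS alongside the token w_i that followed each one, and at evaluation time mix a nearest-neighbor distribution into the model's own prediction. Below is a minimal sketch of that idea; the dimensions, the random placeholder datastore, and the interpolation weight `lambda_` are illustrative assumptions, not the repository's code:

```python
import numpy as np
import faiss

d, n_vocab, k, lambda_ = 512, 10_000, 10, 0.25

# Datastore: one (key, value) pair per training token.
keys = np.random.rand(100_000, d).astype("float32")   # f(c_i), context embeddings
next_tokens = np.random.randint(0, n_vocab, 100_000)  # w_i, the tokens that followed

index = faiss.IndexFlatL2(d)  # exact L2 search; large datastores use approximate indexes
index.add(keys)


def knn_lm_probs(query: np.ndarray, p_model: np.ndarray) -> np.ndarray:
    """Interpolate the model's distribution with a kNN distribution."""
    dists, ids = index.search(query[None, :].astype("float32"), k)
    # Neighbors vote for their stored next token, weighted by a softmax
    # over negative distances (IndexFlatL2 returns squared L2 distances).
    weights = np.exp(-dists[0])
    weights /= weights.sum()
    p_knn = np.zeros(n_vocab)
    for w, tok in zip(weights, next_tokens[ids[0]]):
        p_knn[tok] += w
    return lambda_ * p_knn + (1 - lambda_) * p_model
```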