Mirror of https://github.com/labmlai/annotated_deep_learning_paper_implementations.git, synced 2025-11-02 21:40:15 +08:00.
paper url fix
File diff suppressed because one or more lines are too long
@@ -1,5 +1,5 @@
 {
-    "<h1>k-Nearest Neighbor Language Models</h1>\n<p>This is a <a href=\"https://pytorch.org\">PyTorch</a> implementation of the paper <a href=\"https://papers.labml.ai/paper/1911.00172\">Generalization through Memorization: Nearest Neighbor Language Models</a>. It uses k-nearest neighbors to improve perplexity of autoregressive transformer models.</p>\n<p>An autoregressive language model estimates <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is the token at step <span translate=no>_^_2_^_</span> and <span translate=no>_^_3_^_</span> is the context, <span translate=no>_^_4_^_</span>.</p>\n<p>This paper, improves <span translate=no>_^_5_^_</span> using a k-nearest neighbor search on key-value pairs <span translate=no>_^_6_^_</span>, with search key <span translate=no>_^_7_^_</span>. Here <span translate=no>_^_8_^_</span> is an embedding of the context <span translate=no>_^_9_^_</span>. The paper (and this implementation) uses the <strong>input to the feed-forward layer of the final layer of the transformer</strong> as <span translate=no>_^_10_^_</span>.</p>\n<p>We use <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> to index <span translate=no>_^_11_^_</span>.</p>\n<h3>Implementation</h3>\n<p>So to run <span translate=no>_^_12_^_</span>NN-LM we need to:</p>\n<ul><li><a href=\"train_model.html\">Train a transformer model</a> </li>\n<li><a href=\"build_index.html\">Build an index</a> of <span translate=no>_^_13_^_</span> </li>\n<li><a href=\"eval_knn.html\">Evaluate kNN-ML</a> using <span translate=no>_^_14_^_</span>NN seach on <span translate=no>_^_15_^_</span> with <span translate=no>_^_16_^_</span></li></ul>\n<p>This experiment uses a small dataset so that we can run this without using up a few hundred giga-bytes of disk space for the index.</p>\n<p>The official implementation of <span translate=no>_^_17_^_</span>NN-LM can be found <a href=\"https://github.com/urvashik/knnlm\">here</a>.</p>\n": "<h1>K \u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</h1>\n<p>\u8fd9\u662f\u8bba\u6587\u300a<a href=\"https://papers.labml.ai/paper/1911.00172\">\u901a\u8fc7\u8bb0\u5fc6\u63a8\u5e7f\uff1a\u6700\u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</a>\u300b\u7684 <a href=\"https://pytorch.org\">PyTorch</a> \u5b9e\u73b0\u3002\u5b83\u4f7f\u7528 k \u6700\u8fd1\u90bb\u6765\u6539\u5584\u81ea\u56de\u5f52\u53d8\u538b\u5668\u6a21\u578b\u7684\u56f0\u60d1\u5ea6\u3002</p>\n<p>\u81ea\u56de\u5f52\u8bed\u8a00\u6a21\u578b\u4f30\u8ba1<span translate=no>_^_0_^_</span>\uff0c\u6b65\u9aa4\u4e2d\u7684\u6807\u8bb0\u5728\u54ea\u91cc<span translate=no>_^_1_^_</span><span translate=no>_^_2_^_</span>\uff0c<span translate=no>_^_3_^_</span>\u662f\u4e0a\u4e0b\u6587\uff0c<span translate=no>_^_4_^_</span>\u3002</p>\n<p>\u672c\u6587\u6539\u8fdb\u4e86<span translate=no>_^_5_^_</span>\u4f7f\u7528\u5e26\u641c\u7d22\u952e\u7684\u952e\u503c\u5bf9<span translate=no>_^_6_^_</span>\u4f7f\u7528 k \u6700\u8fd1\u90bb\u641c\u7d22\u7684\u529f\u80fd<span translate=no>_^_7_^_</span>\u3002<span translate=no>_^_8_^_</span>\u8fd9\u662f\u4e0a\u4e0b\u6587\u7684\u5d4c\u5165<span translate=no>_^_9_^_</span>\u3002\u672c\u6587\uff08\u4ee5\u53ca\u672c\u5b9e\u73b0\uff09\u4f7f\u7528<strong>\u53d8\u538b\u5668\u6700\u540e\u4e00\u5c42\u524d\u9988\u5c42\u7684\u8f93\u5165</strong>\u4f5c\u4e3a<span translate=no>_^_10_^_</span>\u3002</p>\n<p>\u6211\u4eec\u4f7f\u7528 <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> \u8fdb\u884c\u7d22\u5f15<span translate=no>_^_11_^_</span>\u3002</p>\n<h3>\u5b9e\u65bd</h3>\n<p>\u56e0\u6b64\uff0c\u8981\u8fd0\u884c<span translate=no>_^_12_^_</span> NN-LM\uff0c\u6211\u4eec\u9700\u8981\uff1a</p>\n<ul><li><a href=\"train_model.html\">\u8bad\u7ec3\u53d8\u538b\u5668\u6a21\u578b</a></li>\n<li><a href=\"build_index.html\">\u5efa\u7acb\u7d22\u5f15</a><span translate=no>_^_13_^_</span></li>\n<li>\u4f7f\u7528 <a href=\"eval_knn.html\">NN \u641c\u7d22\u6765\u8bc4\u4f30 k<span translate=no>_^_14_^_</span> nn-ML</a><span translate=no>_^_15_^_</span><span translate=no>_^_16_^_</span></li></ul>\n<p>\u8fd9\u4e2a\u5b9e\u9a8c\u4f7f\u7528\u4e86\u4e00\u4e2a\u5c0f\u6570\u636e\u96c6\uff0c\u8fd9\u6837\u6211\u4eec\u5c31\u53ef\u4ee5\u5728\u4e0d\u5360\u7528\u51e0\u767e\u5343\u5146\u5b57\u8282\u7684\u7d22\u5f15\u78c1\u76d8\u7a7a\u95f4\u7684\u60c5\u51b5\u4e0b\u8fd0\u884c\u5b83\u3002</p>\n<p><span translate=no>_^_17_^_</span>NN-LM \u7684\u5b98\u65b9\u5b9e\u73b0\u53ef\u4ee5<a href=\"https://github.com/urvashik/knnlm\">\u5728\u8fd9\u91cc</a>\u627e\u5230\u3002</p>\n",
+    "<h1>k-Nearest Neighbor Language Models</h1>\n<p>This is a <a href=\"https://pytorch.org\">PyTorch</a> implementation of the paper <a href=\"https://arxiv.org/abs/1911.00172\">Generalization through Memorization: Nearest Neighbor Language Models</a>. It uses k-nearest neighbors to improve perplexity of autoregressive transformer models.</p>\n<p>An autoregressive language model estimates <span translate=no>_^_0_^_</span>, where <span translate=no>_^_1_^_</span> is the token at step <span translate=no>_^_2_^_</span> and <span translate=no>_^_3_^_</span> is the context, <span translate=no>_^_4_^_</span>.</p>\n<p>This paper, improves <span translate=no>_^_5_^_</span> using a k-nearest neighbor search on key-value pairs <span translate=no>_^_6_^_</span>, with search key <span translate=no>_^_7_^_</span>. Here <span translate=no>_^_8_^_</span> is an embedding of the context <span translate=no>_^_9_^_</span>. The paper (and this implementation) uses the <strong>input to the feed-forward layer of the final layer of the transformer</strong> as <span translate=no>_^_10_^_</span>.</p>\n<p>We use <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> to index <span translate=no>_^_11_^_</span>.</p>\n<h3>Implementation</h3>\n<p>So to run <span translate=no>_^_12_^_</span>NN-LM we need to:</p>\n<ul><li><a href=\"train_model.html\">Train a transformer model</a> </li>\n<li><a href=\"build_index.html\">Build an index</a> of <span translate=no>_^_13_^_</span> </li>\n<li><a href=\"eval_knn.html\">Evaluate kNN-ML</a> using <span translate=no>_^_14_^_</span>NN seach on <span translate=no>_^_15_^_</span> with <span translate=no>_^_16_^_</span></li></ul>\n<p>This experiment uses a small dataset so that we can run this without using up a few hundred giga-bytes of disk space for the index.</p>\n<p>The official implementation of <span translate=no>_^_17_^_</span>NN-LM can be found <a href=\"https://github.com/urvashik/knnlm\">here</a>.</p>\n": "<h1>K \u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</h1>\n<p>\u8fd9\u662f\u8bba\u6587\u300a<a href=\"https://arxiv.org/abs/1911.00172\">\u901a\u8fc7\u8bb0\u5fc6\u63a8\u5e7f\uff1a\u6700\u8fd1\u90bb\u8bed\u8a00\u6a21\u578b</a>\u300b\u7684 <a href=\"https://pytorch.org\">PyTorch</a> \u5b9e\u73b0\u3002\u5b83\u4f7f\u7528 k \u6700\u8fd1\u90bb\u6765\u6539\u5584\u81ea\u56de\u5f52\u53d8\u538b\u5668\u6a21\u578b\u7684\u56f0\u60d1\u5ea6\u3002</p>\n<p>\u81ea\u56de\u5f52\u8bed\u8a00\u6a21\u578b\u4f30\u8ba1<span translate=no>_^_0_^_</span>\uff0c\u6b65\u9aa4\u4e2d\u7684\u6807\u8bb0\u5728\u54ea\u91cc<span translate=no>_^_1_^_</span><span translate=no>_^_2_^_</span>\uff0c<span translate=no>_^_3_^_</span>\u662f\u4e0a\u4e0b\u6587\uff0c<span translate=no>_^_4_^_</span>\u3002</p>\n<p>\u672c\u6587\u6539\u8fdb\u4e86<span translate=no>_^_5_^_</span>\u4f7f\u7528\u5e26\u641c\u7d22\u952e\u7684\u952e\u503c\u5bf9<span translate=no>_^_6_^_</span>\u4f7f\u7528 k \u6700\u8fd1\u90bb\u641c\u7d22\u7684\u529f\u80fd<span translate=no>_^_7_^_</span>\u3002<span translate=no>_^_8_^_</span>\u8fd9\u662f\u4e0a\u4e0b\u6587\u7684\u5d4c\u5165<span translate=no>_^_9_^_</span>\u3002\u672c\u6587\uff08\u4ee5\u53ca\u672c\u5b9e\u73b0\uff09\u4f7f\u7528<strong>\u53d8\u538b\u5668\u6700\u540e\u4e00\u5c42\u524d\u9988\u5c42\u7684\u8f93\u5165</strong>\u4f5c\u4e3a<span translate=no>_^_10_^_</span>\u3002</p>\n<p>\u6211\u4eec\u4f7f\u7528 <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a> \u8fdb\u884c\u7d22\u5f15<span translate=no>_^_11_^_</span>\u3002</p>\n<h3>\u5b9e\u65bd</h3>\n<p>\u56e0\u6b64\uff0c\u8981\u8fd0\u884c<span translate=no>_^_12_^_</span> NN-LM\uff0c\u6211\u4eec\u9700\u8981\uff1a</p>\n<ul><li><a href=\"train_model.html\">\u8bad\u7ec3\u53d8\u538b\u5668\u6a21\u578b</a></li>\n<li><a href=\"build_index.html\">\u5efa\u7acb\u7d22\u5f15</a><span translate=no>_^_13_^_</span></li>\n<li>\u4f7f\u7528 <a href=\"eval_knn.html\">NN \u641c\u7d22\u6765\u8bc4\u4f30 k<span translate=no>_^_14_^_</span> nn-ML</a><span translate=no>_^_15_^_</span><span translate=no>_^_16_^_</span></li></ul>\n<p>\u8fd9\u4e2a\u5b9e\u9a8c\u4f7f\u7528\u4e86\u4e00\u4e2a\u5c0f\u6570\u636e\u96c6\uff0c\u8fd9\u6837\u6211\u4eec\u5c31\u53ef\u4ee5\u5728\u4e0d\u5360\u7528\u51e0\u767e\u5343\u5146\u5b57\u8282\u7684\u7d22\u5f15\u78c1\u76d8\u7a7a\u95f4\u7684\u60c5\u51b5\u4e0b\u8fd0\u884c\u5b83\u3002</p>\n<p><span translate=no>_^_17_^_</span>NN-LM \u7684\u5b98\u65b9\u5b9e\u73b0\u53ef\u4ee5<a href=\"https://github.com/urvashik/knnlm\">\u5728\u8fd9\u91cc</a>\u627e\u5230\u3002</p>\n",
     "This is a simple PyTorch implementation/tutorial of the paper Generalization through Memorization: Nearest Neighbor Language Models using FAISS. It runs a kNN model on the final transformer layer embeddings to improve the loss of transformer based language models. It's also great for domain adaptation without pre-training.": "\u8fd9\u662f\u8bba\u6587\u300a\u8bb0\u5fc6\u6cdb\u5316\uff1a\u4f7f\u7528FAISS\u7684\u6700\u8fd1\u90bb\u8bed\u8a00\u6a21\u578b\u300b\u7684\u7b80\u5355PyTorch\u5b9e\u73b0/\u6559\u7a0b\u3002\u5b83\u5728\u6700\u7ec8\u7684\u53d8\u538b\u5668\u5c42\u5d4c\u5165\u4e0a\u8fd0\u884ckNN\u6a21\u578b\uff0c\u4ee5\u6539\u5584\u57fa\u4e8e\u53d8\u538b\u5668\u7684\u8bed\u8a00\u6a21\u578b\u7684\u635f\u8017\u3002\u5b83\u4e5f\u975e\u5e38\u9002\u5408\u65e0\u9700\u9884\u5148\u8bad\u7ec3\u7684\u9886\u57df\u9002\u5e94\u3002",
     "k-Nearest Neighbor Language Models": "K \u8fd1\u90bb\u8bed\u8a00\u6a21\u578b"
 }
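Aside for readers skimming the diff: the strings being retranslated above describe the kNN-LM recipe from the paper: index the final-layer feed-forward inputs f(c_i) with FAISS, look up the k nearest stored contexts at evaluation time, and interpolate the resulting next-token distribution with the model's own p(w_t | c_t). Below is a minimal sketch of that lookup-and-interpolate step; every name and size in it (keys, values, knn_probs, d, n_vocab, lam, and the toy data) is hypothetical and stands in for the repository's actual train_model / build_index / eval_knn code.

import numpy as np
import faiss

d = 512            # assumed embedding size of f(c)
n_vocab = 10_000   # assumed vocabulary size

# Datastore: keys are context embeddings f(c_i), values are the next tokens w_i.
keys = np.random.randn(100_000, d).astype('float32')   # stand-in for real f(c_i)
values = np.random.randint(0, n_vocab, size=100_000)

index = faiss.IndexFlatL2(d)   # exact L2 search; a full-size datastore would use an approximate index
index.add(keys)

def knn_probs(query, k=10):
    """Next-token distribution from the k nearest stored contexts."""
    dist, idx = index.search(query.reshape(1, -1).astype('float32'), k)
    weights = np.exp(-(dist[0] - dist[0].min()))   # shift distances for numerical stability
    probs = np.zeros(n_vocab)
    for w, i in zip(weights, idx[0]):
        probs[values[i]] += w                      # closer neighbors contribute more mass
    return probs / probs.sum()

# Interpolate with the model's own softmax, as the paper does:
#   p(w_t | c_t) = lam * p_knn(w_t) + (1 - lam) * p_model(w_t)
lam = 0.3
p_model = np.full(n_vocab, 1.0 / n_vocab)   # stand-in for the transformer's output distribution
p_final = lam * knn_probs(keys[0]) + (1 - lam) * p_model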