ja translation

Varuna Jayasiri
2023-05-10 17:00:29 -04:00
parent b05c9e0c57
commit 2038b11d29
515 changed files with 134069 additions and 0 deletions


@ -0,0 +1,41 @@
{
"<h1>Build FAISS index for k-NN search</h1>\n<p>We want to build the index of <span translate=no>_^_0_^_</span>. We store <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> in memory mapped numpy arrays. We find <span translate=no>_^_3_^_</span> nearest to <span translate=no>_^_4_^_</span> using <a href=\"https://github.com/facebookresearch/faiss\">FAISS</a>. FAISS indexes <span translate=no>_^_5_^_</span> and we query it with <span translate=no>_^_6_^_</span>.</p>\n": "<h1>k-NN \u691c\u7d22\u7528\u306e FAISS \u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u4f5c\u6210</h1>\n<p><span translate=no>_^_0_^_</span>\u306e\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u4f5c\u6210\u3057\u307e\u3059\u3002<span translate=no>_^_1_^_</span>\u3068<span translate=no>_^_2_^_</span>\u3092\u30e1\u30e2\u30ea\u30de\u30c3\u30d7\u3055\u308c\u305fnumpy\u914d\u5217\u306b\u683c\u7d0d\u3057\u307e\u3059\u3002<a href=\"https://github.com/facebookresearch/faiss\">FAISS</a>\u3092\u4f7f\u3063\u3066\u3001<span translate=no>_^_4_^_</span>\u306b\u6700\u3082\u8fd1\u3044<span translate=no>_^_3_^_</span>\u3092\u898b\u3064\u3051\u307e\u3059\u3002FAISS\u306f<span translate=no>_^_5_^_</span>\u3092\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u5316\u3057\u3001<span translate=no>_^_6_^_</span>\u3067\u30af\u30a8\u30ea\u3092\u5b9f\u884c\u3057\u307e\u3059\u3002</p>\n",
"<h2>Build FAISS index</h2>\n<p><a href=\"https://github.com/facebookresearch/faiss/wiki/Getting-started\">Getting started</a>, <a href=\"https://github.com/facebookresearch/faiss/wiki/Faster-search\">faster search</a>, and <a href=\"https://github.com/facebookresearch/faiss/wiki/Lower-memory-footprint\">lower memory footprint</a> tutorials on FAISS will help you learn more about FAISS usage.</p>\n": "<h2>FAISS \u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u30d3\u30eb\u30c9</h2>\n<p><a href=\"https://github.com/facebookresearch/faiss/wiki/Getting-started\">FAISS\u306e\u4f7f\u3044\u65b9</a>\u3001<a href=\"https://github.com/facebookresearch/faiss/wiki/Faster-search\">\u9ad8\u901f\u691c\u7d22</a>\u3001<a href=\"https://github.com/facebookresearch/faiss/wiki/Lower-memory-footprint\">\u30e1\u30e2\u30ea\u4f7f\u7528\u91cf\u306e\u524a\u6e1b\u306b\u95a2\u3059\u308b\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u306f</a>\u3001FAISS\u306e\u4f7f\u7528\u6cd5\u306b\u3064\u3044\u3066\u3055\u3089\u306b\u5b66\u3076\u306e\u306b\u5f79\u7acb\u3061\u307e\u3059\u3002</p>\n",
"<h2>Gather <span translate=no>_^_0_^_</span> and save them in numpy arrays</h2>\n<p><em>Note that these numpy arrays will take up a lot of space (even few hundred gigabytes) depending on the size of your dataset</em>.</p>\n": "<h2><span translate=no>_^_0_^_</span>\u3092\u96c6\u3081\u3066numpy\u914d\u5217\u306b\u4fdd\u5b58\u3059\u308b</h2>\n<p><em>\u3053\u308c\u3089\u306enumpy\u914d\u5217\u306f\u3001\u30c7\u30fc\u30bf\u30bb\u30c3\u30c8\u306e\u30b5\u30a4\u30ba\u306b\u3082\u3088\u308a\u307e\u3059\u304c\u3001\uff08\u6570\u767e\u30ae\u30ac\u30d0\u30a4\u30c8\u3067\u3082\uff09\u591a\u304f\u306e\u30b9\u30da\u30fc\u30b9\u3092\u5360\u3081\u308b\u3053\u3068\u306b\u6ce8\u610f\u3057\u3066\u304f\u3060\u3055\u3044</em>\u3002</p>\n",
"<p> Load a saved experiment from <a href=\"train_model.html\">train model</a>.</p>\n": "<p><a href=\"train_model.html\">\u30e2\u30c7\u30eb\u306e\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0</a>\u3067\u4fdd\u5b58\u3057\u305f\u5b9f\u9a13\u3092\u8aad\u307f\u8fbc\u307f\u307e\u3059\u3002</p>\n",
"<p><span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span></p>\n",
"<p><span translate=no>_^_0_^_</span> the target labels </p>\n": "<p><span translate=no>_^_0_^_</span>\u30bf\u30fc\u30b2\u30c3\u30c8\u30e9\u30d9\u30eb</p>\n",
"<p>Add keys to the index; <span translate=no>_^_0_^_</span> </p>\n": "<p>\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u306b\u30ad\u30fc\u3092\u8ffd\u52a0\u3057\u307e\u3059\u3002<span translate=no>_^_0_^_</span></p>\n",
"<p>Add them to the index for fast search </p>\n": "<p>\u7d22\u5f15\u306b\u8ffd\u52a0\u3059\u308b\u3068\u3059\u3070\u3084\u304f\u691c\u7d22\u3067\u304d\u307e\u3059</p>\n",
"<p>Add to index </p>\n": "<p>\u7d22\u5f15\u306b\u8ffd\u52a0</p>\n",
"<p>Build an index with Verenoi cell based faster search with compression that doesn&#x27;t store full vectors. </p>\n": "<p>Voronoi \u306e\u30bb\u30eb\u30d9\u30fc\u30b9\u306e\u9ad8\u901f\u691c\u7d22\u3067\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u69cb\u7bc9\u3057\u307e\u3059\u3002\u5727\u7e2e\u3067\u306f\u30d9\u30af\u30c8\u30eb\u5168\u4f53\u306f\u4fdd\u5b58\u3055\u308c\u307e\u305b\u3093\u3002</p>\n",
"<p>Collect <span translate=no>_^_0_^_</span> </p>\n": "<p>\u53ce\u96c6 <span translate=no>_^_0_^_</span></p>\n",
"<p>Create configurations object </p>\n": "<p>\u8a2d\u5b9a\u30aa\u30d6\u30b8\u30a7\u30af\u30c8\u306e\u4f5c\u6210</p>\n",
"<p>Dimensions of <span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span> \u306e\u6b21\u5143\u6570</p>\n",
"<p>Get <span translate=no>_^_0_^_</span> </p>\n": "<p>\u53d6\u5f97 <span translate=no>_^_0_^_</span></p>\n",
"<p>Increment the number of collected keys </p>\n": "<p>\u53ce\u96c6\u3057\u305f\u30ad\u30fc\u306e\u6570\u3092\u5897\u3084\u3057\u3066\u304f\u3060\u3055\u3044</p>\n",
"<p>Initialize configurations </p>\n": "<p>\u69cb\u6210\u3092\u521d\u671f\u5316</p>\n",
"<p>Input data moved to the device of the model </p>\n": "<p>\u5165\u529b\u30c7\u30fc\u30bf\u3092\u30e2\u30c7\u30eb\u306e\u30c7\u30d0\u30a4\u30b9\u306b\u79fb\u52d5</p>\n",
"<p>Load custom configurations used in the experiment </p>\n": "<p>\u5b9f\u9a13\u3067\u4f7f\u7528\u3057\u305f\u30ab\u30b9\u30bf\u30e0\u69cb\u6210\u3092\u8aad\u307f\u8fbc\u3080</p>\n",
"<p>Load the experiment. Replace the run uuid with you run uuid from <a href=\"train_model.html\">training the model</a>. </p>\n": "<p>\u5b9f\u9a13\u3092\u30ed\u30fc\u30c9\u3057\u307e\u3059\u3002run uuid \u3092<a href=\"train_model.html\">\u30e2\u30c7\u30eb\u306e\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0</a>\u3067\u53d6\u5f97\u3057\u305f run uuid \u306b\u7f6e\u304d\u63db\u3048\u3066\u304f\u3060\u3055\u3044\u3002</p>\n",
"<p>Load the memory mapped numpy array of keys </p>\n": "<p>\u30e1\u30e2\u30ea\u30de\u30c3\u30d7\u3055\u308c\u305f\u30ad\u30fc\u306enumpy\u914d\u5217\u3092\u30ed\u30fc\u30c9\u3057\u307e\u3059</p>\n",
"<p>Loop through data </p>\n": "<p>\u30c7\u30fc\u30bf\u3092\u30eb\u30fc\u30d7\u30b9\u30eb\u30fc\u3059\u308b</p>\n",
"<p>Number of contexts; i.e. number of tokens in the training data minus one. <span translate=no>_^_0_^_</span> for <span translate=no>_^_1_^_</span> </p>\n": "<p>\u30b3\u30f3\u30c6\u30ad\u30b9\u30c8\u306e\u6570\u3002\u3064\u307e\u308a\u3001\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u30c7\u30fc\u30bf\u5185\u306e\u30c8\u30fc\u30af\u30f3\u6570\u304b\u3089 1 \u3092\u5f15\u3044\u305f\u6570\u3067\u3059\u3002<span translate=no>_^_1_^_</span> \u306b\u3064\u3044\u3066 <span translate=no>_^_0_^_</span></p>\n",
"<p>Number of keys <span translate=no>_^_0_^_</span> collected </p>\n": "<p><span translate=no>_^_0_^_</span>\u53ce\u96c6\u3055\u308c\u305f\u30ad\u30fc\u306e\u6570</p>\n",
"<p>Numpy array for <span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span> \u306e numpy \u914d\u5217</p>\n",
"<p>Pick a random sample of keys to train the index with </p>\n": "<p>\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u306e\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u306b\u4f7f\u7528\u3059\u308b\u30ad\u30fc\u306e\u30b5\u30f3\u30d7\u30eb\u3092\u30e9\u30f3\u30c0\u30e0\u306b\u9078\u3093\u3067\u304f\u3060\u3055\u3044</p>\n",
"<p>Run the model </p>\n": "<p>\u30e2\u30c7\u30eb\u3092\u5b9f\u884c</p>\n",
"<p>Save keys, <span translate=no>_^_0_^_</span> in the memory mapped numpy array </p>\n": "<p>\u30ad\u30fc\u3092\u30e1\u30e2\u30ea\u30de\u30c3\u30d7\u3055\u308c\u305fnumpy\u914d\u5217\u306b\u4fdd\u5b58 <span translate=no>_^_0_^_</span></p>\n",
"<p>Save the index </p>\n": "<p>\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u4fdd\u5b58\u3059\u308b</p>\n",
"<p>Save values, <span translate=no>_^_0_^_</span> in the memory mapped numpy array </p>\n": "<p>\u5024\u3092\u30e1\u30e2\u30ea\u30de\u30c3\u30d7\u3055\u308c\u305fnumpy\u914d\u5217\u306b\u4fdd\u5b58\u3059\u308b <span translate=no>_^_0_^_</span></p>\n",
"<p>Set model to evaluation mode </p>\n": "<p>\u30e2\u30c7\u30eb\u3092\u8a55\u4fa1\u30e2\u30fc\u30c9\u306b\u8a2d\u5b9a</p>\n",
"<p>Set models for saving/loading </p>\n": "<p>\u4fdd\u5b58/\u8aad\u307f\u8fbc\u307f\u7528\u306e\u30e2\u30c7\u30eb\u3092\u8a2d\u5b9a</p>\n",
"<p>Specify the experiment to load from </p>\n": "<p>\u30ed\u30fc\u30c9\u5143\u306e\u5b9f\u9a13\u3092\u6307\u5b9a\u3057\u3066\u304f\u3060\u3055\u3044</p>\n",
"<p>Start the experiment; this is when it actually loads models </p>\n": "<p>\u5b9f\u9a13\u3092\u958b\u59cb\u3057\u307e\u3059\u3002\u3053\u306e\u6642\u70b9\u3067\u3001\u5b9f\u969b\u306b\u30e2\u30c7\u30eb\u304c\u8aad\u307f\u8fbc\u307e\u308c\u307e\u3059\u3002</p>\n",
"<p>This experiment is just an evaluation; i.e. nothing is tracked or saved </p>\n": "<p>\u3053\u306e\u5b9f\u9a13\u306f\u5358\u306a\u308b\u8a55\u4fa1\u3067\u3059\u3002\u3064\u307e\u308a\u3001\u4f55\u3082\u8ffd\u8de1\u3082\u4fdd\u5b58\u3082\u3055\u308c\u3066\u3044\u307e\u305b\u3093</p>\n",
"<p>Train the index to store the keys </p>\n": "<p>\u30ad\u30fc\u3092\u4fdd\u5b58\u3059\u308b\u3088\u3046\u306b\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u3059\u308b</p>\n",
"<p>Training data loader </p>\n": "<p>\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u30c7\u30fc\u30bf\u30ed\u30fc\u30c0\u30fc</p>\n",
"<p>We need to get inputs to the feed forward layer, <span translate=no>_^_0_^_</span> </p>\n": "<p>\u30d5\u30a3\u30fc\u30c9\u30d5\u30a9\u30ef\u30fc\u30c9\u5c64\u3078\u306e\u5165\u529b\u304c\u5fc5\u8981\u3067\u3059\u304c <span translate=no>_^_0_^_</span></p>\n",
"Build FAISS index for k-NN search": "k-NN \u691c\u7d22\u7528\u306e FAISS \u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u4f5c\u6210",
"This builds the FAISS index with the transformer embeddings.": "\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u57cb\u3081\u8fbc\u307f\u3092\u4f7f\u3063\u3066 FAISS \u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u4f5c\u6210\u3057\u307e\u3059\u3002"
}
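
The strings above describe storing keys and values in arrays and then looking up the keys nearest to a query. A minimal sketch of that lookup follows, using exact brute-force search in plain Python as a stand-in for FAISS (which performs approximate search over trained cells). The names `l2_sq`, `knn_search`, `keys`, and `vals` are hypothetical and do not come from the translated source.

```python
import heapq


def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_search(keys, query, k):
    """Return (distances, indices) of the k keys closest to query, nearest first.

    This is an exact scan over every key; it only illustrates the interface
    of a nearest-neighbor index, not how FAISS is implemented.
    """
    nearest = heapq.nsmallest(
        k, ((l2_sq(key, query), i) for i, key in enumerate(keys))
    )
    return [d for d, _ in nearest], [i for _, i in nearest]


# Toy key store (context embeddings) and value store (next-token ids).
keys = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
vals = [7, 3, 7, 5]

dist, idx = knn_search(keys, [0.9, 0.1], k=2)
neighbor_tokens = [vals[i] for i in idx]
```

The returned indices are positions in the key store, so the same positions select the matching entries of the value store.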


@ -0,0 +1,40 @@
{
"<h1>Evaluate k-nearest neighbor language model</h1>\n": "<h1>k-\u6700\u8fd1\u508d\u8a00\u8a9e\u30e2\u30c7\u30eb\u306e\u8a55\u4fa1</h1>\n",
"<h2><span translate=no>_^_0_^_</span>-NN to get <span translate=no>_^_1_^_</span></h2>\n<p>Here we refer to <span translate=no>_^_2_^_</span> as queries, <span translate=no>_^_3_^_</span> as keys and <span translate=no>_^_4_^_</span> as values.</p>\n": "<h2><span translate=no>_^_0_^_</span>-NN \u3067 <span translate=no>_^_1_^_</span> \u3092\u53d6\u5f97</h2>\n<p>\u3053\u3053\u3067\u306f\u3001<span translate=no>_^_2_^_</span>\u3092\u30af\u30a8\u30ea\u3001<span translate=no>_^_3_^_</span>\u3092\u30ad\u30fc\u3001<span translate=no>_^_4_^_</span>\u3092\u5024\u3068\u547c\u3073\u307e\u3059\u3002</p>\n",
"<h2>Calculate validation loss</h2>\n<p>We calculate the validation loss of the combined on <span translate=no>_^_0_^_</span>-NN prediction and transformer prediction. The weight given to the <span translate=no>_^_1_^_</span>-NN model is given by <span translate=no>_^_2_^_</span>. It&#x27;s a list of weights and we calculate the validation loss for each.</p>\n": "<h2>\u691c\u8a3c\u640d\u5931\u306e\u8a08\u7b97</h2>\n<p><span translate=no>_^_0_^_</span>-NN \u4e88\u6e2c\u3068\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u4e88\u6e2c\u3092\u7d44\u307f\u5408\u308f\u305b\u305f\u5834\u5408\u306e\u691c\u8a3c\u640d\u5931\u3092\u8a08\u7b97\u3057\u307e\u3059\u3002<span translate=no>_^_1_^_</span>-NN \u30e2\u30c7\u30eb\u306b\u4e0e\u3048\u3089\u308c\u308b\u91cd\u307f\u306f<span translate=no>_^_2_^_</span>\u3067\u4e0e\u3048\u3089\u308c\u307e\u3059\u3002\u3053\u308c\u306f\u91cd\u307f\u306e\u30ea\u30b9\u30c8\u3067\u3001\u305d\u308c\u305e\u308c\u306e\u691c\u8a3c\u640d\u5931\u3092\u8a08\u7b97\u3057\u307e\u3059\u3002</p>\n",
"<h2>Load the index</h2>\n": "<h2>\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u8aad\u307f\u8fbc\u3080</h2>\n",
"<p>Calculate scores for each of <span translate=no>_^_0_^_</span>. </p>\n": "<p>\u305d\u308c\u305e\u308c\u306e\u30b9\u30b3\u30a2\u3092\u8a08\u7b97\u3057\u307e\u3059<span translate=no>_^_0_^_</span>\u3002</p>\n",
"<p>Calculate the loss </p>\n": "<p>\u640d\u5931\u306e\u8a08\u7b97</p>\n",
"<p>Dimensions of <span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span> \u306e\u6b21\u5143\u6570</p>\n",
"<p>Evaluate validation loss </p>\n": "<p>\u691c\u8a3c\u640d\u5931\u306e\u8a55\u4fa1</p>\n",
"<p>Find 10 nearest neighbors of <span translate=no>_^_0_^_</span> among <span translate=no>_^_1_^_</span>. <span translate=no>_^_2_^_</span> is the distance given by FAISS and <span translate=no>_^_3_^_</span>, <span translate=no>_^_4_^_</span> is the index of it in <span translate=no>_^_5_^_</span>. </p>\n": "<p><span translate=no>_^_1_^_</span>\u306e\u4e2d\u304b\u3089<span translate=no>_^_0_^_</span>\u306e\u6700\u8fd1\u508d\u309210\u70b9\u898b\u3064\u3051\u307e\u3059\u3002<span translate=no>_^_2_^_</span>\u306fFAISS\u3067\u4e0e\u3048\u3089\u308c\u305f\u8ddd\u96e2\u3067\u3001<span translate=no>_^_3_^_</span>\u3001<span translate=no>_^_4_^_</span>\u306f<span translate=no>_^_5_^_</span>\u306b\u304a\u3051\u308b\u305d\u306e\u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3067\u3059\u3002</p>\n",
"<p>Flatten the <span translate=no>_^_0_^_</span> and <span translate=no>_^_1_^_</span> dimensions of queries </p>\n": "<p><span translate=no>_^_0_^_</span><span translate=no>_^_1_^_</span>\u30af\u30a8\u30ea\u306e\u6b21\u5143\u3068\u6b21\u5143\u3092\u5e73\u5766\u5316</p>\n",
"<p>Get <span translate=no>_^_0_^_</span> </p>\n": "<p>\u53d6\u5f97 <span translate=no>_^_0_^_</span></p>\n",
"<p>Get <span translate=no>_^_0_^_</span>-NN predictions </p>\n": "<p><span translate=no>_^_0_^_</span>-NN \u4e88\u6e2c\u3092\u53d6\u5f97</p>\n",
"<p>Get data and target labels </p>\n": "<p>\u30c7\u30fc\u30bf\u3068\u30bf\u30fc\u30b2\u30c3\u30c8\u30e9\u30d9\u30eb\u3092\u53d6\u5f97</p>\n",
"<p>Get the dot-product, or cosine similarity </p>\n": "<p>\u70b9\u7a4d\u307e\u305f\u306f\u30b3\u30b5\u30a4\u30f3\u985e\u4f3c\u5ea6\u3092\u6c42\u3081\u308b</p>\n",
"<p>Iterate through validation data </p>\n": "<p>\u691c\u8a3c\u30c7\u30fc\u30bf\u3092\u7e70\u308a\u8fd4\u3057\u51e6\u7406</p>\n",
"<p>List of losses for each <span translate=no>_^_0_^_</span> </p>\n": "<p>\u305d\u308c\u305e\u308c\u306e\u640d\u5931\u306e\u30ea\u30b9\u30c8 <span translate=no>_^_0_^_</span></p>\n",
"<p>List of weights given to <span translate=no>_^_0_^_</span>-NN prediction. We will evaluate the validation loss for each of the weights </p>\n": "<p><span translate=no>_^_0_^_</span>-NN \u4e88\u6e2c\u306b\u4e0e\u3048\u3089\u308c\u308b\u91cd\u307f\u306e\u30ea\u30b9\u30c8\u3002\u305d\u308c\u305e\u308c\u306e\u91cd\u307f\u306e\u691c\u8a3c\u640d\u5931\u3092\u8a55\u4fa1\u3057\u307e\u3059\u3002</p>\n",
"<p>Load FAISS index </p>\n": "<p>FAISS \u30a4\u30f3\u30c7\u30c3\u30af\u30b9\u3092\u30ed\u30fc\u30c9</p>\n",
"<p>Load index </p>\n": "<p>\u30ed\u30fc\u30c9\u30a4\u30f3\u30c7\u30c3\u30af\u30b9</p>\n",
"<p>Load memory mapped numpy arrays </p>\n": "<p>\u30e1\u30e2\u30ea\u30de\u30c3\u30d7\u3055\u308c\u305f numpy \u914d\u5217\u3092\u30ed\u30fc\u30c9</p>\n",
"<p>Load the experiment. Replace the run uuid with you run uuid from <a href=\"train_model.html\">training the model</a>. </p>\n": "<p>\u5b9f\u9a13\u3092\u30ed\u30fc\u30c9\u3057\u307e\u3059\u3002run uuid \u3092<a href=\"train_model.html\">\u30e2\u30c7\u30eb\u306e\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0</a>\u3067\u53d6\u5f97\u3057\u305f run uuid \u306b\u7f6e\u304d\u63db\u3048\u3066\u304f\u3060\u3055\u3044\u3002</p>\n",
"<p>Normalize <span translate=no>_^_0_^_</span> </p>\n": "<p>\u30ce\u30fc\u30de\u30e9\u30a4\u30ba <span translate=no>_^_0_^_</span></p>\n",
"<p>Number of contexts; i.e. number of tokens in the training data minus one. <span translate=no>_^_0_^_</span> for <span translate=no>_^_1_^_</span> </p>\n": "<p>\u30b3\u30f3\u30c6\u30ad\u30b9\u30c8\u306e\u6570\u3002\u3064\u307e\u308a\u3001\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u30c7\u30fc\u30bf\u5185\u306e\u30c8\u30fc\u30af\u30f3\u6570\u304b\u3089 1 \u3092\u5f15\u3044\u305f\u6570\u3067\u3059\u3002<span translate=no>_^_1_^_</span> \u306b\u3064\u3044\u3066 <span translate=no>_^_0_^_</span></p>\n",
"<p>Number of samples </p>\n": "<p>\u30b5\u30f3\u30d7\u30eb\u6570</p>\n",
"<p>Number of samples in each batch </p>\n": "<p>\u5404\u30d0\u30c3\u30c1\u306e\u30b5\u30f3\u30d7\u30eb\u6570</p>\n",
"<p>Output the losses for each of <span translate=no>_^_0_^_</span>. </p>\n": "<p>\u305d\u308c\u305e\u308c\u306e\u640d\u5931\u3092\u51fa\u529b\u3057\u307e\u3059<span translate=no>_^_0_^_</span>\u3002</p>\n",
"<p>Reshape the logits </p>\n": "<p>\u30ed\u30b8\u30c3\u30c8\u306e\u5f62\u72b6\u3092\u5909\u3048\u308b</p>\n",
"<p>Run the model and get predictions <span translate=no>_^_0_^_</span> </p>\n": "<p>\u30e2\u30c7\u30eb\u3092\u5b9f\u884c\u3057\u3066\u4e88\u6e2c\u3092\u53d6\u5f97 <span translate=no>_^_0_^_</span></p>\n",
"<p>Save shape of queries to reshape results </p>\n": "<p>\u30af\u30a8\u30ea\u306e\u5f62\u72b6\u3092\u4fdd\u5b58\u3057\u3066\u7d50\u679c\u306e\u5f62\u72b6\u3092\u5909\u3048\u308b</p>\n",
"<p>Scatter and accumulate token logits based on the nearest neighbors </p>\n": "<p>\u6700\u3082\u8fd1\u3044\u96a3\u4eba\u306b\u57fa\u3065\u3044\u3066\u30c8\u30fc\u30af\u30f3\u30ed\u30b8\u30c3\u30c8\u3092\u5206\u6563\u3057\u3066\u84c4\u7a4d\u3059\u308b</p>\n",
"<p>Set model to evaluation mode </p>\n": "<p>\u30e2\u30c7\u30eb\u3092\u8a55\u4fa1\u30e2\u30fc\u30c9\u306b\u8a2d\u5b9a</p>\n",
"<p>Set number of cells to probe </p>\n": "<p>\u30d7\u30ed\u30fc\u30d6\u3059\u308b\u30bb\u30eb\u306e\u6570\u3092\u8a2d\u5b9a</p>\n",
"<p>This is to calculate only the loss for <span translate=no>_^_0_^_</span> tokens. This is important because the first predictions (along the sequence) of transformer model has very few past tokens to look at. </p>\n": "<p>\u3053\u308c\u306f<span translate=no>_^_0_^_</span>\u30c8\u30fc\u30af\u30f3\u306e\u640d\u5931\u306e\u307f\u3092\u8a08\u7b97\u3059\u308b\u305f\u3081\u306e\u3082\u306e\u3067\u3059\u3002\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u30e2\u30c7\u30eb\u306e\uff08\u30b7\u30fc\u30b1\u30f3\u30b9\u306b\u6cbf\u3063\u305f\uff09\u6700\u521d\u306e\u4e88\u6e2c\u3067\u306f\u3001\u8abf\u3079\u308b\u3079\u304d\u904e\u53bb\u306e\u30c8\u30fc\u30af\u30f3\u304c\u307b\u3068\u3093\u3069\u306a\u3044\u305f\u3081\u3001\u3053\u308c\u306f\u91cd\u8981\u3067\u3059\u3002</p>\n",
"<p>Token-wise logits </p>\n": "<p>\u30c8\u30fc\u30af\u30f3\u3054\u3068\u306e\u30ed\u30b8\u30c3\u30c8</p>\n",
"<p>Training data loader </p>\n": "<p>\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u30c7\u30fc\u30bf\u30ed\u30fc\u30c0\u30fc</p>\n",
"<p>We are going to calculate the cosine similarity between normalized vectors </p>\n": "<p>\u6b63\u898f\u5316\u3055\u308c\u305f\u30d9\u30af\u30c8\u30eb\u9593\u306e\u30b3\u30b5\u30a4\u30f3\u985e\u4f3c\u5ea6\u3092\u8a08\u7b97\u3057\u307e\u3059</p>\n",
"Evaluate k-nearest neighbor language model": "k-\u6700\u8fd1\u508d\u8a00\u8a9e\u30e2\u30c7\u30eb\u306e\u8a55\u4fa1",
"This runs the kNN model and merges the kNN results with transformer output to achieve better results than just using the transformer.": "\u3053\u308c\u306b\u3088\u308a kNN \u30e2\u30c7\u30eb\u304c\u5b9f\u884c\u3055\u308c\u3001kNN \u306e\u7d50\u679c\u304c\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u51fa\u529b\u3068\u30de\u30fc\u30b8\u3055\u308c\u308b\u305f\u3081\u3001\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u3060\u3051\u3092\u4f7f\u7528\u3059\u308b\u3088\u308a\u3082\u512a\u308c\u305f\u7d50\u679c\u304c\u5f97\u3089\u308c\u307e\u3059\u3002"
}
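
The evaluation strings above mention scattering neighbor scores into token-wise logits and computing a validation loss for each weight given to the k-NN prediction. A minimal sketch of that flow follows. It assumes the two predictions are mixed linearly at the logit level with weight `c` on the k-NN side; the source strings do not say whether logits or probabilities are mixed, so that choice is an assumption. All function and variable names are hypothetical.

```python
import math


def scatter_add(scores, neighbor_tokens, n_tokens):
    """Accumulate each neighbor's score onto the token id stored as its value."""
    logits = [0.0] * n_tokens
    for score, token in zip(scores, neighbor_tokens):
        logits[token] += score
    return logits


def cross_entropy(logits, target):
    """Negative log-softmax of the target entry (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def combined_losses(knn_logits, lm_logits, target, knn_weights):
    """One loss per k-NN weight c, using c * knn + (1 - c) * transformer."""
    losses = []
    for c in knn_weights:
        mixed = [c * k + (1 - c) * t for k, t in zip(knn_logits, lm_logits)]
        losses.append(cross_entropy(mixed, target))
    return losses


# Three neighbors: two vote for token 2, one for token 0.
knn = scatter_add(scores=[0.9, 0.8, 0.5], neighbor_tokens=[2, 2, 0], n_tokens=4)
lm = [1.0, 0.0, 0.5, 0.0]  # toy transformer logits
losses = combined_losses(knn, lm, target=2, knn_weights=[0.0, 0.5, 1.0])
```

In this toy case the neighbors agree with the target token, so the loss falls as the k-NN weight rises. On real data, comparing the loss across the list of weights is what indicates which weight works best.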


@ -0,0 +1,29 @@
{
"<h1>Train Autoregressive Transformer</h1>\n<p>This trains a simple <a href=\"../../\">transformer</a> model for auto-regression.</p>\n": "<h1>\u81ea\u5df1\u56de\u5e30\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0</h1>\n<p>\u3053\u308c\u306b\u3088\u308a\u3001\u81ea\u5df1\u56de\u5e30\u7528\u306e\u5358\u7d14\u306a<a href=\"../../\">\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc</a>\u30e2\u30c7\u30eb\u304c\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u3055\u308c\u307e\u3059\u3002</p>\n",
"<h2>Auto regressive model</h2>\n": "<h2>\u81ea\u5df1\u56de\u5e30\u30e2\u30c7\u30eb</h2>\n",
"<h2>Configurations</h2>\n<p>The default configs can and will be over-ridden when we start the experiment</p>\n": "<h2>\u30b3\u30f3\u30d5\u30a3\u30ae\u30e5\u30ec\u30fc\u30b7\u30e7\u30f3</h2>\n<p>\u30c7\u30d5\u30a9\u30eb\u30c8\u306e\u8a2d\u5b9a\u306f\u3001\u5b9f\u9a13\u3092\u958b\u59cb\u3057\u305f\u3068\u304d\u306b\u4e0a\u66f8\u304d\u3067\u304d\u3001\u307e\u305f\u4e0a\u66f8\u304d\u3055\u308c\u307e\u3059\u3002</p>\n",
"<p> Initialize the auto-regressive model</p>\n": "<p>\u81ea\u5df1\u56de\u5e30\u30e2\u30c7\u30eb\u3092\u521d\u671f\u5316</p>\n",
"<p> Initialize the configurable transformer encoder for our autoregressive model</p>\n": "<p>\u81ea\u5df1\u56de\u5e30\u30e2\u30c7\u30eb\u306e\u8a2d\u5b9a\u53ef\u80fd\u306a\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u30a8\u30f3\u30b3\u30fc\u30c0\u30fc\u3092\u521d\u671f\u5316\u3057\u307e\u3059\u3002</p>\n",
"<p> Retrieve saved <span translate=no>_^_0_^_</span></p>\n": "<p>\u4fdd\u5b58\u3055\u308c\u305f <span translate=no>_^_0_^_</span> \u3092\u53d6\u5f97</p>\n",
"<p><span translate=no>_^_0_^_</span> </p>\n": "<p><span translate=no>_^_0_^_</span></p>\n",
"<p>A dictionary of configurations to override </p>\n": "<p>\u30aa\u30fc\u30d0\u30fc\u30e9\u30a4\u30c9\u3059\u308b\u8a2d\u5b9a\u306e\u8f9e\u66f8</p>\n",
"<p>Create configs </p>\n": "<p>\u30b3\u30f3\u30d5\u30a3\u30b0\u306e\u4f5c\u6210</p>\n",
"<p>Create experiment </p>\n": "<p>\u5b9f\u9a13\u3092\u4f5c\u6210</p>\n",
"<p>Create subsequent mask, so that the transformer can only pay attention to past tokens. </p>\n": "<p>\u6b21\u306e\u30de\u30b9\u30af\u3092\u4f5c\u6210\u3057\u3066\u3001\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u304c\u904e\u53bb\u306e\u30c8\u30fc\u30af\u30f3\u306b\u3057\u304b\u6ce8\u76ee\u3067\u304d\u306a\u3044\u3088\u3046\u306b\u3057\u307e\u3059\u3002</p>\n",
"<p>Embed the tokens (<span translate=no>_^_0_^_</span>) and run it through the the transformer </p>\n": "<p>\u30c8\u30fc\u30af\u30f3 (<span translate=no>_^_0_^_</span>) \u3092\u57cb\u3081\u8fbc\u307f\u3001\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306b\u901a\u3057\u307e\u3059</p>\n",
"<p>Generate logits of the next token </p>\n": "<p>\u6b21\u306e\u30c8\u30fc\u30af\u30f3\u306e\u30ed\u30b8\u30c3\u30c8\u3092\u751f\u6210</p>\n",
"<p>Get the source token embedding layer, encoder and final token generator from configurable transformer </p>\n": "<p>\u8a2d\u5b9a\u53ef\u80fd\u306a\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u304b\u3089\u30bd\u30fc\u30b9\u30c8\u30fc\u30af\u30f3\u57cb\u3081\u8fbc\u307f\u30ec\u30a4\u30e4\u30fc\u3001\u30a8\u30f3\u30b3\u30fc\u30c0\u30fc\u3001\u6700\u7d42\u30c8\u30fc\u30af\u30f3\u30b8\u30a7\u30cd\u30ec\u30fc\u30bf\u30fc\u3092\u53d6\u5f97</p>\n",
"<p>Load configurations </p>\n": "<p>\u69cb\u6210\u3092\u30ed\u30fc\u30c9</p>\n",
"<p>Next token generation layer; this give logits of the the next token </p>\n": "<p>\u6b21\u306e\u30c8\u30fc\u30af\u30f3\u751f\u6210\u30ec\u30a4\u30e4\u30fc\u3002\u3053\u308c\u306b\u3088\u308a\u3001\u6b21\u306e\u30c8\u30fc\u30af\u30f3\u306e\u30ed\u30b8\u30c3\u30c8\u304c\u8fd4\u3055\u308c\u307e\u3059</p>\n",
"<p>Set models for saving and loading </p>\n": "<p>\u4fdd\u5b58\u304a\u3088\u3073\u8aad\u307f\u8fbc\u307f\u7528\u306e\u30e2\u30c7\u30eb\u3092\u8a2d\u5b9a\u3059\u308b</p>\n",
"<p>Start the experiment </p>\n": "<p>\u5b9f\u9a13\u3092\u59cb\u3081\u308b</p>\n",
"<p>This is needed to initialize models </p>\n": "<p>\u3053\u308c\u306f\u30e2\u30c7\u30eb\u3092\u521d\u671f\u5316\u3059\u308b\u305f\u3081\u306b\u5fc5\u8981\u3067\u3059</p>\n",
"<p>This will be initialized on the first call </p>\n": "<p>\u3053\u308c\u306f\u6700\u521d\u306e\u547c\u3073\u51fa\u3057\u3067\u521d\u671f\u5316\u3055\u308c\u307e\u3059\u3002</p>\n",
"<p>Token embedding module </p>\n": "<p>\u30c8\u30fc\u30af\u30f3\u57cb\u3081\u8fbc\u307f\u30e2\u30b8\u30e5\u30fc\u30eb</p>\n",
"<p>Transformer based encoder </p>\n": "<p>\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u30d9\u30fc\u30b9\u306e\u30a8\u30f3\u30b3\u30fc\u30c0\u30fc</p>\n",
"<p>Transformer configurations </p>\n": "<p>\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u69cb\u6210</p>\n",
"<p>Whether the last layer of the encoder should save the input to the feed-forward layer. This is out <span translate=no>_^_0_^_</span>, the embedding of the context. </p>\n": "<p>\u30a8\u30f3\u30b3\u30fc\u30c0\u30fc\u306e\u6700\u5f8c\u306e\u5c64\u3067\u30d5\u30a3\u30fc\u30c9\u30d5\u30a9\u30ef\u30fc\u30c9\u5c64\u3078\u306e\u5165\u529b\u3092\u4fdd\u5b58\u3059\u308b\u304b\u3069\u3046\u304b\u3002\u3053\u308c\u304c\u30b3\u30f3\u30c6\u30ad\u30b9\u30c8\u306e\u57cb\u3081\u8fbc\u307f <span translate=no>_^_0_^_</span> \u3067\u3059\u3002</p>\n",
"<p>Whether to save <span translate=no>_^_0_^_</span> </p>\n": "<p>\u4fdd\u5b58\u3059\u308b\u304b\u3069\u3046\u304b <span translate=no>_^_0_^_</span></p>\n",
"This is training code with notes for a basic auto-regressive transformer.": "\u3053\u308c\u306f\u3001\u57fa\u672c\u7684\u306a\u81ea\u5df1\u56de\u5e30\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u30e1\u30e2\u3092\u542b\u3080\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0\u30b3\u30fc\u30c9\u3067\u3059\u3002",
"Train Autoregressive Transformer": "\u81ea\u5df1\u56de\u5e30\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u30c8\u30ec\u30fc\u30cb\u30f3\u30b0"
}
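
One of the strings above describes a subsequent mask that lets the transformer attend only to past tokens. A minimal sketch of such a mask follows, assuming the common convention that a position may also attend to itself. The function name and the boolean list-of-lists layout are hypothetical, chosen for illustration rather than taken from the translated source.

```python
def subsequent_mask(seq_len):
    """Return a seq_len x seq_len mask; mask[i][j] is True when
    position i is allowed to attend to position j (that is, j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


mask = subsequent_mask(3)
for row in mask:
    # Print each row as 1/0 so the lower-triangular shape is visible.
    print(" ".join("1" if allowed else "0" for allowed in row))
```

The lower-triangular shape is what prevents a position from seeing later tokens when the model predicts the next token.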