mirror of https://github.com/VHellendoorn/Code-LMs.git synced 2025-12-19 02:00:16 +08:00

Files

wangxu 49d8698285 Update Convert2HF/README.md

Co-authored-by: Vincent Hellendoorn <1426353+VHellendoorn@users.noreply.github.com>

2022-10-18 13:36:26 +08:00

polycoder/configs

Extend to support models of different sizes, and add "argparse" and README.

2022-10-15 00:57:29 +08:00

convert_neox_pt_to_huggingface_neox.py

Extend to support models of different sizes, and add "argparse" and README.

2022-10-15 00:57:29 +08:00

convert.sh

Parameterize the model size

2022-10-18 13:28:50 +08:00

generate.py

Extend to support models of different sizes, and add "argparse" and README.

2022-10-15 00:57:29 +08:00

README.md

Update Convert2HF/README.md

2022-10-18 13:36:26 +08:00

README.md

Convert to HuggingFace

This directory contains a script convert_neox_pt_to_huggingface_neox.py to convert PolyCoder checkpoints trained by gpt-neox into HuggingFace format, and a script generate.py to load the converted model and generate code from a given prompt. Shoutout to @NinedayWang for implementing this!

Environment

transformers 4.23.1

Convert

You can use the convert.sh script to convert specified model to the HuggingFace format, using ./convert.sh 0-4B (or pass a different model size). This script in turn invokes convert_neox_pt_to_huggingface_neox.py, which you can also call directly as follows:

python convert_neox_pt_to_huggingface_neox.py \
    --checkpoint_dir ../checkpoints/checkpoints-0-4B/global_step150000 \
    --vocab_file ../Data/code-vocab.json \
    --merge_file ../Data/code-merges.txt \
    --hf_config_path ./polycoder/configs/config_0-4B.json \
    --hf_save_dir ./polycoder/0-4B

HuggingFace configuration files for different size models are provided in polycoder/configs/, including config_0-4B.json, config_2-7B.json and config_160M.json.

After running, you can get a complete HuggingFace model in the directory specified by hf_save_dir. If the directory does not exist, it can be built automatically.

Generate

The following is an example to load the converted 0.4B HuggingFace model and generate code from a given prompt:

python generate.py \
    --model_name_or_path ./polycoder/0-4B \
    --temperature 0.2 \
    --top_p 0.95 \
    --max_length 128

You can evaluate models of other sizes by specifying model_name_or_path.