How to run the LoRA fine-tuned model? #35

Open
dongwhfdyer opened this issue Apr 18, 2024 · 6 comments
Comments

@dongwhfdyer

I have followed the instructions in finetune_lora.sh and obtained the trained model.

This is my finetune_lora.sh:

#!/bin/bash

################## VICUNA ##################
PROMPT_VERSION=v1
MODEL_VERSION="vicuna-v1.5-7b"
gpu_ids=0,1,2,3
################## VICUNA ##################

deepspeed --master_port=$((RANDOM + 10000)) --include localhost:$gpu_ids geochat/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --lora_enable True \
    --model_name_or_path pretrained_weights/llavav1.5-7b \
    --version $PROMPT_VERSION \
    --data_path ~/datasets/GeoChat_Instruct.json \
    --image_folder ~/datasets/GeoChat_finetuning/final_images_llava  \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --pretrain_mm_mlp_adapter pretrained_weights/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5/mm_projector.bin \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --bf16 True \
    --output_dir /nfs/geochat_output/checkpoints_dir \
    --num_train_epochs 1 \
    --per_device_train_batch_size 6 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_steps 1000 \
    --save_total_limit 5 \
    --learning_rate 2e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --lazy_preprocess True \
    --dataloader_num_workers 0 \
    --report_to wandb

Here is the saved LoRA fine-tuned model:

(base) ➜  checkpoints_dir tree
.
├── adapter_config.json
├── adapter_model.bin
├── checkpoint-3217
│   ├── adapter_config.json
│   ├── adapter_model.bin
│   ├── global_step3217
│   │   ├── bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
│   │   ├── bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt
│   │   ├── bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt
│   │   ├── bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt
│   │   └── mp_rank_00_model_states.pt
│   ├── latest
│   ├── README.md
│   ├── rng_state_0.pth
│   ├── rng_state_1.pth
│   ├── rng_state_2.pth
│   ├── rng_state_3.pth
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   ├── tokenizer.model
│   ├── trainer_state.json
│   ├── training_args.bin
│   └── zero_to_fp32.py
├── config.json
├── non_lora_trainables.bin
├── README.md
└── trainer_state.json


I don't know how to load this model, and I couldn't find instructions in the README.md. Can anyone help me? Thank you!

@dongwhfdyer
Author

Now I know how to do it. Look at the LLaVA project; you will find the two-stage weight-loading method there. If anyone still doesn't know, contact me.
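Roughly, the two stages look like this. This is only a minimal sketch, assuming GeoChat keeps LLaVA's checkpoint layout (as in llava/model/builder.py) and that geochat.model exposes a GeoChatLlamaForCausalLM class; the class name and the paths below are assumptions taken from the script above, not the repo's official loader.

import os

import torch
from peft import PeftModel
from transformers import AutoConfig, AutoTokenizer

# Assumption: GeoChat's model class mirrors LLaVA's LlavaLlamaForCausalLM.
from geochat.model import GeoChatLlamaForCausalLM

base_path = "pretrained_weights/llavav1.5-7b"      # --model_name_or_path used for LoRA training
lora_path = "/nfs/geochat_output/checkpoints_dir"  # --output_dir containing adapter_model.bin etc.

tokenizer = AutoTokenizer.from_pretrained(base_path, use_fast=False)

# Stage 1: load the base LLM, but with the LoRA checkpoint's config (it carries the
# mm_projector / vision settings), then restore the non-LoRA trainables
# (projector weights) saved next to the adapter as non_lora_trainables.bin.
lora_cfg = AutoConfig.from_pretrained(lora_path)
model = GeoChatLlamaForCausalLM.from_pretrained(
    base_path, config=lora_cfg, torch_dtype=torch.float16, low_cpu_mem_usage=True
)
non_lora = torch.load(os.path.join(lora_path, "non_lora_trainables.bin"), map_location="cpu")
non_lora = {
    (k[len("base_model.model."):] if k.startswith("base_model.model.") else k): v
    for k, v in non_lora.items()
}
model.load_state_dict(non_lora, strict=False)

# Stage 2: attach the LoRA adapter and merge it back into the base weights.
model = PeftModel.from_pretrained(model, lora_path)
model = model.merge_and_unload()
model.eval()

If GeoChat kept LLaVA's builder.py, passing --model-path /nfs/geochat_output/checkpoints_dir together with --model-base pretrained_weights/llavav1.5-7b to the demo or eval scripts should run this same two-stage logic for you.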

@lx709

lx709 commented May 30, 2024

Thanks, @dongwhfdyer, I already figured it out.

@kartikey9254

Now I know how to do it. Look at the LLaVA project; you will find the two-stage weight-loading method there. If anyone still doesn't know, contact me.

Hi there, I am trying out this model. The demo worked, but when I used the lora.sh script for training it fails with: OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /home/LLaVA/llava-v1.5-13b-lora. Can you guide me on how to train this model?

@732259408

@dongwhfdyer Hi, I have a question about --pretrain_mm_mlp_adapter path/to/llava-v1.5-mlp2x-336px-pretrain-vicuna-7b-v1.5/mm_projector.bin in finetune_lora.sh. Does the mm_projector.bin file use weights from llava-v1.5-7b? I couldn't find mm_projector.bin in GeoChat-7B.

@Amazingren

Hi @dongwhfdyer,

It seems you have already successfully reproduced this project.

I am still confused about the training procedure.

  • Do we only need to run finetune_lora.sh and then merge the weights?
  • Or do we also need to do the pretraining with pretrain.sh first?

It would be super nice to get a response from you.

Best and have a nice day,

@yt309

yt309 commented Aug 23, 2024

Now I know how to do it. Look at the LLaVA project; you will find the two-stage weight-loading method there. If anyone still doesn't know, contact me.
Hi @dongwhfdyer,
I'm very confused about how to evaluate the fine-tuned model. Would it be convenient for you to provide more details on how to do it? Thank you very much.
