
Trained with QLoRA, got a size mismatch when loading the adapter

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4',
)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    device_map='auto',
    torch_dtype=torch.bfloat16,
    quantization_config=bnb_config,
)

# <-- here

model = PeftModel.from_pretrained(
    model, 
    PEFT_MODEL, 
    device_map={"":0}
)

RuntimeError                              Traceback (most recent call last)
Cell In[9], line 1
----> 1 model = PeftModel.from_pretrained(
      2     model, 
      3     PEFT_MODEL, 
      4     device_map={"":0}
      5 )

File ~/pyenv/rag/lib/python3.11/site-packages/peft/peft_model.py:430, in PeftModel.from_pretrained(cls, model, model_id, adapter_name, is_trainable, config, **kwargs)
    428 else:
    429     model = MODEL_TYPE_TO_PEFT_MODEL_MAPPING[config.task_type](model, config, adapter_name)
--> 430 model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
    431 return model

File ~/pyenv/rag/lib/python3.11/site-packages/peft/peft_model.py:988, in PeftModel.load_adapter(self, model_id, adapter_name, is_trainable, torch_device, **kwargs)
    986 # load the weights into the model
    987 ignore_mismatched_sizes = kwargs.get("ignore_mismatched_sizes", False)
--> 988 load_result = set_peft_model_state_dict(
    989     self, adapters_weights, adapter_name=adapter_name, ignore_mismatched_sizes=ignore_mismatched_sizes
    990 )
    991 if (
    992     (getattr(self, "hf_device_map", None) is not None)
    993     and (len(set(self.hf_device_map.values()).intersection({"cpu", "disk"})) > 0)
    994     and len(self.peft_config) == 1
...
   2191 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
	size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
	size mismatch for base_model.model.lm_head.weight: copying a param with shape torch.Size([32001, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

Ugh.

The fix

Before loading with PEFT, at the `# <-- here` marker (right before the PeftModel.from_pretrained call), add:

model.resize_token_embeddings(32001)

With that, it runs. Seems fine... probably. Or is this just a plain bug?
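The mismatch is exactly one row (32001 vs 32000), which is consistent with one special token (e.g. a pad token) having been added to the tokenizer during fine-tuning. Instead of hardcoding 32001, it is probably less brittle to derive the target from the tokenizer saved with the adapter, i.e. `model.resize_token_embeddings(len(tokenizer))` — assuming that tokenizer is available. A minimal torch sketch of what the resize does to the embedding matrix (dims shrunk for illustration; the real tensors are 32000/32001 × 4096):

```python
import torch
import torch.nn as nn

# Illustrative sizes only: the real model goes from 32000 to 32001 rows x 4096 dims.
OLD_VOCAB, NEW_VOCAB, DIM = 32000, 32001, 8

old = nn.Embedding(OLD_VOCAB, DIM)   # embedding as the base checkpoint ships it
new = nn.Embedding(NEW_VOCAB, DIM)   # one extra row for the added token
with torch.no_grad():
    new.weight[:OLD_VOCAB] = old.weight  # existing rows carry over unchanged

print(new.weight.shape)  # torch.Size([32001, 8])
```

The extra row starts out newly initialized, which is why resizing before loading the adapter is harmless: the adapter checkpoint then overwrites it with the trained values.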
