System Info
Reproduced with:
- peft: 0.19.1
- accelerate: 1.13.0
- transformers: 5.10.2
- torch: 2.12.0+cu132
- safetensors: 0.7.0
- Python: 3.10.20
- Platform: Linux / WSL2, x86_64
- CUDA available: yes, CUDA 13.2
I also reproduced this against PEFT main from source:
- commit: aa2b673
- version reported: 0.19.2.dev0
Who can help?
@BenjaminBossan @githubnemo
Reproduction
LoraConfig(bias="lora_only") correctly marks the wrapped base layer bias as trainable, but get_peft_model_state_dict() and save_pretrained() do not include that trained bias. Reloading the adapter therefore does not reproduce the trained model output.
Minimal reproducer:
import os
import tempfile

import torch
from torch import nn
from peft import LoraConfig, get_peft_model, PeftModel
from peft.utils import get_peft_model_state_dict
from safetensors.torch import load_file


class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 3, bias=True)

    def forward(self, x):
        return self.proj(x)


x = torch.randn(2, 4)


def run(bias):
    torch.manual_seed(123)
    model = get_peft_model(
        Toy(),
        LoraConfig(
            r=2,
            lora_alpha=2,
            target_modules=["proj"],
            bias=bias,
        ),
    )

    # Simulate training by changing all trainable tensors.
    torch.manual_seed(456)
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.requires_grad:
                param.add_(torch.randn_like(param) * 0.5)

    trainable = [name for name, param in model.named_parameters() if param.requires_grad]

    with torch.no_grad():
        ref = model(x).detach().clone()

    state_dict = get_peft_model_state_dict(model)
    state_bias_keys = [key for key in state_dict if "bias" in key]

    with tempfile.TemporaryDirectory() as tmpdir:
        model.save_pretrained(tmpdir, safe_serialization=True)
        saved = load_file(os.path.join(tmpdir, "adapter_model.safetensors"))
        saved_bias_keys = [key for key in saved if "bias" in key]

        torch.manual_seed(123)
        reloaded = PeftModel.from_pretrained(Toy(), tmpdir, is_trainable=False)
        with torch.no_grad():
            diff = (reloaded(x) - ref).abs().max().item()

    print(f"bias={bias!r}")
    print(" trainable:", trainable)
    print(" get_peft_model_state_dict bias keys:", state_bias_keys)
    print(" saved bias keys:", saved_bias_keys)
    print(f" roundtrip max diff: {diff:.6f}")


for bias in ["none", "lora_only", "all"]:
    run(bias)
Observed output:
bias='none'
trainable: ['base_model.model.proj.lora_A.default.weight', 'base_model.model.proj.lora_B.default.weight']
get_peft_model_state_dict bias keys: []
saved bias keys: []
roundtrip max diff: 0.000000
bias='lora_only'
trainable: ['base_model.model.proj.base_layer.bias', 'base_model.model.proj.lora_A.default.weight', 'base_model.model.proj.lora_B.default.weight']
get_peft_model_state_dict bias keys: []
saved bias keys: []
roundtrip max diff: 1.215291
bias='all'
trainable: ['base_model.model.proj.base_layer.bias', 'base_model.model.proj.lora_A.default.weight', 'base_model.model.proj.lora_B.default.weight']
get_peft_model_state_dict bias keys: ['base_model.model.proj.base_layer.bias']
saved bias keys: ['base_model.model.proj.base_layer.bias']
roundtrip max diff: 0.000000
The likely cause is this logic in peft/utils/save_and_load.py, which derives the bias key from each LoRA weight key:
bias_name = k.split("lora_")[0] + "bias"
For a key such as:
base_model.model.proj.lora_A.default.weight
this constructs:
base_model.model.proj.bias
but the actual trained bias key is:
base_model.model.proj.base_layer.bias
so the bias is never included in the adapter state dict.
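The mismatch can be checked in isolation with plain string operations. This is a minimal sketch: the `bias_name` expression is the line quoted above, while `wrapped_bias_name` is a hypothetical alternative added only for illustration. It is not PEFT code and not a proposed patch.

```python
# Key of a LoRA weight as it appears in the wrapped model.
lora_key = "base_model.model.proj.lora_A.default.weight"

# Expression quoted above from peft/utils/save_and_load.py.
bias_name = lora_key.split("lora_")[0] + "bias"

# Name under which the trained bias actually lives after wrapping.
actual_bias = "base_model.model.proj.base_layer.bias"

# Hypothetical alternative that accounts for the base_layer wrapper
# (illustration only, not taken from PEFT).
wrapped_bias_name = lora_key.split("lora_")[0] + "base_layer.bias"

print(bias_name)                         # base_model.model.proj.bias
print(bias_name == actual_bias)          # False
print(wrapped_bias_name == actual_bias)  # True
```

Because the constructed name never matches a real parameter name, the lookup for the bias finds nothing and the bias is skipped without any error.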
Expected behavior
When LoraConfig(bias="lora_only") marks a wrapped layer's base_layer.bias as trainable, that bias should be included by get_peft_model_state_dict() and saved by save_pretrained().
Reloading the saved adapter with PeftModel.from_pretrained() should reproduce the original PEFT model output, as it does with bias="all".
Alternatively, if exporting bias="lora_only" is not intended to be supported for the current tuner-layer structure, PEFT should warn or raise instead of silently dropping trained parameters.
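A check along these lines could surface the problem instead of dropping parameters silently. This is only a sketch under stated assumptions: `find_dropped_trainable` is a hypothetical helper, not a PEFT function, and it assumes that saved keys may omit the adapter name segment (for example `.default`). The `saved_keys` list below is an assumed example of what a `bias="lora_only"` export contains, since the observed output above lists only the bias keys.

```python
def find_dropped_trainable(trainable_names, saved_keys, adapter_name="default"):
    """Return trainable parameter names that have no matching saved key."""
    saved = set(saved_keys)
    dropped = []
    for name in trainable_names:
        # Assumption: the saved key may have the adapter name segment removed.
        stripped = name.replace(f".{adapter_name}", "")
        if name not in saved and stripped not in saved:
            dropped.append(name)
    return dropped


# Trainable names as reported for bias="lora_only" in the observed output.
trainable = [
    "base_model.model.proj.base_layer.bias",
    "base_model.model.proj.lora_A.default.weight",
    "base_model.model.proj.lora_B.default.weight",
]

# Assumed saved keys: LoRA weights only, adapter name stripped.
saved_keys = [
    "base_model.model.proj.lora_A.weight",
    "base_model.model.proj.lora_B.weight",
]

dropped = find_dropped_trainable(trainable, saved_keys)
print(dropped)  # ['base_model.model.proj.base_layer.bias']
```

A non-empty result would be the point at which a warning or an error could be emitted before the adapter is written to disk.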