LoraConfig(bias="lora_only") trains base_layer.bias but save_pretrained/get_peft_model_state_dict drops it

### System Info

Reproduced with:

- peft: 0.19.1
- accelerate: 1.13.0
- transformers: 5.10.2
- torch: 2.12.0+cu132
- safetensors: 0.7.0
- Python: 3.10.20
- Platform: Linux / WSL2, x86_64
- CUDA available: yes, CUDA 13.2

I also reproduced this against PEFT main from source:
- commit: aa2b673629adf5c2340ea835604da73bc747212a
- version reported: 0.19.2.dev0

### Who can help?

@BenjaminBossan  @githubnemo 

### Reproduction

`LoraConfig(bias="lora_only")` correctly marks the wrapped base layer bias as trainable, but `get_peft_model_state_dict()` and `save_pretrained()` do not include that trained bias. Reloading the adapter therefore does not reproduce the trained model output.

Minimal reproducer:

```python
import os
import tempfile

import torch
from torch import nn
from peft import LoraConfig, get_peft_model, PeftModel
from peft.utils import get_peft_model_state_dict
from safetensors.torch import load_file


class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 3, bias=True)

    def forward(self, x):
        return self.proj(x)


x = torch.randn(2, 4)


def run(bias):
    torch.manual_seed(123)
    model = get_peft_model(
        Toy(),
        LoraConfig(
            r=2,
            lora_alpha=2,
            target_modules=["proj"],
            bias=bias,
        ),
    )

    # Simulate training by changing all trainable tensors.
    torch.manual_seed(456)
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.requires_grad:
                param.add_(torch.randn_like(param) * 0.5)

    trainable = [name for name, param in model.named_parameters() if param.requires_grad]

    with torch.no_grad():
        ref = model(x).detach().clone()

    state_dict = get_peft_model_state_dict(model)
    state_bias_keys = [key for key in state_dict if "bias" in key]

    with tempfile.TemporaryDirectory() as tmpdir:
        model.save_pretrained(tmpdir, safe_serialization=True)

        saved = load_file(os.path.join(tmpdir, "adapter_model.safetensors"))
        saved_bias_keys = [key for key in saved if "bias" in key]

        torch.manual_seed(123)
        reloaded = PeftModel.from_pretrained(Toy(), tmpdir, is_trainable=False)

        with torch.no_grad():
            diff = (reloaded(x) - ref).abs().max().item()

    print(f"bias={bias!r}")
    print("  trainable:", trainable)
    print("  get_peft_model_state_dict bias keys:", state_bias_keys)
    print("  saved bias keys:", saved_bias_keys)
    print(f"  roundtrip max diff: {diff:.6f}")


for bias in ["none", "lora_only", "all"]:
    run(bias)
```

Observed output:

```text
bias='none'
  trainable: ['base_model.model.proj.lora_A.default.weight', 'base_model.model.proj.lora_B.default.weight']
  get_peft_model_state_dict bias keys: []
  saved bias keys: []
  roundtrip max diff: 0.000000

bias='lora_only'
  trainable: ['base_model.model.proj.base_layer.bias', 'base_model.model.proj.lora_A.default.weight', 'base_model.model.proj.lora_B.default.weight']
  get_peft_model_state_dict bias keys: []
  saved bias keys: []
  roundtrip max diff: 1.215291

bias='all'
  trainable: ['base_model.model.proj.base_layer.bias', 'base_model.model.proj.lora_A.default.weight', 'base_model.model.proj.lora_B.default.weight']
  get_peft_model_state_dict bias keys: ['base_model.model.proj.base_layer.bias']
  saved bias keys: ['base_model.model.proj.base_layer.bias']
  roundtrip max diff: 0.000000
```

The likely cause is this logic in `peft/utils/save_and_load.py`, which derives the bias key from each LoRA key:

```python
bias_name = k.split("lora_")[0] + "bias"
```

For a key such as:

```text
base_model.model.proj.lora_A.default.weight
```

this constructs:

```text
base_model.model.proj.bias
```

but the actual trained bias key is:

```text
base_model.model.proj.base_layer.bias
```

The constructed key does not match any key in the model's state dict, so the trained bias is never included in the adapter state dict.
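To make the mismatch concrete, here is a minimal sketch that uses plain strings only (no PEFT import). The key names are copied from the `lora_only` output above. `derived_bias_name` repeats the quoted expression, while `candidate_bias_names` is a hypothetical helper of my own, not PEFT code, illustrating one possible direction for a fix: also checking the `base_layer.bias` location.

```python
# Parameter names taken from the reproducer's 'lora_only' output.
state_dict_keys = {
    "base_model.model.proj.base_layer.bias",
    "base_model.model.proj.lora_A.default.weight",
    "base_model.model.proj.lora_B.default.weight",
}


def derived_bias_name(lora_key):
    # Same expression as the line quoted from save_and_load.py.
    return lora_key.split("lora_")[0] + "bias"


def candidate_bias_names(lora_key):
    # Hypothetical (not PEFT API): try both the legacy location and
    # the base_layer location of the wrapped layer's bias.
    prefix = lora_key.split("lora_")[0]
    return [prefix + "bias", prefix + "base_layer.bias"]


lora_key = "base_model.model.proj.lora_A.default.weight"

current = derived_bias_name(lora_key)
found = [name for name in candidate_bias_names(lora_key) if name in state_dict_keys]

print("derived key:", current)
print("derived key exists:", current in state_dict_keys)
print("candidates that exist:", found)
```

Whether the real fix should look up both locations or only `base_layer.bias` is for the maintainers to decide; this only shows that the currently derived key cannot match.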

### Expected behavior

When `LoraConfig(bias="lora_only")` marks a wrapped layer's `base_layer.bias` as trainable, that bias should be included by `get_peft_model_state_dict()` and saved by `save_pretrained()`.

Reloading the saved adapter with `PeftModel.from_pretrained()` should reproduce the original PEFT model output, as it does with `bias="all"`.

Alternatively, if exporting `bias="lora_only"` is not intended to be supported for the current tuner-layer structure, PEFT should warn or raise instead of silently dropping trained parameters.
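As a rough illustration of the warn-or-raise alternative, the sketch below flags trainable bias parameters that are absent from the saved keys. `find_dropped_bias_keys` is a hypothetical helper, not an existing PEFT function, and it operates on name lists like the ones printed by the reproducer rather than on a real model.

```python
import warnings


def find_dropped_bias_keys(trainable_names, saved_keys):
    # Hypothetical check: trainable bias parameters missing from the
    # keys that were actually written to the adapter file.
    saved = set(saved_keys)
    return [
        name
        for name in trainable_names
        if name.endswith(".bias") and name not in saved
    ]


# Values copied from the 'lora_only' run in the observed output.
trainable = [
    "base_model.model.proj.base_layer.bias",
    "base_model.model.proj.lora_A.default.weight",
    "base_model.model.proj.lora_B.default.weight",
]
saved_bias_keys = []

dropped = find_dropped_bias_keys(trainable, saved_bias_keys)
if dropped:
    warnings.warn(f"Trainable bias parameters not saved: {dropped}")

# Values copied from the 'all' run: the bias is saved, nothing is flagged.
dropped_all = find_dropped_bias_keys(
    trainable, ["base_model.model.proj.base_layer.bias"]
)
```

The check only compares bias names, because those keys do not contain the adapter name and so appear identically in `named_parameters()` and in the saved file in the output above.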
