
Conversation

@littlebullGit (Contributor) commented Nov 27, 2025

What does this PR do?

  • Adds a class-level fallback allow_zero_length_dataloader_with_multiple_devices so custom LightningDataModule subclasses remain compatible even if they skip super().__init__() (a minimal sketch follows this list).
  • Adds regression coverage ensuring Trainer loops can validate with such datamodules.
  • Updates the changelog with the fix summary.
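
For context, the core of the fix is a one-line class-level default. A minimal sketch, with the real base classes and the rest of the class body elided (the actual diff may differ):

class LightningDataModule:  # real base classes elided for brevity
    # Class-level fallback: readable even when a subclass never calls
    # super().__init__(); instances that do call it simply shadow this default.
    allow_zero_length_dataloader_with_multiple_devices: bool = False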

Fixes #21358

Before submitting
  • Was this discussed/agreed via a GitHub issue? ✅ (Missing property 'allow_zero_length_dataloader_with_multiple_devices' of LightningDataModule #21358)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (not needed)
  • Did you write any new necessary tests? (new regression test)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request? (none)
  • Did you update the CHANGELOG? (added “Fixed” entry)

📚 Documentation preview 📚: https://pytorch-lightning--21383.org.readthedocs.build/en/21383/

github-actions bot added labels pl (Generic label for PyTorch Lightning package) and has conflicts on Nov 27, 2025
codecov bot commented Nov 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82%. Comparing base (b09e96e) to head (300c41e).
⚠️ Report is 2 commits behind head on master.
✅ All tests successful. No failed tests found.

❗ There is a different number of reports uploaded between BASE (b09e96e) and HEAD (300c41e):

HEAD has 642 fewer uploads than BASE:

Flag                BASE (b09e96e)   HEAD (300c41e)
cpu                            176               30
lightning_fabric                44                0
pytest                          88                0
python3.12                      53                9
python3.12.7                    52                9
lightning                       87               15
python3.11                      36                6
python3.10                      17                3
python                          18                3
pytorch_lightning               45               15
pytorch2.7                       9                3
pytest-full                     88               30
pytorch2.2.2                     9                3
pytorch2.6                       9                3
pytorch2.4.1                     8                3
pytorch2.5.1                     8                3
pytorch2.8                       9                3
pytorch2.1                      18                6
pytorch2.3                       9                3
pytorch2.9                       9                3
Additional details and impacted files
@@            Coverage Diff            @@
##           master   #21383     +/-   ##
=========================================
- Coverage      89%      82%     -7%     
=========================================
  Files         269      266      -3     
  Lines       22063    22009     -54     
=========================================
- Hits        19737    18072   -1665     
- Misses       2326     3937   +1611     

@SkafteNicki (Collaborator) left a comment:
Is this really necessary? Having super().__init__() in the __init__ of child classes is a requirement:

class MyDataModule(L.LightningDataModule):
    def __init__(self):
        super().__init__()  # ← REQUIRED!
        # ... rest of initialization

@littlebullGit (Contributor, Author) replied:

Is this really necessary? Having super().__init__() in the __init__ of child classes is a requirement:

class MyDataModule(L.LightningDataModule):
    def __init__(self):
        super().__init__()  # ← REQUIRED!
        # ... rest of initialization

@SkafteNicki

You’re right that calling super().__init__() in a LightningDataModule is the recommended pattern, and we should keep encouraging it (it’s what we show in the docs and examples). But this fix is still needed for a few reasons:

  • Not enforced today
    The framework has never required super().__init__(). If a user skips it, nothing fails at instantiation. Their DataModule builds fine and only breaks later when the Trainer expects certain attributes to exist.

  • How the bug shows up
    The error appears deep in evaluation_loop.py at training/validation time as
    AttributeError: 'MyDataModule' object has no attribute 'allow_zero_length_dataloader_with_multiple_devices'
    with no clear hint that the missing super().__init__() call is the real problem. This is exactly how the bug was reported (a minimal reproduction follows this list).

  • Backwards compatibility
    Before v2.5.0, users could skip super().__init__() (on purpose or by accident) and things still worked. Once allow_zero_length_dataloader_with_multiple_devices started being used in the Trainer loops, that previously-working code began to break.

  • Consistency with LightningModule
    LightningModule already defines similar attributes at the class level as defensive defaults. Doing the same in LightningDataModule keeps behavior consistent and avoids surprising AttributeErrors.

  • Small, safe change
    Adding a class-level default doesn’t affect users who do call super().__init__() correctly. It only gives a safe fallback for older subclasses.
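
As a minimal reproduction of the failure mode above (hypothetical subclass; this shows the behavior before this fix):

import lightning as L

class BrokenDataModule(L.LightningDataModule):
    def __init__(self):
        # super().__init__() intentionally omitted -- this is the user error
        self.batch_size = 32

dm = BrokenDataModule()  # instantiation succeeds, nothing warns
# Much later, inside the Trainer's evaluation loop, the attribute is read:
dm.allow_zero_length_dataloader_with_multiple_devices
# AttributeError: 'BrokenDataModule' object has no attribute
# 'allow_zero_length_dataloader_with_multiple_devices'

With the class-level default in place, the same read falls back to False instead of raising.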

We could instead add a strict runtime check that says “You must call super().__init__() in your LightningDataModule subclass.” But that would still break existing code. The class-level fallback keeps old behavior working, avoids confusing deep-stack errors, and still lets us recommend super().__init__() as the preferred pattern going forward.
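
For reference, such a strict check could look roughly like this (a hypothetical helper, e.g. invoked when the Trainer attaches the datamodule; not what this PR implements):

def _verify_datamodule_init(datamodule) -> None:
    # Attributes that only exist once LightningDataModule.__init__ has run.
    required = (
        "prepare_data_per_node",
        "allow_zero_length_dataloader_with_multiple_devices",
    )
    missing = [name for name in required if not hasattr(datamodule, name)]
    if missing:
        raise TypeError(
            f"`{type(datamodule).__name__}.__init__` must call `super().__init__()`; "
            f"missing attributes: {missing}"
        )

This would surface the user error early, but it would also hard-break existing subclasses, which is why the fallback was chosen instead.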

@SkafteNicki (Collaborator) replied:

@littlebullGit while I can agree with some of your points, some of them are also not true:

  • For backwards compatibility: there is another attribute that is initialized in the DataHooks.__init__ method,
    self.prepare_data_per_node: bool = True

    which will also lead to problems down the line if super().__init__() is not called. So the backwards-compatibility argument does not hold, because skipping the call has been an anti-pattern for as long as the datamodule has existed.
  • For consistency with LightningModule: if you dig into some of its parent classes, such as DataHooks, Module and _DeviceDtypeModuleMixin, you will see the same problem: if super().__init__() is not called, then some attributes are not defined and this will lead to problems somewhere in the code. Therefore it is a requirement!

This PR only introduces a small change, therefore I am somewhat okay with approving it. That said, this is not a bug but a user error. The real solution is either enforcing that super().__init__() is called or making our documentation even clearer on this.

@littlebullGit (Contributor, Author) replied:

@littlebullGit while I can agree with some of your points, some of them are also not true:

  • For backwards compatibility: there is another attribute that is initialized in the DataHooks.__init__ method,

    self.prepare_data_per_node: bool = True

    which will also lead to problems down the line if super().__init__() is not called. So the backwards-compatibility argument does not hold, because skipping the call has been an anti-pattern for as long as the datamodule has existed.

  • For consistency with LightningModule: if you dig into some of its parent classes, such as DataHooks, Module and _DeviceDtypeModuleMixin, you will see the same problem: if super().__init__() is not called, then some attributes are not defined and this will lead to problems somewhere in the code. Therefore it is a requirement!

This PR only introduces a small change, therefore I am somewhat okay with approving it. That said, this is not a bug but a user error. The real solution is either enforcing that super().__init__() is called or making our documentation even clearer on this.

Agreed that this is likely a user error. I also commented on the issue:
#21358 (comment)

