
Conversation

@littlebullGit (Contributor) commented Nov 27, 2025

What does this PR do?

  • Adds a class-level fallback allow_zero_length_dataloader_with_multiple_devices so custom LightningDataModule subclasses remain compatible even if they skip super().__init__() (a minimal sketch follows this list).
  • Adds regression coverage ensuring Trainer loops can validate with such datamodules.
  • Updates the changelog with the fix summary.
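
For context, the core of the fix is a one-line class-level default. A minimal sketch, with the real base classes and the rest of the class body elided (the actual diff may differ):

class LightningDataModule:  # real base classes elided for brevity
    # Class-level fallback: readable even when a subclass never calls
    # super().__init__(); instances that do call it simply shadow this default.
    allow_zero_length_dataloader_with_multiple_devices: bool = False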

Fixes #21358

Before submitting
  • Was this discussed/agreed via a GitHub issue? ✅ (Missing property 'allow_zero_length_dataloader_with_multiple_devices' of LightningDataModule #21358)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (not needed)
  • Did you write any new necessary tests? (new regression test)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request? (none)
  • Did you update the CHANGELOG? (added “Fixed” entry)

📚 Documentation preview 📚: https://pytorch-lightning--21383.org.readthedocs.build/en/21383/

github-actions bot added labels pl (Generic label for PyTorch Lightning package) and has conflicts on Nov 27, 2025
codecov bot commented Nov 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82%. Comparing base (b09e96e) to head (300c41e).
⚠️ Report is 2 commits behind head on master.
✅ All tests successful. No failed tests found.

❗ There is a different number of reports uploaded between BASE (b09e96e) and HEAD (300c41e):

HEAD has 642 fewer uploads than BASE:

Flag                BASE (b09e96e)   HEAD (300c41e)
cpu                            176               30
lightning_fabric                44                0
pytest                          88                0
python3.12                      53                9
python3.12.7                    52                9
lightning                       87               15
python3.11                      36                6
python3.10                      17                3
python                          18                3
pytorch_lightning               45               15
pytorch2.7                       9                3
pytest-full                     88               30
pytorch2.2.2                     9                3
pytorch2.6                       9                3
pytorch2.4.1                     8                3
pytorch2.5.1                     8                3
pytorch2.8                       9                3
pytorch2.1                      18                6
pytorch2.3                       9                3
pytorch2.9                       9                3
Additional details and impacted files
@@            Coverage Diff            @@
##           master   #21383     +/-   ##
=========================================
- Coverage      89%      82%     -7%     
=========================================
  Files         269      266      -3     
  Lines       22063    22009     -54     
=========================================
- Hits        19737    18072   -1665     
- Misses       2326     3937   +1611     

@SkafteNicki (Collaborator) left a comment:
Is this really necessary? Having super().__init__() in the __init__ of child classes is a requirement:

class MyDataModule(L.LightningDataModule):
    def __init__(self):
        super().__init__()  # ← REQUIRED!
        # ... rest of initialization

@littlebullGit (Contributor, Author) replied:

Is this really necessary? Having super().__init__() in the __init__ of child classes is a requirement:

class MyDataModule(L.LightningDataModule):
    def __init__(self):
        super().__init__()  # ← REQUIRED!
        # ... rest of initialization

@SkafteNicki

You’re right that calling super().__init__() in a LightningDataModule is the recommended pattern, and we should keep encouraging it (it’s what we show in the docs and examples). But this fix is still needed for a few reasons:

  • Not enforced today
    The framework has never required super().__init__(). If a user skips it, nothing fails at instantiation. Their DataModule builds fine and only breaks later when the Trainer expects certain attributes to exist.

  • How the bug shows up
    The error appears deep in evaluation_loop.py at training/validation time as
    AttributeError: 'MyDataModule' object has no attribute 'allow_zero_length_dataloader_with_multiple_devices'
    with no clear hint that the missing super().__init__() call is the real problem. This is exactly how the bug was reported (a minimal reproduction follows this list).

  • Backwards compatibility
    Before v2.5.0, users could skip super().__init__() (on purpose or by accident) and things still worked. Once allow_zero_length_dataloader_with_multiple_devices started being used in the Trainer loops, that previously-working code began to break.

  • Consistency with LightningModule
    LightningModule already defines similar attributes at the class level as defensive defaults. Doing the same in LightningDataModule keeps behavior consistent and avoids surprising AttributeErrors.

  • Small, safe change
    Adding a class-level default doesn’t affect users who do call super().__init__() correctly. It only gives a safe fallback for older subclasses.
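
As a minimal reproduction of the failure mode above (hypothetical subclass; this shows the behavior before this fix):

import lightning as L

class BrokenDataModule(L.LightningDataModule):
    def __init__(self):
        # super().__init__() intentionally omitted -- this is the user error
        self.batch_size = 32

dm = BrokenDataModule()  # instantiation succeeds, nothing warns
# Much later, inside the Trainer's evaluation loop, the attribute is read:
dm.allow_zero_length_dataloader_with_multiple_devices
# AttributeError: 'BrokenDataModule' object has no attribute
# 'allow_zero_length_dataloader_with_multiple_devices'

With the class-level default in place, the same read falls back to False instead of raising.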

We could instead add a strict runtime check that says “You must call super().__init__() in your LightningDataModule subclass.” But that would still break existing code. The class-level fallback keeps old behavior working, avoids confusing deep-stack errors, and still lets us recommend super().__init__() as the preferred pattern going forward.
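
For reference, such a strict check could look roughly like this (a hypothetical helper, e.g. invoked when the Trainer attaches the datamodule; not what this PR implements):

def _verify_datamodule_init(datamodule) -> None:
    # Attributes that only exist once LightningDataModule.__init__ has run.
    required = (
        "prepare_data_per_node",
        "allow_zero_length_dataloader_with_multiple_devices",
    )
    missing = [name for name in required if not hasattr(datamodule, name)]
    if missing:
        raise TypeError(
            f"`{type(datamodule).__name__}.__init__` must call `super().__init__()`; "
            f"missing attributes: {missing}"
        )

This would surface the user error early, but it would also hard-break existing subclasses, which is why the fallback was chosen instead.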

@SkafteNicki (Collaborator) replied:

@littlebullGit while I can agree with some of your points, some of them are also not true:

  • For backwards compatibility: there is another attribute that is initialized in the DataHooks.__init__ method,
    self.prepare_data_per_node: bool = True

    which will also lead to problems down the line if super().__init__() is not called. So the backwards-compatibility argument does not hold, because skipping the call has been an anti-pattern for as long as the datamodule has existed.
  • For consistency with LightningModule: if you dig into some of its parent classes, such as DataHooks, Module and _DeviceDtypeModuleMixin, you will see the same problem: if super().__init__() is not called, then some attributes are not defined and this will lead to problems somewhere in the code. Therefore it is a requirement!

This PR only introduces a small change, therefore I am somewhat okay with approving it. That said, this is not a bug but a user error. The real solution is either enforcing that super().__init__() is called or making our documentation even clearer on this.

@littlebullGit (Contributor, Author) replied:

@littlebullGit while I can agree with some of your points, some of them are also not true:

  • For backwards compatibility: there is another attribute that is initialized in the DataHooks.__init__ method,

    self.prepare_data_per_node: bool = True

    which will also lead to problems down the line if super().__init__() is not called. So the backwards-compatibility argument does not hold, because skipping the call has been an anti-pattern for as long as the datamodule has existed.

  • For consistency with LightningModule: if you dig into some of its parent classes, such as DataHooks, Module and _DeviceDtypeModuleMixin, you will see the same problem: if super().__init__() is not called, then some attributes are not defined and this will lead to problems somewhere in the code. Therefore it is a requirement!

This PR only introduces a small change, therefore I am somewhat okay with approving it. That said, this is not a bug but a user error. The real solution is either enforcing that super().__init__() is called or making our documentation even clearer on this.

Agreed that this is likely a user error. I also commented on the issue:
#21358 (comment)

