Skip to content

Conversation

@smfrench
Copy link
Owner

Replace close_cached_dir() calls under cfid_list_lock with a new close_cached_dir_locked() variant that uses kref_put() instead of kref_put_lock() to avoid recursive locking when dropping references.

While the existing code works if the refcount >= 2 invariant holds, this area has proven error-prone. Make deadlocks impossible and WARN on invariant violations.

Cc: [email protected]

Replace close_cached_dir() calls under cfid_list_lock with a new
close_cached_dir_locked() variant that uses kref_put() instead of
kref_put_lock() to avoid recursive locking when dropping references.

While the existing code works if the refcount >= 2 invariant holds,
this area has proven error-prone. Make deadlocks impossible and WARN
on invariant violations.

Cc: [email protected]
Signed-off-by: Henrique Carvalho <[email protected]>
Signed-off-by: Steve French <[email protected]>
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new close_cached_dir_locked() function to safely drop cached directory references while holding the cfid_list_lock spinlock. The change prevents potential recursive locking deadlocks by using kref_put() instead of kref_put_lock() when the lock is already held.

  • Adds close_cached_dir_locked() variant that expects cfid_list_lock to be held and validates refcount >= 2 before decrementing
  • Updates two call sites to use the new locked variant where the spinlock is already held
  • Adds comprehensive documentation and lockdep assertions to prevent misuse

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

* The release function will be called with cfid_list_lock held to remove the
* cached dirs from the list before any other thread can take another @cfid
* ref. Must not be called with cfid_list_lock held; use
* close_cached_dir_locked() called instead.
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation states "Must not be called with cfid_list_lock held; use close_cached_dir_locked() called instead." The word "called" appears to be redundant or incorrectly placed. It should be either "use close_cached_dir_locked() instead" or "call close_cached_dir_locked() instead".

Suggested change
* close_cached_dir_locked() called instead.
* close_cached_dir_locked() instead.

Copilot uses AI. Check for mistakes.
* two references. If that invariant is violated, WARNs and returns without
* dropping a reference; the final put must still go through
* close_cached_dir().
*
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The documentation could be clearer about the safety mechanism. While it mentions that calling with refcount < 2 would cause a deadlock, it should explicitly state that when refcount >= 2, calling kref_put() is safe because the refcount will remain >= 1 after decrement, ensuring smb2_close_cached_fid() (the release function) is not invoked while holding the lock.

Suggested change
*
*
* When refcount >= 2, calling kref_put() is safe because the refcount will remain
* >= 1 after decrement, ensuring smb2_close_cached_fid() (the release function)
* is not invoked while holding the lock.
*

Copilot uses AI. Check for mistakes.
@smfrench smfrench force-pushed the ksmbd-for-next branch 3 times, most recently from 664a0f9 to 851934d Compare December 3, 2025 22:10
@smfrench smfrench force-pushed the ksmbd-for-next branch 3 times, most recently from 6461f77 to 3d99347 Compare December 9, 2025 21:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants