-
Notifications
You must be signed in to change notification settings - Fork 195
Module graceful shutdown support #667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Module graceful shutdown support #667
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
sonic-chassisd/scripts/chassisd
Outdated
| module.set_module_state_transition(v2, module_name, "startup") | ||
| except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is the module state transition from here cleared?
sonic-chassisd/scripts/chassisd
Outdated
| wants_up = (admin_state != 'down') | ||
| not_online = (str(operational_state).lower() | ||
| != str(ModuleBase.MODULE_STATUS_ONLINE).lower()) | ||
| if wants_up and not_online and v2: | ||
| try: | ||
| module.set_module_state_transition(v2, module_name, "startup") | ||
| self.log_info(f"Marked startup transition for {module_name} at boot") | ||
| except Exception as e: | ||
| self.log_error(f"Failed to set startup transition for {module_name}: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this section, this function was only present to handle the case were the CONFIG_DB entry is not present, in which case dark mode is implied and we power off the DPUs
sonic-chassisd/scripts/chassisd
Outdated
| ModuleTransitionFlagHelper().set_transition_flag(module_name) | ||
| try_get(self.module_updater.chassis.get_module(module_index).set_admin_state, admin_state, default=False) | ||
| # Only run pre-shutdown on DOWN path | ||
| try_get(module.module_pre_shutdown, default=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
module_pre_shutdown is supposed to be called if we need to power off the DPU, or if we need to stay in dark mode, as the exisitng code was present in the same format, please align
sonic-chassisd/scripts/chassisd
Outdated
| # Clear transition flag in STATE_DB via ModuleBase centralized API | ||
| try: | ||
| module_obj = self.chassis.get_module(module_index) | ||
| module_obj.clear_module_state_transition(self._state_v2, key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rameshraghupathy Clearing on operational state change is a problem, this can be intentional or unintentional, moreover, there is a transition from online to offline to online when we do a reboot, this will clear the transition during that state change, which is not the expected behavior
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
dfad0fc to
5b2c0f6
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
… format, swsscommon for DB connection etc
…hy/sonic-platform-daemons into graceful-shutdown # Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| self.log_warning(f"Failed to set shutdown transition for {key}") | ||
| except Exception as e: | ||
| self.log_error(f"Failed to set shutdown transition for {key}: {e}") | ||
| result = try_get(module.set_admin_state, admin_state, default=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we setting state transition again at line 267 and shouldn't we invoke set_admin_state_using_graceful_handler here?
If we are setting state transition in this code path where caller takes care of, we should be avoiding the set/clear in set_admin_state_using_graceful_handler(). also, we need to set the state before pre-shutdown to handle failure scenario
| self.log_warning(f"Failed to set startup transition for {key}") | ||
| except Exception as e: | ||
| self.log_error(f"Failed to set startup transition for {key}: {e}") | ||
| result = try_get(module.set_admin_state, admin_state, default=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above.
Provide support for SmartSwitch DPU module graceful shutdown.
Description:
Single source of truth for transitions
All components now use sonic_platform_base.module_base.ModuleBase helpers:
set_module_state_transition(db, name, transition_type)
clear_module_state_transition(db, name)
get_module_state_transition(db, name) -> dict
is_module_state_transition_timed_out(db, name, timeout_secs) -> bool
Eliminates duplicated logic and race-prone direct Redis writes.
Correct table everywhere
Standardized on CHASSIS_MODULE_TABLE (replaces CHASSIS_MODULE_INFO_TABLE).
HLD mismatch addressed in code (HLD fix tracked separately).
Ownership & lifecycle
The initiator of an operation (startup/shutdown/reboot) sets:
state_transition_in_progress=True
transition_type=<op>
transition_start_time=<utc-iso8601>
The platform (set_admin_state()) is responsible for clearing:
state_transition_in_progress=False
optionally transition_end_time=<epoch> (or similar end stamp).
CLI pre-clears only when a prior transition is timed out.
Timeouts & policy
Platform JSON path only: /usr/share/sonic/device/{plat}/platform.json; else constants.
Typical production values used:
startup: 180s, shutdown: 180s (≈ graceful_wait 60s + power 120s), reboot: 120s.
Graceful wait (e.g., waiting for “Graceful shutdown complete”) is a platform policy and implemented inside platform set_admin_state()—not in ModuleBase.
Boot behavior
chassisd on start:
Clears stale flags once (centralized sweep).
Runs set_initial_dpu_admin_state() which marks transitions via ModuleBase before calling platform set_admin_state().
Leaves clearing to the platform or to well-defined status transitions (ONLINE/OFFLINE) where appropriate.
gNOI shutdown daemon
Listens on CHASSIS_MODULE_TABLE and triggers only when:
state_transition_in_progress=True and transition_type=shutdown.
Never clears the flag (ownership stays with the platform).
Bounded RPC timeouts and robust Redis access (swsssdk/swsscommon).
CLI (config chassis modules …)
Uses ModuleBase APIs for all set/get/timeout checks.
If a previous transition is stuck, is_module_state_transition_timed_out() → auto-clear then proceed.
Sets transition at the start of startup/shutdown; platform clears on completion.
Fabric card flow retained; edits are surgical.
Redis robustness
Helpers handle both stacks (swsssdk/swsscommon); no hset(mapping=...) usage.
Consistent HGETALL/HSET paths; resilient to connector differences.
Race reduction & consistency
Centralized writes prevent multi-writer races.
All transition writes include transition_start_time; clears may add an end stamp.
Existing PCI/file-lock logic left intact; unrelated behavior unchanged.
Change scope
Minimal, targeted diffs.
No background tasks added, no broad refactors beyond transition handling.
Behavior changes are limited to making transition semantics correct and uniform across repos.
HLD: # 1991 sonic-net/SONiC#1991
sonic-platform-common: #567 sonic-net/sonic-platform-common#567
sonic-utilities: sonic-net/sonic-utilities#4031
sonic-platform-daemons: sonic-net/sonic-platform-daemons#667
How to verify it
Issue the "config chassis modules shutdown DPUx" command
Verify the DPU module is gracefully shut by checking the logs in /var/log/syslog on both NPU and DPU
Provide support for SmartSwitch DPU module graceful shutdown.
# Description:
* **Single source of truth for transitions**
* All components now use `sonic_platform_base.module_base.ModuleBase` helpers:
* `set_module_state_transition(db, name, transition_type)`
* `clear_module_state_transition(db, name)`
* `get_module_state_transition(db, name) -> dict`
* `is_module_state_transition_timed_out(db, name, timeout_secs) -> bool`
* Eliminates duplicated logic and race-prone direct Redis writes.
* **Correct table everywhere**
* Standardized on **`CHASSIS_MODULE_TABLE`** (replaces `CHASSIS_MODULE_INFO_TABLE`).
* HLD mismatch addressed in code (HLD fix tracked separately).
* **Ownership & lifecycle**
* The **initiator** of an operation (`startup`/`shutdown`/`reboot`) sets:
* `state_transition_in_progress=True`
* `transition_type=<op>`
* `transition_start_time=<utc-iso8601>`
* The **platform** (`set_admin_state()`) is responsible for clearing:
* `state_transition_in_progress=False`
* optionally `transition_end_time=<epoch>` (or similar end stamp).
* CLI pre-clears only when a prior transition is **timed out**.
* **Timeouts & policy**
* Platform JSON path only: `/usr/share/sonic/device/{plat}/platform.json`; else **constants**.
* Typical production values used:
* `startup: 180s`, `shutdown: 180s` (≈ `graceful_wait 60s + power 120s`), `reboot: 120s`.
* **Graceful wait** (e.g., waiting for “Graceful shutdown complete”) is a **platform policy** and implemented inside platform `set_admin_state()`—not in ModuleBase.
* **Boot behavior**
* `chassisd` on start:
1. **Clears stale flags once** (centralized sweep).
2. Runs `set_initial_dpu_admin_state()` which **marks transitions** via ModuleBase before calling platform `set_admin_state()`.
3. Leaves clearing to the platform or to well-defined status transitions (ONLINE/OFFLINE) where appropriate.
* **gNOI shutdown daemon**
* Listens on **`CHASSIS_MODULE_TABLE`** and triggers only when:
* `state_transition_in_progress=True` **and** `transition_type=shutdown`.
* Never clears the flag (ownership stays with the platform).
* Bounded RPC timeouts and robust Redis access (swsssdk/swsscommon).
* **CLI (`config chassis modules …`)**
* Uses ModuleBase APIs for all set/get/timeout checks.
* If a previous transition is stuck, `is_module_state_transition_timed_out()` → auto-clear then proceed.
* Sets transition at the start of `startup`/`shutdown`; platform clears on completion.
* Fabric card flow retained; edits are surgical.
* **Redis robustness**
* Helpers handle both stacks (swsssdk/swsscommon); no `hset(mapping=...)` usage.
* Consistent HGETALL/HSET paths; resilient to connector differences.
* **Race reduction & consistency**
* Centralized writes prevent multi-writer races.
* All transition writes include `transition_start_time`; clears may add an end stamp.
* Existing PCI/file-lock logic left intact; unrelated behavior unchanged.
* **Change scope**
* Minimal, targeted diffs.
* No background tasks added, no broad refactors beyond transition handling.
* Behavior changes are limited to making transition semantics correct and uniform across repos.
HLD: # 1991 sonic-net/SONiC#1991
sonic-platform-common: #567 sonic-net/sonic-platform-common#567
sonic-utilities: sonic-net/sonic-utilities#4031
sonic-platform-daemons: sonic-net/sonic-platform-daemons#667
How to verify it
Issue the "config chassis modules shutdown DPUx" command
Verify the DPU module is gracefully shut by checking the logs in /var/log/syslog on both NPU and DPU
…daemons
<!-- Provide a general summary of your changes in the Title above -->
#### Description
<!--
Describe your changes in detail
-->
HLD: https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/graceful-shutdown/graceful-shutdown.md
These changes build upon enhancements in [`sonic-platform-daemons#667`](sonic-net#667)
This PR introduces **graceful shutdown and startup orchestration** across SONiC platform daemons to ensure safe DPU and peripheral module transitions during reboot or administrative state changes.
Key updates include:
- Integration of `ModuleBase` lifecycle methods (`module_pre_shutdown`, `module_post_startup`, and `set_admin_state_gracefully`) into platform daemons.
- Move graceful handling of PCIe detach/reattach and sensor reload sequences into set_admin_state_gracefully.
- State tracking in `CHASSIS_MODULE_TABLE` via `STATE_DB` to synchronize transition state across processes.
- File-based operation locks to prevent concurrent access to shared hardware resources.
#### Motivation and Context
<!--
Why is this change required? What problem does it solve?
If this pull request closes/resolves an open Issue, make sure you
include the text "fixes #xxxx", "closes #xxxx" or "resolves #xxxx" here
-->
Platform daemons currently perform shutdown and startup independently, leading to:
- Race conditions during DPU detachment.
- Inconsistent Redis state across PMON daemons.
- Uncoordinated sensor and PCIe transitions during reboot.
This change introduces a unified **graceful shutdown framework** for SmartSwitch modules.
It ensures predictable module transitions, preserves hardware health, and supports orchestrated restarts without transient hardware errors.
#### How Has This Been Tested?
<!--
Please describe in detail how you tested your changes.
Include details of your testing environment, and the tests you ran to
see how your change affects other areas of the code, etc.
-->
Testing performed on both **DPU-enabled (SmartSwitch)**.
**Functional validation**
- Verified end-to-end reboot flow with DPU detach/reattach sequence.
- PCIe state (`detaching/attaching`) reflected in `STATE_DB`.
- `pcied` daemon logs confirm ordered detach before reboot and reattach after startup.
- Confirmed no stale Redis entries or orphaned locks post-reboot.
**Unit tests executed**
- tests/test_DaemonPcied.py
- tests/test_chassisd_graceful.py
Coverage includes:
- Transition flag handling
- Timeout behavior
- DB write/read operations
- Graceful admin state flow
**Manual validation**
#### Additional Information (Optional)
Provide support for SmartSwitch DPU module graceful shutdown.
# Description:
* **Single source of truth for transitions**
failure_prs.log skip_prs.log All components now use `sonic_platform_base.module_base.ModuleBase` helpers:
failure_prs.log skip_prs.log `set_module_state_transition(db, name, transition_type)`
failure_prs.log skip_prs.log `clear_module_state_transition(db, name)`
failure_prs.log skip_prs.log `get_module_state_transition(db, name) -> dict`
failure_prs.log skip_prs.log `is_module_state_transition_timed_out(db, name, timeout_secs) -> bool`
failure_prs.log skip_prs.log Eliminates duplicated logic and race-prone direct Redis writes.
* **Correct table everywhere**
failure_prs.log skip_prs.log Standardized on **`CHASSIS_MODULE_TABLE`** (replaces `CHASSIS_MODULE_INFO_TABLE`).
failure_prs.log skip_prs.log HLD mismatch addressed in code (HLD fix tracked separately).
* **Ownership & lifecycle**
failure_prs.log skip_prs.log The **initiator** of an operation (`startup`/`shutdown`/`reboot`) sets:
failure_prs.log skip_prs.log `state_transition_in_progress=True`
failure_prs.log skip_prs.log `transition_type=<op>`
failure_prs.log skip_prs.log `transition_start_time=<utc-iso8601>`
failure_prs.log skip_prs.log The **platform** (`set_admin_state()`) is responsible for clearing:
failure_prs.log skip_prs.log `state_transition_in_progress=False`
failure_prs.log skip_prs.log optionally `transition_end_time=<epoch>` (or similar end stamp).
failure_prs.log skip_prs.log CLI pre-clears only when a prior transition is **timed out**.
* **Timeouts & policy**
failure_prs.log skip_prs.log Platform JSON path only: `/usr/share/sonic/device/{plat}/platform.json`; else **constants**.
failure_prs.log skip_prs.log Typical production values used:
failure_prs.log skip_prs.log `startup: 180s`, `shutdown: 180s` (≈ `graceful_wait 60s + power 120s`), `reboot: 120s`.
failure_prs.log skip_prs.log **Graceful wait** (e.g., waiting for “Graceful shutdown complete”) is a **platform policy** and implemented inside platform `set_admin_state()`—not in ModuleBase.
* **Boot behavior**
failure_prs.log skip_prs.log `chassisd` on start:
1. **Clears stale flags once** (centralized sweep).
2. Runs `set_initial_dpu_admin_state()` which **marks transitions** via ModuleBase before calling platform `set_admin_state()`.
3. Leaves clearing to the platform or to well-defined status transitions (ONLINE/OFFLINE) where appropriate.
* **gNOI shutdown daemon**
failure_prs.log skip_prs.log Listens on **`CHASSIS_MODULE_TABLE`** and triggers only when:
failure_prs.log skip_prs.log `state_transition_in_progress=True` **and** `transition_type=shutdown`.
failure_prs.log skip_prs.log Never clears the flag (ownership stays with the platform).
failure_prs.log skip_prs.log Bounded RPC timeouts and robust Redis access (swsssdk/swsscommon).
* **CLI (`config chassis modules …`)**
failure_prs.log skip_prs.log Uses ModuleBase APIs for all set/get/timeout checks.
failure_prs.log skip_prs.log If a previous transition is stuck, `is_module_state_transition_timed_out()` → auto-clear then proceed.
failure_prs.log skip_prs.log Sets transition at the start of `startup`/`shutdown`; platform clears on completion.
failure_prs.log skip_prs.log Fabric card flow retained; edits are surgical.
* **Redis robustness**
failure_prs.log skip_prs.log Helpers handle both stacks (swsssdk/swsscommon); no `hset(mapping=...)` usage.
failure_prs.log skip_prs.log Consistent HGETALL/HSET paths; resilient to connector differences.
* **Race reduction & consistency**
failure_prs.log skip_prs.log Centralized writes prevent multi-writer races.
failure_prs.log skip_prs.log All transition writes include `transition_start_time`; clears may add an end stamp.
failure_prs.log skip_prs.log Existing PCI/file-lock logic left intact; unrelated behavior unchanged.
* **Change scope**
failure_prs.log skip_prs.log Minimal, targeted diffs.
failure_prs.log skip_prs.log No background tasks added, no broad refactors beyond transition handling.
failure_prs.log skip_prs.log Behavior changes are limited to making transition semantics correct and uniform across repos.
HLD: # 1991 sonic-net/SONiC#1991
sonic-platform-common: #567 sonic-net/sonic-platform-common#567
sonic-utilities: sonic-net/sonic-utilities#4031
sonic-platform-daemons: sonic-net/sonic-platform-daemons#667
How to verify it
Issue the "config chassis modules shutdown DPUx" command
Verify the DPU module is gracefully shut by checking the logs in /var/log/syslog on both NPU and DPU
Provide support for SmartSwitch DPU module graceful shutdown.
Description
Single source of truth for transitions
All components now use
sonic_platform_base.module_base.ModuleBasehelpers:set_module_state_transition(db, name, transition_type)clear_module_state_transition(db, name)get_module_state_transition(db, name) -> dictis_module_state_transition_timed_out(db, name, timeout_secs) -> boolEliminates duplicated logic and race-prone direct Redis writes.
Correct table everywhere
CHASSIS_MODULE_TABLE(replacesCHASSIS_MODULE_INFO_TABLE).Ownership & lifecycle
The initiator of an operation (
startup/shutdown/reboot) sets:state_transition_in_progress=Truetransition_type=<op>transition_start_time=<utc-iso8601>The platform (
set_admin_state()) is responsible for clearing:state_transition_in_progress=Falsetransition_end_time=<epoch>(or similar end stamp).CLI pre-clears only when a prior transition is timed out.
Timeouts & policy
Platform JSON path only:
/usr/share/sonic/device/{plat}/platform.json; else constants.Typical production values used:
startup: 180s,shutdown: 180s(≈graceful_wait 60s + power 120s),reboot: 120s.Graceful wait (e.g., waiting for “Graceful shutdown complete”) is a platform policy and implemented inside platform
set_admin_state()—not in ModuleBase.Boot behavior
chassisdon start:set_initial_dpu_admin_state()which marks transitions via ModuleBase before calling platformset_admin_state().gNOI shutdown daemon
Listens on
CHASSIS_MODULE_TABLEand triggers only when:state_transition_in_progress=Trueandtransition_type=shutdown.Never clears the flag (ownership stays with the platform).
Bounded RPC timeouts and robust Redis access (swsssdk/swsscommon).
CLI (
config chassis modules …)is_module_state_transition_timed_out()→ auto-clear then proceed.startup/shutdown; platform clears on completion.Redis robustness
hset(mapping=...)usage.Race reduction & consistency
transition_start_time; clears may add an end stamp.Change scope
Please refer to the HLD and related PRs:
HLD: # 1991 sonic-net/SONiC#1991
sonic-host-services: sonic-net/sonic-host-services#255
sonic-platform-common: sonic-net/sonic-platform-common#567
Module graceful shutdown support #4031 sonic-net/sonic-utilities#4031
How Has This Been Tested?
Issue the "config chassis modules shutdown DPUx" command
Verify the DPU module is gracefully shut by checking the logs in /var/log/syslog on both NPU and DPU