RDKB-63854: WanManager is not Restarting the DHCPv4 and DHCPv6 client parameters automatically when the parameters are disabled by aadhithan01 · Pull Request #212 · rdkcentral/wan-manager

aadhithan01 · 2026-05-19T15:38:59Z

Summary

WanManager fails to automatically restart the DHCPv6 client (self-heal) when Device.DHCPv6.Client.1.Enable is toggled rapidly. The issue manifests intermittently — the self-heal works on some iterations but fails on others due to a race condition between the state machine thread and the DHCP event queue worker.

Root Cause

In wan_transition_ipv6_down(), there is a race condition:

State machine thread (tid=8844) detects IPv6 is DOWN and enters wan_transition_ipv6_down
DHCP event queue worker (tid=8983) processes the "DHCPv6 client stopped" event which updates Dhcp6cStatus

When the state machine executes before the event worker processes the stop event:

Dhcp6cStatus is still DHCPC_STARTED (stale)
Dhcp6cPid is set to 1 (placeholder assigned by WanManager_StartDhcpv6Client via DHCP Manager)
Since PID 1 is always init/systemd (always alive), WanMgr_IsPIDRunning(1) returns TRUE
The self-heal condition evaluates to FALSE — DHCPv6 client is NOT restarted
IPv6 remains permanently DOWN until manual intervention

Fix

Changed the self-heal condition in wan_transition_ipv6_down() to trigger restart when ANY of:

Dhcp6cStatus != DHCPC_STARTED — event was processed, client confirmed stopped
Dhcp6cPid <= 1 — PID is invalid/placeholder (0 = unset, 1 = init process, not dhcp6c)
WanMgr_IsPIDRunning(Dhcp6cPid) != TRUE — PID > 1 but process is no longer running

Before:

if(p_VirtIf->IP.Dhcp6cStatus != DHCPC_STARTED ||
   (p_VirtIf->IP.Dhcp6cPid > 0 && WanMgr_IsPIDRunning(p_VirtIf->IP.Dhcp6cPid) != TRUE))

After:

if(p_VirtIf->IP.Dhcp6cStatus != DHCPC_STARTED ||
   p_VirtIf->IP.Dhcp6cPid <= 1 ||
   WanMgr_IsPIDRunning(p_VirtIf->IP.Dhcp6cPid) != TRUE)

Testing Steps

SSH to the Docsis-Gateway (TECHNICOLORXB7)

Tail the WanManager log:

tail -f /rdklogs/logs/WANMANAGERLog.txt.0 &

Rapidly toggle DHCPv6 disable/enable multiple times:

dmcli eRT setv Device.DHCPv6.Client.1.Enable bool false
dmcli eRT setv Device.DHCPv6.Client.1.Enable bool true
dmcli eRT setv Device.DHCPv6.Client.1.Enable bool false
dmcli eRT setv Device.DHCPv6.Client.1.Enable bool true
dmcli eRT setv Device.DHCPv6.Client.1.Enable bool false
dmcli eRT setv Device.DHCPv6.Client.1.Enable bool true

Expected Result: After each disable/enable cycle, the log should show:

wan_transition_ipv6_down XXXX - Starting dibbler-client on interface erouter0 (Dhcp6cStatus=X, Pid=X)
wan_transition_ipv6_down XXXX - SELFHEAL - Started dibbler-client on interface erouter0

Followed by IPv6 recovering to DUAL STACK ACTIVE state.

Verify: IPv6 should recover on ALL iterations (not just some). Check:
```
grep -c "SELFHEAL" /rdklogs/logs/WANMANAGERLog.txt.0
```
Count should match number of disable cycles.
Verify no IPv6 permanent DOWN state:
```
grep "TRANSITION IPV4 LEASED" /rdklogs/logs/WANMANAGERLog.txt.0
```
Each occurrence should be followed by a subsequent "TRANSITION DUAL STACK ACTIVE".

Affected Component

File: source/WanManager/wanmgr_interface_sm.c
Function: wan_transition_ipv6_down()
Platform: TECHNICOLORXB7 (tchxb7), RDKB

Risk: Low

Single condition change in self-heal logic. No functional change when DHCPv6 client has a valid PID > 1 and is running.

Root Cause: Race condition between interface state machine thread and DHCP event queue worker. When wan_transition_ipv6_down() executes before the 'DHCPv6 client stopped' event is processed, Dhcp6cStatus remains DHCPC_STARTED even though dibbler-client process is dead. Self-heal restart is incorrectly skipped, leaving IPv6 permanently DOWN. Fix: In addition to checking Dhcp6cStatus flag, also verify the actual dibbler-client process is still alive via WanMgr_IsPIDRunning(). If the PID is stale (process dead), restart the client regardless of the status flag state. Signed-off-by: aadhithan01 <64963550+aadhithan01@users.noreply.github.com>

Previous fix used (Dhcp6cPid > 0 && !IsPIDRunning(Pid)) but Dhcp6cPid=1 (placeholder set by self-heal via DHCP Manager) means IsPIDRunning(1) always returns TRUE since PID 1 is init. Self-heal skipped on repeat. Now triggers self-heal when ANY of: 1. Dhcp6cStatus != DHCPC_STARTED 2. Dhcp6cPid <= 1 (invalid/placeholder) 3. Dhcp6cPid > 1 but process dead

Copilot

Pull request overview

Fixes a race condition in wan_transition_ipv6_down() where the DHCPv6 self-heal check could be skipped because the "DHCP client stopped" event had not yet been processed by the event queue worker, leaving Dhcp6cStatus stale at DHCPC_STARTED and Dhcp6cPid set to the placeholder value 1 (init). The strengthened condition now also restarts the client when the recorded PID is <= 1 or when the running PID is no longer alive.

Changes:

Expanded the DHCPv6 self-heal condition in wan_transition_ipv6_down() to OR three independent checks (status, PID≤1, PID-not-running).
Added an explanatory block comment describing the race and the three trigger conditions.
Enhanced the "Starting dibbler-client" log to include current Dhcp6cStatus and Pid for debugging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+        if(p_VirtIf->IP.Dhcp6cStatus != DHCPC_STARTED ||
+           p_VirtIf->IP.Dhcp6cPid <= 1 ||
+           WanMgr_IsPIDRunning(p_VirtIf->IP.Dhcp6cPid) != TRUE)


S-Parthiban-Selvaraj · 2026-05-21T11:20:53Z

+         * 3. PID > 1 but process is no longer running (confirmed dead)
+         */
+        if(p_VirtIf->IP.Dhcp6cStatus != DHCPC_STARTED ||
+           p_VirtIf->IP.Dhcp6cPid <= 1 ||


Dhcp6cPid shouldn't be used incase of DHCpmanager enabled build. WanManager will not have the info about the PID of the DHCP clients running

Copilot

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

source/WanManager/wanmgr_interface_sm.c:2935

The PR description says the IPv6-down self-heal should also restart DHCPv6 when the stored PID is a placeholder/invalid (<=1) or when the PID is not running, to avoid the race where Dhcp6cPid is temporarily set to 1. The current condition here still only checks Dhcp6cStatus != DHCPC_STARTED, so the described race (PID=1 treated as running) can still prevent self-heal and leave IPv6 DOWN. Please update this condition to match the fix described in the PR (status != STARTED OR pid <= 1 OR pid not running), and ensure the PID check is safe when PID is 0/1.

        if(p_VirtIf->IP.Dhcp6cStatus != DHCPC_STARTED)
        {
            /* Start DHCPv6 Client */
            CcspTraceInfo(("%s %d - Starting dibbler-client on interface %s \n", __FUNCTION__, __LINE__, p_VirtIf->Name));
            WanManager_StartDhcpv6Client(p_VirtIf, pInterface->IfaceType);
            CcspTraceInfo(("%s %d - SELFHEAL - Started dibbler-client on interface %s\n", __FUNCTION__, __LINE__, p_VirtIf->Name));

+                    if (pVirtIf->IP.Dhcp4cStatus == DHCPC_STARTED)
+                    {
+                        /* Client was stopped externally (not by WanManager).
+                         * WanManager_StopDhcpv4Client sets Dhcp4cStatus=STOPPED
+                         * synchronously before any event arrives, so DHCPC_STARTED
+                         * here means the stop was NOT initiated by WanManager.
+                         * Restart the client to maintain IPv4 connectivity. */
+                        pVirtIf->IP.Dhcp4cStatus = DHCPC_STOPPED;
+                        CcspTraceInfo(("%s-%d : SELFHEAL - Restarting DHCPv4 client for %s\n", __FUNCTION__, __LINE__, pVirtIf->Name));
+                        WanManager_StartDhcpv4Client(pVirtIf, gWanMgrDataBase.IfaceCtrl.pIface[pVirtIf->baseIfIdx].data.Name, gWanMgrDataBase.IfaceCtrl.pIface[pVirtIf->baseIfIdx].data.IfaceType);
+                    }
+                    else
+                    {
+                        pVirtIf->IP.Dhcp4cStatus = DHCPC_STOPPED;
+                    }


+                    CcspTraceInfo(("%s-%d : DHCPv6 client stopped for %s (Dhcp6cStatus=%d)\n", __FUNCTION__, __LINE__, pVirtIf->Name, pVirtIf->IP.Dhcp6cStatus));
+                    if (pVirtIf->IP.Dhcp6cStatus == DHCPC_STARTED)
+                    {
+                        /* Client was stopped externally (not by WanManager).
+                         * WanManager_StopDhcpv6Client sets Dhcp6cStatus=STOPPED
+                         * synchronously before any event arrives, so DHCPC_STARTED
+                         * here means the stop was NOT initiated by WanManager.
+                         * Restart the client to maintain IPv6 connectivity. */
+                        pVirtIf->IP.Dhcp6cStatus = DHCPC_STOPPED;
+                        CcspTraceInfo(("%s-%d : SELFHEAL - Restarting DHCPv6 client for %s\n", __FUNCTION__, __LINE__, pVirtIf->Name));
+                        WanManager_StartDhcpv6Client(pVirtIf, gWanMgrDataBase.IfaceCtrl.pIface[pVirtIf->baseIfIdx].data.IfaceType);
+                    }


In FEATURE_RDKB_DHCP_MANAGER builds, when DHCPManager externally stops the DHCPv6 client, the CLIENT_STOPPED event arrives after LEASE_DEL has already triggered wan_transition_ipv6_down. By the time ISM checks Dhcp6cStatus, it is stale (DHCPC_STARTED) causing the self-heal to skip. Fix: In the DHCP_CLIENT_STOPPED event handler, detect external stops by checking if Dhcp6cStatus is still DHCPC_STARTED (WanManager_StopDhcpv6Client sets it to STOPPED synchronously before events arrive). If external, restart the client immediately. Revert wan_transition_ipv6_down to original logic (secondary self-heal for subsequent cycles remains intact).

aadhithan01 added 2 commits May 19, 2026 17:28

aadhithan01 requested a review from a team as a code owner May 19, 2026 15:39

Copilot AI review requested due to automatic review settings May 19, 2026 15:39

Copilot AI reviewed May 19, 2026

View reviewed changes

Comment thread source/WanManager/wanmgr_interface_sm.c Outdated

Comment on lines +2941 to +2943

if(p_VirtIf->IP.Dhcp6cStatus != DHCPC_STARTED ||

p_VirtIf->IP.Dhcp6cPid <= 1 ||

WanMgr_IsPIDRunning(p_VirtIf->IP.Dhcp6cPid) != TRUE)

S-Parthiban-Selvaraj requested changes May 21, 2026

View reviewed changes

Copilot AI review requested due to automatic review settings May 25, 2026 12:45

aadhithan01 force-pushed the fix/ipv6-selfheal-race-condition branch from 8ddda13 to aee48a0 Compare May 25, 2026 12:45

Copilot started reviewing on behalf of aadhithan01 May 25, 2026 12:46 View session

Copilot AI reviewed May 25, 2026

View reviewed changes

aadhithan01 force-pushed the fix/ipv6-selfheal-race-condition branch from aee48a0 to b4cb180 Compare May 25, 2026 12:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RDKB-63854: WanManager is not Restarting the DHCPv4 and DHCPv6 client parameters automatically when the parameters are disabled#212

RDKB-63854: WanManager is not Restarting the DHCPv4 and DHCPv6 client parameters automatically when the parameters are disabled#212
aadhithan01 wants to merge 3 commits into
rdkcentral:mainfrom
aadhithan01:fix/ipv6-selfheal-race-condition

aadhithan01 commented May 19, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

S-Parthiban-Selvaraj May 21, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

aadhithan01 commented May 19, 2026

Summary

Root Cause

Fix

Testing Steps

Affected Component

Risk: Low

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

S-Parthiban-Selvaraj May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants