Description:
The FEO framework’s current startup sequence is not fully robust, as the primary agent lacks a timeout mechanism for the initial connection phase. Without this, if a worker or recorder fails to connect, the scheduler will wait indefinitely, causing the entire system to hang and preventing it from starting.
Proposal:
A connection timeout mechanism has been implemented and is submitted as a contribution for review by the FEO team. It keeps the primary agent from waiting indefinitely for all workers and recorders to establish their initial connection: if any component fails to connect within the specified duration, the system times out and terminates gracefully instead of hanging. The design flow for this timeout sequence, which has been verified for both relayed and direct signalling modes, is detailed in the following sequence diagrams.
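The waiting logic described above can be sketched roughly as follows. This is a minimal illustration only, not the code of the contribution and not the FEO API: `AgentId`, `ConnectError`, and `wait_for_connections` are hypothetical names, and a standard-library channel stands in for whatever signalling layer (relayed or direct) reports that a worker or recorder has connected.

```rust
use std::collections::HashSet;
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Hypothetical identifier for a worker or recorder (assumption, not FEO's type).
pub type AgentId = u32;

/// Hypothetical error type for the initial connection phase.
#[derive(Debug, PartialEq, Eq)]
pub enum ConnectError {
    /// The deadline passed; `missing` lists the agents that never connected.
    Timeout { missing: Vec<AgentId> },
    /// The notification channel closed before every agent had connected.
    ChannelClosed { missing: Vec<AgentId> },
}

/// Waits until every agent in `expected` has announced itself on `connected`,
/// or until `timeout` has elapsed, whichever comes first.
pub fn wait_for_connections(
    expected: &[AgentId],
    connected: &Receiver<AgentId>,
    timeout: Duration,
) -> Result<(), ConnectError> {
    // One overall deadline, so slow agents cannot extend the wait one by one.
    let deadline = Instant::now() + timeout;
    let mut pending: HashSet<AgentId> = expected.iter().copied().collect();

    while !pending.is_empty() {
        // Time left until the deadline; zero once the deadline has passed.
        let remaining = deadline.saturating_duration_since(Instant::now());
        match connected.recv_timeout(remaining) {
            Ok(id) => {
                // Unknown or duplicate ids are ignored.
                pending.remove(&id);
            }
            Err(RecvTimeoutError::Timeout) => {
                return Err(ConnectError::Timeout { missing: sorted(pending) });
            }
            Err(RecvTimeoutError::Disconnected) => {
                return Err(ConnectError::ChannelClosed { missing: sorted(pending) });
            }
        }
    }
    Ok(())
}

/// Returns the pending ids in ascending order for stable error reporting.
fn sorted(pending: HashSet<AgentId>) -> Vec<AgentId> {
    let mut ids: Vec<AgentId> = pending.into_iter().collect();
    ids.sort_unstable();
    ids
}
```

In such a sketch, the primary agent would call `wait_for_connections` once during startup and, on `Err`, log the missing agents and shut down cleanly rather than entering the scheduling loop. Using a single deadline for the whole phase, rather than a per-agent timeout, bounds the total startup time regardless of how many components are expected.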
Covered Requirements:
Secondary connection timeout.
The complete set of FEO component requirements is specified in the following document:
https://eclipse-score.github.io/score/main/modules/feo/feo/docs/requirements/component_requirements.html