Power management in MCCS
Hardware power modes
SKA hardware may support up to three power modes:
ON: the hardware is powered on and fully operational. This mode is supported by all SKA hardware.
OFF: the hardware is powered off. Generally we would expect all hardware to be able to be turned off. There may be special cases, however, where this is not supported. For example, an externally managed cluster can be turned off, but the MCCS interface to it might only allow for submission and monitoring of jobs. Thus it cannot be turned off from the MCCS point of view.
STANDBY: the hardware is in a low-power standby mode. Such a mode is important in two cases:
where powering up a subsystem with many devices, it is important to limit the inrush current. This is achieved by powering up devices into a standby mode that uses no more than 5% of their nominal power; then carefully orchestrating transitions to full power.
where powering up a device from off could take a long time (perhaps several minutes). Such devices may instead be powered up into standby mode, in which power consumption is low, but the time to fully power on the hardware is short (a couple of seconds).
Standby mode is not supported by all hardware; indeed there may be very few hardware devices that support it.
Power mode breakdown
Generally speaking, one cannot tell a hardware device to turn itself off; for once it is off, it loses the ability to turn itself on again. Instead, power to a device is controlled by some upstream device. For example, power to a TPM is controlled by the subrack in which that TPM is installed. Standby mode is generally not managed by an individual devices, but by an ensambe of devices (SpsStation, FieldStation) by selectively turning off the most power comsumpting devices. Thus, implementation of the three power modes breaks down into:
OFF: tell the upstream device (e.g. subrack) to deny power to the device (e.g. TPM)
STANDBY: selectively turn ON some sub-devices, and turn OFF or leave OFF the TPMs (in the SPS cabinet) and he antennas and Front End modules (in the Field station)
ON: tell the upstream device to supply power to the device, then tell the device itself to go fully operational
Power flow
Map
The activity diagram below shows the flow of power through the MCCS system; i.e. cabling, essentially. The (/) points are switch points at which the power can be turned on/off. These switch points are annotated with the Tango device commands that drive the switch (where impemented).
Note: this diagram will evolve over time.
Startup sequence
Boot-up
When power is first applied to MCCS, the following minimal bootup sequence is followed:
Power is applied to all cabinets. All the cabinet management boards come on, as they are the primary control points for the cabinet subsystems. Switches and subelements for the SPS cabinets are configured to remain off, as are the subelements for all but one of the MCCS cabinets.
Power is applied to the cwFrontNode Data Hubs (FNDH) in the field nodes. All the smartboxes and antennas are configured to remain off.
The cabinet Power Distribution Unit (PDU) for the MCCS cabinet that houses the MCCS controller node is configured to start up the cabinet’s 1Gb network switch and the MCCS controller node.
The 1Gb network switch powers up
The MCCS controller node boots up.
The kubernetes cluster is started.
A minimum chart is deployed, containing just the tango subsystem and the MCCS Controller Tango device.
Power-on
When TM sends the MCCS Controller the Startup command, the MCCS Controller must start up:
the rest of MCCS
the SPS subracks
the SPS TPMs
the field equipment
A possible implementation of the startup sequence, limited to the SPS and Field Station equipment, is shown in the sequence diagram below. Elements in red are not yet implemented. In this example the power up operation is performed in two stages. First the system is placed in STANDBY mode, which can occur in a short time. Then individual elements are progressively turned ON. Power ON is staggered in time to avoid excessive power ramping. Finally, the whole array is synchronised and acquisition is started.
Prototype status
In the current prototype implementation, all of MCCS is deployed immediately on startup, so that when TM sends the MCCS Controller the Startup command, it need only start up the SPS cabinets and field equipment.