Guernsey Press

Switch to back-up server did not happen as it should have

A TRANSFER to a back-up server failed when the main States computer server overheated, it has been confirmed.

Head of public service Mark de Garis. (Picture by Peter Frankland, 31532010)

Services were back online yesterday, after another outage on Thursday.

Policy & Resources president Deputy Peter Ferbrache has been calling for answers after problems began eight days ago.

Details were finally given yesterday.

The initial outage began a week yesterday, when an air conditioning unit failed in the main server room at Frossard House, which contains about 500 servers.

An alarm soon went off when the room reached 25C. By the time IT staff arrived 20 minutes later, the temperature was 44C and climbing.

A third-party contractor was called in, by which time the temperature had reached 48C.

If servers get too hot, they stop working and data can be lost.

In this case the servers went into preservation mode, where the system shuts down to protect itself and the data.

At that point, services should have switched to the back-up equipment room at Edward T Wheadon House.

But that failed.

The cause of this failure is unclear and is under investigation.

Head of public service Mark de Garis said Agilisys engineers and third-party contractors had been working tirelessly over the last eight days to restore services.

‘An outage such as we have experienced over the last week is unacceptable and urgent actions are being taken to fully understand how this happened,’ he said.

‘The States of Guernsey runs a large number of systems supporting a huge range of services. For example we have bespoke systems in areas such as children’s services, the ports, Beau Sejour, social security systems responsible for the payment of benefits and many online services through gov.gg. This makes the recovery of systems extremely complex.’

Mr de Garis thanked staff for their hard work.

‘This however remains a completely unacceptable situation and I want to apologise to all our service users who have been affected by the disruption.

‘We will provide further information as soon as the detailed investigations are complete.’

Live data is replicated across both server sites. The States also makes an incremental back-up of all data every evening and a weekly full back-up to a third site.

While the States’ websites are now stable, there are still issues with some internal systems linked to individual services.