
‘Wrong date’ triggered crash

THE generation of a ‘wrong date’ because of hardware failure was behind disruption to JT’s broadband, mobile and landline services earlier this month.

Graeme Millar, JT. (28517269) / Guernsey Press

A report issued by the telecoms provider set out the chain of events that affected its network – starting with two clock sources called network time protocol servers.

‘On July 12th at 18:55 BST, one of the two NTP servers generated a wrong date (actually “27/11/2000”). Because the source clock was available from a service point of view, the routers which had this source as their primary did not switch to the secondary clock source and instead started to propagate this incorrect time stamp to JT’s other network routers,’ said the report by Thierry Berthouloux, chief technology and information officer.
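To illustrate why failover did not kick in, here is a minimal sketch contrasting a reachability-only choice of clock source with one that also checks plausibility against the local clock. The function names, the one-hour tolerance and the year 2020 are assumptions made for illustration; the report does not describe JT's configuration at this level, and this is not how any particular router implements NTP.

```python
from datetime import datetime, timedelta

def pick_reachable_only(primary, secondary, local):
    """Hypothetical: use the first source that answers (None means unreachable)."""
    for candidate in (primary, secondary):
        if candidate is not None:
            return candidate
    return local

def pick_time(primary, secondary, local, max_skew=timedelta(hours=1)):
    """Hypothetical: also reject a source that disagrees wildly with the local clock."""
    for candidate in (primary, secondary):
        if candidate is not None and abs(candidate - local) <= max_skew:
            return candidate
    return local

# Dates follow the report; the year 2020 is an assumption for illustration.
bad = datetime(2000, 11, 27, 18, 55)      # wrong date from the faulty server
good = datetime(2020, 7, 12, 18, 55, 2)   # healthy secondary server
local = datetime(2020, 7, 12, 18, 55)     # router's own clock

print(pick_reachable_only(bad, good, local).year)  # 2000
print(pick_time(bad, good, local).year)            # 2020
```

In the first function the faulty server wins simply because it is still answering, which mirrors the behaviour the report describes.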

The IP routers were designed to be resilient to such situations, as they all have a local clock to which they can switch. But a protocol used to exchange routing information between the IP routers then came into play: under it, each router authenticates with its neighbour using a locally stored password.

‘The underlying reason why this incorrect date/time stamp had such a dramatic effect on the routers is explained by an unexpected interplay whereby the local password can only be considered valid by the IP router from an explicit configured date in the router of July 1st 2012 (which we believe to be the date that our first Cisco IP routers were deployed).

‘As the date transmitted (27/11/2000) was earlier than the password validity start date (01/07/2012) the router stopped working as it no longer had a valid password to communicate with its neighbours.’
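The date comparison described in the report can be sketched in a few lines. The two dates come from the report; the function name and the idea of modelling the check as a single comparison are assumptions for illustration, not the router vendor's actual implementation.

```python
from datetime import datetime

# Validity start date quoted in the report (July 1st 2012).
KEY_VALID_FROM = datetime(2012, 7, 1)

def key_is_valid(router_clock, valid_from=KEY_VALID_FROM):
    """Hypothetical check: the password counts only once the clock has reached its start date."""
    return router_clock >= valid_from

wrong_date = datetime(2000, 11, 27)  # date propagated by the faulty server

print(key_is_valid(wrong_date))            # False
print(key_is_valid(datetime(2012, 7, 1)))  # True
```

With the clock set back to 2000, the check fails, so the router has no usable password and cannot authenticate with its neighbours.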

Fifteen of JT’s 100 routers received the wrong date and isolated themselves from the rest of the network, said the report. ‘By doing so, they made 35 other routers unreachable. Thus, having lost around half of all the network the inherent resilience and redundancy of the network design was lost, and the network failed.’
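The knock-on effect, where a few isolated routers cut off many others, can be shown with a toy reachability search. The five-router topology and all names below are invented for illustration and bear no relation to JT's real network.

```python
from collections import deque

def reachable(links, start, isolated):
    """Routers reachable from `start` when `isolated` routers drop all their adjacencies."""
    if start in isolated:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in links.get(node, ()):
            if peer not in seen and peer not in isolated:
                seen.add(peer)
                queue.append(peer)
    return seen

# Hypothetical topology: two edge routers hang off a single transit router.
links = {
    "core": ["edge1", "transit"],
    "transit": ["core", "edge2", "edge3"],
    "edge1": ["core"],
    "edge2": ["transit"],
    "edge3": ["transit"],
}

print(sorted(reachable(links, "core", isolated={"transit"})))  # ['core', 'edge1']
```

Here one isolated router makes two healthy ones unreachable. In the report's figures, 15 isolated routers plus the 35 they cut off come to 50 of 100, which is the "around half" of the network that was lost.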

Among the impacted routers, two terminate JT’s submarine cable connections to the UK (London) and a third terminates the connection to France (Paris). Four other affected routers are used as gateways to the company’s ‘geo-redundant’ mobile network core systems located in Jersey and Guernsey.

‘In order to restore service, our engineers had to physically attend the multiple sites where the routers are located. We needed to manually change the time on each affected router to replace the incorrect date. This took considerable time especially to reach the routers located outside of the Channel Islands. Our last router in Paris was corrected on July 13th at 16:00 BST.’

Most Channel Islands services were restored once the time was updated on the isolated routers, but it took up to another 36 hours for all internationally located devices to reconnect. The reconnections then led to a spike of activity. Some telco partners ‘interpreted these spikes as abnormal and suspicious behaviour and automatically shut down the links to JT as a precaution’.

On 15 July, work was undertaken to disable the time-based password mechanism, and a new ‘method of procedure’ was implemented and completed at 5am on 17 July.

‘Following this change, we are now confident our network is no longer vulnerable to the propagation of a wrong time/date stamp through our servers and a repeat of the incident is therefore impossible.’

The report also said: ‘Based on our investigations, the cause is a hardware failure in the NTP server which caused a card to reset back to its original factory parameters.’

Graeme Millar, JT’s chief executive officer, said the disruption was the worst in the company’s 132-year history.

‘We have never encountered anything like this incident before, and neither has our equipment supplier, Cisco, who work with firms like JT all over the world,’ he said.

‘It’s important to note that throughout the incident, 999 calls made from any mobile device, worked as normal.’

Mr Millar welcomed a decision by the Jersey competition regulator to independently assess the measures JT had taken, saying it shared the objective of ensuring resilient networks.

JT’s board has also appointed former CFO John Kent to conduct his own review and report back to it.
