Incident History

Current Status

May 14 - Shared Windows Server 21 MySQL Database Maintenance Details

Completed [09/05/2024 09:35]

May 09 - Power Issue in DC5 Details

Resolved [09/05/2024 23:19]

May 07 - Telehouse North POP - Non Service Impacting Maintenance Details

Completed [09/05/2024 09:35]

April 18 - Linux Hosting Server Upgrades Details

Completed [18/04/2024 21:29]

The upgrades are now complete.

March 28 - London Data Center Network Maintenance Details

Completed [10/04/2024 09:25]

March 14 - Emergency Hosting Details

Completed [20/03/2024 11:35]

March 07 - CIX Mail Servers Details

Resolved [08/03/2024 09:05]

The issues with the CIX mail platform were fully resolved yesterday evening. Email should now be working as expected. We apologise for the disruption.

February 27 - BT Wholesale Connection Drops Details

Resolved [29/02/2024 17:16]

== Reason for Outage Summary ==

During routine configuration updates, unforeseen repercussions occurred in an unrelated segment of our network, disconnecting approximately 30% of our BT Wholesale based broadband connections. Leased lines were not impacted. Once the cause was identified, the configuration changes were promptly reverted and service began to restore. However, due to the inherent nature of PPP connections, some customer devices were slow to reconnect, leaving a number of lingering stale sessions.

== Response and Mitigation ==

The incident has been attributed to a potential bug and has been escalated to our vendor's Technical Assistance Center (TAC) for thorough investigation. Following the restoration process, the service has stabilised, and we have no expectations of a recurrence.

February 26 - Critical Security Updates on Servers 30-71 Details

Completed [26/02/2024 12:04]

The work is complete.

January 31 - Critical Security Updates on Servers 30-71 Details

Completed [31/01/2024 10:58]

The work is complete.

November 24 - Windows Server 19 Upgrade 26/11/2023 8PM GMT Details

Resolved [29/11/2023 16:29]

November 21 - IMPORTANT: Shared Windows Server 21 Upgrade 21/11/2023 Details

Completed [21/11/2023 22:21]

November 21 - Network Maintenance 21/11/2023 Details

Completed [21/11/2023 20:45]

November 14 - Control Panel Maintenance 14/11/2023 Details

Completed [20/11/2023 13:05]

November 06 - Control Panel / API Maintenance Details

Completed [06/11/2023 23:52]

Work has now been completed.

October 03 - Critical Security Updates on Servers 30-71 Details

Completed [03/10/2023 11:00]

The work is now complete.

September 12 - Control Panel Maintenance 12/09/2023 Details

Completed [12/09/2023 20:06]

September 08 - Emergency Telehouse North PSU Replacement Details

Completed [12/09/2023 20:06]

September 07 - Critical Security Updates on Servers 30-71 Details

Completed [07/09/2023 11:17]

The work is now complete.

September 05 - Control Panel Maintenance 05/09/2023 Details

Completed [06/09/2023 15:09]

This was postponed.

September 01 - Control Panel Maintenance Details

Completed [03/09/2023 08:26]

September 01 - Critical Security Updates on Servers 30-71 Details

Completed [01/09/2023 12:07]

The work is now complete.

August 11 - DC5 Data Centre Network Issue Details

Resolved [24/08/2023 16:04]

== Root Cause ==

On Friday 11th August at 14:40, monitoring systems detected a significant issue with traffic routing via the Data Centre's DDoS mitigation solution, triggering a Major Incident response. Core network devices in DC5 and THN2 London data centres were failing to handle traffic as expected. The service disruption was caused by a routing problem within the DC5 London Data Centre. Under normal operating conditions traffic would have routed via an additional resilient London Data Centre. However, a failure by a third-party supplier meant that the route to the resilient Data Centre was unavailable for the full duration of the incident.

The Data Centre encountered a significant issue with the routing of traffic by its DDoS mitigation solution. This was a complex issue that required a lengthy investigation across multiple appliances in the DC5 data centre. The Data Centre's investigations confirmed that the issue was in the network layer, and the necessary amendments were made, leading to service restoration.

Customers may have experienced disruptions in DNS services for domain names hosted on our network. Our DNS servers, namely ns1.interdns.co.uk and ns2.interdns.co.uk, are hosted in separate data centres within London, each on distinct IP ranges. These servers are designed to ensure uninterrupted DNS service, but because this incident spanned both centres, services were impacted.
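As a rough illustration of the "distinct IP ranges" point, the sketch below checks whether two nameserver addresses fall inside the same network. The addresses are placeholders taken from the reserved documentation ranges, not the real addresses of ns1.interdns.co.uk or ns2.interdns.co.uk, and the /24 prefix length is an assumption chosen only for the example.

```python
import ipaddress


def share_network(addr_a: str, addr_b: str, prefix_len: int = 24) -> bool:
    """Return True if both IPv4 addresses sit inside the same /prefix_len network."""
    # strict=False lets us build the network from a host address.
    net_a = ipaddress.ip_network(f"{addr_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(addr_b) in net_a


# Hypothetical placeholder addresses (documentation ranges), NOT the real servers.
NS1_ADDR = "192.0.2.53"
NS2_ADDR = "198.51.100.53"

distinct_ranges = not share_network(NS1_ADDR, NS2_ADDR)
print(distinct_ranges)  # True
```

A check like this only confirms address separation. As this incident showed, servers on distinct ranges can still be affected together when a fault spans both data centres.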

== Next Steps ==

The data centre has undertaken internal reviews. The root cause was analysed and their technical teams defined a detailed action plan, which includes an immediate review of appliance configuration, software upgrades, resiliency validation and process improvements.

We have undertaken a strategic initiative to enhance our DNS infrastructure. Our plan includes expanding our presence into additional data centers and establishing two entirely independent network setups. These measures are intended to safeguard against any future occurrences of similar disruptions, ensuring the continued reliability of our DNS services.

We apologise for the disruption and inconvenience this has caused you and your customers and appreciate your patience and understanding during this time.

July 26 - Critical Security Updates on Servers 30-71 Details

Completed [26/07/2023 11:40]

The work is now complete.

June 30 - Critical Security Updates on Servers 30-71 Details

Completed [30/06/2023 11:43]

The work is now complete.

June 07 - Control Panel Maintenance Details

Completed [07/06/2023 22:09]

The work was completed and service was restored just before 10PM.

June 05 - Control Panel Emergency Maintenance Details

Resolved [07/06/2023 12:27]

May 12 - Control Panel Access Details

Resolved [12/05/2023 16:31]

Control panel service has now been restored. Apologies for the disruption this afternoon.

May 03 - Broadband RADIUS Visibility Details

Resolved [03/05/2023 17:11]

The issue has now been resolved. Apologies for the disruption caused.

April 21 - Critical Security Updates on Servers 64-71 Details

Completed [21/04/2023 06:30]

The work is now complete.

April 19 - Critical Security Updates on Servers 30-43 Details

Completed [19/04/2023 09:46]

The work is complete.

April 18 - Security Updates on Server Platform Details

Completed [19/04/2023 17:00]

March 06 - Critical Security Updates on Servers 30-71 Details

Completed [06/03/2023 10:50]

The work is now complete.

January 20 - Emergency Switch Reload @ DC5 Details

Completed [20/01/2023 21:43]

The planned works were successful. Total downtime was contained to less than 4 minutes.

January 19 - Internal Traffic Between Servers Details

Resolved [20/01/2023 20:59]

December 16 - Telehouse West Leased Line Emergency Outage Details

Resolved [21/12/2022 10:47]

In summary, here are details of the issue observed on Friday afternoon / evening:

- It was observed that there was a significant and unexpected memory leak on core equipment in our Telehouse West (THW) core.
- It was determined that the best course of action was to carry out a controlled reload out of hours.
- We began slowly culling broadband sessions terminating at THW and steering them to other PoPs in preparation.
- A short time later memory was exhausted on the THW core and the BGP process terminated, which disconnected all broadband sessions on LNSs at the PoP.
- All broadband circuits that were operating via THW were automatically steered to other PoPs in our network.
- At this point we had no choice but to carry out an emergency reload of the core.
- Leased lines operating from THW were impacted throughout.
- Reload of the core took 30 minutes to complete, however a secondary issue was identified with the hardware of one of the switches.
- Half of the leased lines were restored, whilst on-site hands moved the affected NNIs from the failed switch to the other. This involved configuration changes.
- Circuits were impacted between 1 hour and 4 hours at worst. The majority of circuits were up around the 1 to 2 hour point.
- We do not plan to move the NNIs again, to ensure that there is no further disruption.
- Owing to fulfilment issues the replacement hardware is now expected to arrive today, but to avoid any further risks, installation will be postponed until the New Year.
- We have raised the memory leak issue with Cisco TAC.

We apologise for the disruption this would have caused.

December 16 - Connectivity Issues Details

Resolved [21/12/2022 21:26]

August 21 - Dark Fibre Maintenance at LD8 Details

Completed [21/08/2022 11:10]

August 11 - Critical Security Updates on Servers 60-71 Details

Completed [11/08/2022 10:35]

The work is now complete.

June 01 - Critical Security Updates on Servers 30-43 Details

Completed [01/06/2022 08:30]

The work is now complete.

March 31 - TalkTalk API Issues Details

Resolved [14/04/2022 11:03]

March 29 - Power at LD8 Data Centre Details

Resolved [14/04/2022 11:03]

March 10 - Telehouse West Outage Details

Resolved [14/03/2022 10:21]

The issue remains with our NOC team and Cisco.

February 11 - Control Panel / Database Servers Maintenance Details

Completed [14/02/2022 14:06]

February 11 - Control Panel Access Details

Resolved [11/02/2022 14:25]

Control panel access has been restored.

February 11 - Network At Risk Details

Resolved [15/02/2022 23:09]

Issue resolved.

February 01 - TalkTalk API Outage Details

Resolved [01/02/2022 17:29]

We are now able to access TalkTalk services without issue. Apologies for the disruption this may have caused.

January 28 - Windows Server Reboot [28/01/2022] Details

Completed [29/01/2022 14:58]

September 15 - Server 21 Maintenance Details

Resolved [17/11/2021 15:38]

September 13 - Control Panel Access Details

Resolved [13/09/2021 10:13]

The issue has been identified and resolved. Apologies for any disruption caused.

August 24 - Control Panel & Shared Email Access Details

Resolved [13/09/2021 10:06]

August 18 - Linux Dedicated/VPS Kernel Updates Details

Completed [18/08/2021 12:22]

A new significant bug has been found impacting Ubuntu servers:

https://ubuntu.com/security/notices/USN-5039-1

We're patching all Linux Dedicated and VPS servers in the coming days which have our control panel installed. This is going to be a fairly simple task, however a REBOOT will be required, so please don't be concerned should you notice your server go down. We are trying to ensure patches are rolled out as quickly as possible, so apologies if this impacts working hours.

If you don't have a managed server, with the control panel enabled, we encourage you to apply the necessary patch ASAP.

If you have any further questions please get in touch with our support team quoting the server number.

August 13 - Critical Security Updates on Servers 60-63 Details

Completed [13/08/2021 15:27]

July 21 - Critical Security Updates on Linux Hosting Details

Completed [22/07/2021 09:28]

July 02 - Network Disruption Details

Resolved [02/07/2021 13:44]

Whilst investigating a degraded performance issue on a dark fibre at our LD8 PoP, a third party engineer inadvertently disconnected another dark fibre that connects LD8 to a third location. This subsequently resulted in LD8 becoming isolated from the rest of the network for a short period, between 00:06:02 and 00:09:56.

As previously reported, during this time leased line circuits terminating at LD8 would have experienced a loss of connectivity. Broadband circuits were impacted further due to a large number of subscriber sessions that were terminating at LD8 disconnecting.

Whilst the majority of the affected broadband subscribers regained a session at another PoP relatively quickly, others whose sessions were steered to a particular aggregation router on the network failed to start. Our engineers investigated, discovered that the router was experiencing a fault condition and took it out of service. At this point the vast majority of remaining subscribers regained their sessions.

Apologies for the disruption this may have caused.

June 13 - Telehouse West POP Issue Details

Resolved [14/06/2021 10:00]

At 12:32:23 on 13/06/21 a supervisor in a core switch at our THW PoP experienced an inexplicable reboot. Shortly afterwards at 12:32:40 a hot standby supervisor took over the active role and restored the overall connectivity to the PoP.

The original active supervisor that rebooted was back in service as a hot standby by 12:41:52. By 12:54:47 it had brought all its line cards online following a full and successful diagnostics run. All connectivity was restored to the site by this point.

Non-resilient leased line circuits that terminate on NNIs directly connected to the rebooted supervisor would have experienced an outage between 12:32:23 and 12:54:47.

All other non-resilient leased line circuits as well as any broadband circuits that were terminating at THW would have seen a loss of connectivity between 12:32:23 and 12:32:40.
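For reference, the two outage windows quoted above can be worked out from the timestamps. This is a minimal sketch using only the times given in this report; the function name is ours and purely illustrative.

```python
from datetime import datetime, timedelta

# Timestamp layout used for the figures quoted in this report (13/06/21).
FMT = "%d/%m/%y %H:%M:%S"


def outage_duration(start: str, end: str) -> timedelta:
    """Return the elapsed time between two 'dd/mm/yy HH:MM:SS' timestamps."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


# Circuits on NNIs directly connected to the rebooted supervisor.
direct = outage_duration("13/06/21 12:32:23", "13/06/21 12:54:47")
# All other non-resilient leased lines and broadband circuits at THW.
failover = outage_duration("13/06/21 12:32:23", "13/06/21 12:32:40")

print(direct)    # 0:22:24
print(failover)  # 0:00:17
```

That is roughly 22 minutes for circuits tied to the rebooted supervisor, and 17 seconds for everything else at the PoP.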

We have raised this to the vendor's TAC for further investigation. The device is currently stable and not showing any signs of issues. As such we do not deem the site to be at further risk at this time.

Apologies for the disruption this may have caused.

May 11 - Network Performance at DC5 Data Centre Details

Resolved [11/05/2021 09:33]

The issue was traced to a peering device that we've now taken offline and full service has been restored. Apologies for the disruption that this may have caused some users this morning.

May 10 - Network Performance at DC5 Data Centre Details

Resolved [10/05/2021 17:54]

The issue has been resolved. The cause remains under investigation. Apologies for any disruption to service you may have experienced.

April 30 - Network Issue Affecting Broadband and Leased Line Circuits Details

Resolved [10/05/2021 12:58]

April 23 - Network Performance at DC5 Data Centre Details

Resolved [30/04/2021 09:21]

The cause was linked to a denial of service attack. We apologise for the disruption experienced.

March 26 - Shared Windows Server 21 Upgrade 29/03/2021 Details

Completed [22/03/2021 14:15]

As part of our ongoing efforts to improve our web hosting platform, we are pleased to inform you that we will be upgrading our Shared Windows Server 21 from Windows 2012 to Windows 2019, bringing with it performance improvements, security enhancements, automatic malware scanning of web sites and the introduction of the http/2 protocol for sites with SSL certificates.

This work will take place on Friday 26th March 2021, beginning at 7PM (19:00 GMT), and should take no longer than four hours to complete. During this time, all web sites on Server 21 will be unavailable.

March 24 - Database Maintenance 24/03/2021 Details

Completed [26/03/2021 11:29]

March 19 - Shared Windows Server 19 Planned Upgrade Details

Completed [22/03/2021 14:15]

March 06 - Shared SQL Server Upgrade - Saturday 6th March 2021 Details

Completed [15/03/2021 15:07]

March 06 - Control Panel / CIX Forums Issue Details

Resolved [06/03/2021 22:04]

February 12 - Network Issue Affecting DC5 Data Center Details

Resolved [12/02/2021 17:24]

We believe the reboot has resolved the issue.

February 11 - Network Maintenance 11/02/2021 9PM GMT Details

Completed [12/02/2021 14:43]

February 11 - Hosting Network Maintenance 10/02/2021 21:00 GMT Details

Completed [10/02/2021 09:50]

February 02 - Emergency Switch Reboot - 9PM 02/02/2021 Details

Resolved [10/02/2021 09:46]

February 02 - Emergency Switch Reboot - 9PM 02/01/2021 Details

Resolved [02/02/2021 13:19]

January 28 - Hosting Network Maintenance 28/01/2021 23:00GMT - 06:00AM. Details

Completed [10/02/2021 09:46]

December 29 - Control Panel Maintenance 29/12/2020 Details

Completed [30/12/2020 01:25]

December 21 - Weekend Spam Outbreak 21/12/2020 Details

Resolved [12/01/2021 15:07]

December 14 - Core Network at DC5 Data Centre Details

Resolved [16/12/2020 11:25]

We believe yesterday's networking issues for the hosting data centre are now fully resolved. We have waited for the dust to settle before giving the all clear. Apologies again for any disruption this has caused. Once we receive a full explanation into the cause, we will provide this as soon as possible.

December 09 - Server 38 - Emergency Disk Replacement Details

Resolved [14/12/2020 05:35]

November 03 - Control Panel Maintenance 03/11/2020 - 9PM GMT Details

Completed [03/11/2020 22:46]

This work completed.

October 19 - Control Panel Unavailable Details

Resolved [19/10/2020 13:50]

October 17 - Shared Windows Server 21 Upgrade 17/10/2020 Details

Completed [03/11/2020 22:46]

October 04 - Control Panel and VPS Platform Maintenance Details

Completed [15/10/2020 12:05]

September 01 - Server21 Networking Issue Details

Resolved [01/09/2020 10:16]

This issue has now been resolved.

August 30 - Problems with Internet Routing / CenturyLink Details

Resolved [30/08/2020 21:25]

The issue was resolved around 16:10. CenturyLink responded via Twitter to say:

<-- snip -->

We are able to confirm that all services impacted by today’s IP outage have been restored. We understand how important these services are to our customers, and we sincerely apologize for the impact this outage caused.

<-- snip -->

Although we and their other global customers withdrew routes and shut down peering sessions, they continued to announce them to their peers regardless. This caused black holing of any inbound traffic routed via CenturyLink. All affected customers were left powerless and it has been a case of having to wait for them to resolve the issue.

Thankfully less than 10% of our overall traffic routes in via CenturyLink's network, so the impact was minimal. We know of only a small handful of destinations that were unreachable during their outage. Apologies if your access was disrupted.

August 21 - Shared Hosting - Server 37 Details

Resolved [21/08/2020 17:29]

A faulty network card has been replaced and service restored. Apologies for the delay in resolution; it wasn't obvious that the card may have been at fault.

August 19 - Hosting Outage Details

Resolved [04/10/2020 13:46]

July 24 - Shared Windows Server 19 Details

Completed [18/08/2020 05:05]

July 10 - Shared Windows Servers 21 and 19 Details

Completed [17/07/2020 11:47]

July 08 - Control Panel Maintenance Details

Completed [10/07/2020 22:53]

July 04 - Server Maintenance and Rack Move in London Data Centre. Details

Completed [06/07/2020 10:37]

June 22 - Control Panel Details

Resolved [04/07/2020 21:44]

June 13 - Control Panel Details

Completed [13/06/2020 10:36]

This work completed successfully.

June 13 - Windows Server 21 Details

Resolved [19/06/2020 13:51]

June 04 - Windows Server 19 Details

Resolved [13/06/2020 06:29]

June 03 - VPS Outage Details

Resolved [13/06/2020 06:33]

June 03 - Control Panel and Shared SQL Down Details

Resolved [13/06/2020 06:33]

April 22 - Control Panel Maintenance Details

Completed [05/05/2020 10:36]

March 26 - Network issue affecting hosting and mail services Details

Resolved [26/03/2020 16:42]

A network routing issue has been resolved. All services are working as expected now.

March 26 - Network issue affecting hosting and mail services Details

Resolved [26/03/2020 10:57]

A network routing issue has been resolved. All hosting and email services are functioning as expected now.

February 19 - Networking Issue Details

Resolved [19/02/2020 13:10]

The issue was related to LINX (the London Internet Exchange) and has now been resolved. It would potentially have affected several Internet providers in the UK. We are awaiting a full RFO from them to confirm the cause.

August 30 - Emergency DC5 Core Router Upgrade Details

Completed [30/08/2019 00:10]

The upgrade was successful and cleared the fault condition as suspected. We have been monitoring for the past hour and have not seen any further instability.

July 25 - VPS Platform Details

Resolved [25/07/2019 15:42]

The issue with the VPS platform has been resolved. It was caused by a denial of service attack against the platform. We have worked closely with our mitigation service and transit provider to ensure this will not happen again.

We have also investigated with them why this incident was not detected as it should have been; this was due to a configuration issue at their end, and we have been assured that it has been rectified. In addition, we are looking more closely at the network level to see what additional protection could be put in place to prevent this from occurring again.

Our apologies for the disruption this has caused.

July 19 - Emergency VPS Maintenance Details

Resolved [19/07/2019 11:25]

The maintenance is complete now and services are back up and running.

July 10 - VPS Platform Details

Resolved [10/07/2019 12:40]

The attack has now been isolated and brought under control, so normal service should now be observed. We apologise for the disruption experienced this morning. We are continuing to monitor the platform carefully.

June 24 - VPS Platform Details

Resolved [10/07/2019 09:10]

June 01 - VPS Platform Details

Resolved [01/06/2019 15:30]

We believe the issue with the VPS platform to be resolved. We will be monitoring the service closely over the next few hours to ensure all continues to be well.

April 10 - Network Issue Affecting DC5 Data Centre Details

Resolved [15/04/2019 15:26]

The problem was localised to 1 rack of servers following a PDU failure.

April 04 - SSL Warnings For Sending / Receiving Mail Details

Resolved [24/06/2019 09:09]

February 25 - Communicating With TalkTalk Business API Details

Resolved [25/02/2019 12:23]

TalkTalk's systems appear to be operational again. However, please consider them to be at risk as we have not received any communication from them to confirm that everything is back to normal.

February 14 - Control Panel Access Details

Resolved [14/02/2019 11:22]

The issue has been resolved. The cause was linked to a connectivity issue between the servers in the cluster.

December 05 - Server 33 Emergency Maintenance Details

Completed [05/12/2018 10:29]

The maintenance is now complete.

October 25 - Server 36 Emergency Maintenance Details

Completed [26/10/2018 01:58]

The server rebuild and data restoration is now complete.

October 10 - Control Panel and Webmail Details

Completed [10/10/2018 23:57]

October 02 - Linux Shared Hosting Emergency Maintenance Details

Completed [02/10/2018 10:19]

The maintenance is now complete.

September 20 - FTP Proxy Issue Details

Resolved [20/09/2018 17:37]

The FTP proxy service is now back online.

September 13 - Domain Renewal and Registration issues Details

Resolved [13/09/2018 17:24]

This issue is fully resolved now.

September 12 - Details

Completed [12/09/2018 15:03]

VPS host VPS1 requires an emergency reboot. We apologize for any disruption caused.

September 09 - Webmail and Control Panel Access Details

Resolved [09/09/2018 22:02]

Earlier this evening, we experienced an issue with our core SQL cluster which prevented access to our webmail interfaces and access to our Control Panel. This was resolved around 21:30 BST.

September 06 - Network Maintenance Affecting Dedicated Servers Details

Resolved [09/09/2018 22:03]

We believe all issues were resolved successfully on Thursday morning.

September 06 - Network Maintenance 06/09/2018 Details

Completed [09/09/2018 22:03]

Work completed around 2AM BST.

September 03 - Shared Hosting Email Access Details

Resolved [03/09/2018 12:50]

Service has been restored. Apologies for the disruption caused. Engineers are working to ensure that this doesn't repeat itself.

August 23 - MySQL on Server 60 Details

Resolved [23/08/2018 09:30]

MySQL has been restarted and the issue has been resolved.

August 22 - Shared Hosting Email Access Details

Resolved [22/08/2018 13:22]

The issue has been resolved. Mail is starting to flow again, although there may be a slight bottleneck for the platform to process. Apologies for the disruption witnessed.

August 15 - Linux Shared Hosting Emergency Maintenance Details

Completed [22/08/2018 13:24]

July 05 - Network Connectivity Issues Details

Resolved [05/07/2018 16:54]

This issue appears to be resolved now, although we are monitoring the situation carefully.

Reason For Outage - TalkTalk identified an issue with a third party peering provider, Iomart, who incorrectly advertised a subnet. This was a highly unusual event, and hopefully one we never see repeated.
