Major incident at WHC.ca regarding hosting

theinvestor

This has been ongoing for most of the day.

August 28, 2021: a major incident is impacting the availability of web hosting and reseller hosting accounts on several WHC systems in our BHS1 Montreal datacenter. We currently have all hands on deck working on the problem but the situation is serious.

See the link below:

https://whc.ca/blog/live-major-incident-in-progress
 
This is very serious

Any idea how everything got wiped?
 
Oh man, as I keep reading I feel for the team at WHC. Such a tremendous loss.

They have not announced how the information was wiped from the servers, so at this point we can only speculate.
 
It really is inexcusable. It doesn’t matter what happened to these servers; not having a backup is just bad news.
 
Never count on web hosts. For DN.ca I run a schedule and download a complete backup daily, so the most that could be lost is 1 day.

Anyone running a site can easily do that using cPanel.

In fact, while they were sorting things out I would already be running on a different server. Propagation time is pretty quick now.
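
For anyone who wants to automate the "dated daily copy" part on their own machine, here is a minimal sketch. It assumes the site files have already been downloaded to a local folder (it does not talk to cPanel at all), and the function names, the `site-YYYY-MM-DD.tar.gz` naming, and the retention count are hypothetical choices for illustration. Databases would need a separate export.

```python
import tarfile
from datetime import date
from pathlib import Path


def make_dated_backup(source_dir: Path, backup_dir: Path, today: date) -> Path:
    """Archive source_dir as backup_dir/site-YYYY-MM-DD.tar.gz and return the path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    archive = backup_dir / f"site-{today.isoformat()}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store files under the folder's own name rather than its full path.
        tar.add(source_dir, arcname=source_dir.name)
    return archive


def prune_old_backups(backup_dir: Path, keep: int) -> list[Path]:
    """Delete all but the newest `keep` archives and return the deleted paths."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # ISO dates in the file names sort chronologically as plain strings.
    archives = sorted(backup_dir.glob("site-*.tar.gz"))
    stale = archives[:-keep]
    for path in stale:
        path.unlink()
    return stale
```

Run once a day from a scheduler (cron, Task Scheduler), something like `make_dated_backup(Path("public_html"), Path("backups"), date.today())` followed by `prune_old_backups(Path("backups"), keep=7)` keeps a week of copies. Keep the backup folder outside the folder being archived and, ideally, on a different machine or provider than the host.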
 
Update August 29, 2021 - 10:00AM

Thank you once again for your patience. We continued to work overnight on restoring accounts.

Here are the updates for each of our restored servers so far:
- Rev2 server has been fully recovered and should be functioning normally
- Decarie server has been fully recovered and should be functioning normally
- Rev server was recovered, however, some issues remain

Here are the updates for servers currently in the restoration process:
- Rachel's restoration process should begin within the next hour
- Peel restoration is approaching 30% completion.
- Beaubien restoration is approaching 50% completion.
- Atwater restoration is approaching 20% completion.

Services on the above servers should be fully functional within the next 12-24 hours.

Please note: We will be posting another update at 11AM for the more severely impacted servers.
 
Total disaster... my email and a few sites are completely offline. The updates make it seem pretty bad. I'm on the 'bernard' server and they're not expecting any recovery at all.

Accounts on servers with low likelihood of data recovery
The following 5 systems have had their external backup servers also partially destroyed, and we have been as of yet unsuccessful at recovering their data:
- Clark
- Drummond
- Acadie
- Bernard
- Bishop

How does a server get destroyed...fire? flood?
 
Spex said:
How does a server get destroyed...fire? flood?

They have not answered any of that. They need to be transparent if they don't want this to seriously impact their reputation.

I was going to move DN.ca hosting there a few months ago but decided not to because I had prepaid hosting for 3 years.
 
I wonder if Sibername and TBR will be affected by all of this
 
Update August 29, 2021 - 11:00AM

Thank you everyone for your patience and support during what has been a very trying 24 hours. Our team has been working tirelessly to restore service to affected clients and have had some success but the news isn’t all good, so here is where we stand.

Servers that have been fully or partially recovered

We have successfully recovered the following servers:
- Decarie
- Rev2
- Rev (some data loss has been reported here)

Accounts on servers with high likelihood of data recovery

We have 4 systems whose external backup servers have not been impacted, and these are actively being used to stream account restorations right now. We expect to be able to fully restore all backed up accounts by Monday evening and hopefully sooner. Here is where they are in the restore process:
- Beaubien (50%)
- Peel (30%)
- Atwater (20%)
- Rachel (1%, restore has just been started)

A full restore process for a system of this size generally takes 24 hours. As soon as an account is restored, its website and email should become available so for most users on these machines you should have your services restored before the end of the day Sunday.

Accounts on servers with low likelihood of data recovery

The following 5 systems have had their external backup servers also partially destroyed, and we have been as of yet unsuccessful at recovering their data:
- Clark
- Drummond
- Acadie
- Bernard
- Bishop

For clients on these machines, we are pursuing 3 recovery strategies in parallel:

Re-upload your content from your own backup

If you have a local version of your website and/or files on your computer and can re-upload them, please contact support and they will activate a fresh account for you on a new server immediately.

Partial data recovery on production servers

We have had some success in recovering hosting accounts data files (excluding databases) for 2 of the affected servers (Acadie & Bernard for now) and may be able to perform a partial restoration of these accounts within the next 48-72 hours. This process is slow and tedious and unlikely to complete before Tuesday.

Manual data recovery

We’ve contracted an external firm to assist us with data recovery efforts on our backup servers and they started their work on Saturday night. Manual data recovery is a tedious and slow process, but we expect to get a report on what can be salvaged before day’s end. This approach will likely take the longest to result in recovered data, with a timeline as long as 1-2 weeks.

We will be posting more information along with a post-mortem once the immediate operational issues have been addressed.

We have mobilized our full team to assist in this effort and we currently have dozens of system administrators, support technicians and data recovery engineers assisting with the recovery effort since yesterday. We understand this incident may seriously impact your business and are doing everything in our power to mitigate the damage.

More information will follow as we have it.
 
Update August 29, 2021 - 5:00PM

As our team continues to work to restore accounts, please note that live updates are paused until tomorrow morning at 10AM so we can again focus on the restoration efforts. We’ll then be able to give an account of what has been accomplished overnight.

Accounts on servers being restored now

The restoration process is advancing well and we hope to be able to fully recover all accounts on these servers from backups by end of day Monday, and hopefully sooner. Here’s the latest progress:
- Beaubien is at 55%
- Peel is at 35%
- Atwater is at 30%
- Rachel is at 10%

Accounts on servers with damaged backups

The following servers have had both their local storage and their external backup storage heavily damaged, making the recovery process extremely difficult:
- Clark
- Drummond
- Acadie
- Bernard
- Bishop

Our initial attempt to repair the data on our backup servers has failed and at this point the likelihood of successfully restoring account data from these servers is very low.

As a result, we are working with data recovery experts to attempt to restore data from the source server, but this process is extremely long and tedious, potentially requiring file-by-file analysis for systems with millions of files and may require months to complete.

Here is how you can move forward if your account was on one of those servers:

If you have a local backup

If you have a local version of your website and/or files on your computer and can re-upload them, please contact support and they will activate a fresh account for you on a new server immediately.

If you don’t have a local backup: introducing “Lifeboat” accounts

To facilitate continued business operations for our clients during this major outage and until the situation is resolved, we will be activating temporary hosting accounts for all clients hosted on Clark, Drummond, Acadie, Bernard and Bishop servers. You will be able to log in to this new account in order to recreate your emails and upload website content, without hindering ongoing recovery efforts. A simple nameserver change will be required and step-by-step instructions will be provided. These accounts are expected to be available before Monday morning.
 
Update August 29, 2021 - 11:00PM

Accounts on servers with damaged backups

With our initial data recovery process unsuccessful, we’re now recommending that all affected clients that have experienced data loss consider using a new, temporary hosting account that’s already created in their Client Area called LifeBoat.

To use this account, click into it to create your emails and upload website data just as you would on your regular hosting account.

Once this is done, just link your domain to this account and update your domain’s DNS to:

ark1.whc.ca
ark2.whc.ca

Once the DNS changes propagate (can take several hours) your new account can be used to start accepting and sending new email and upload new website copy quickly. These accounts will remain free of charge until at least January 1, 2022 and are intended as a stopgap measure until a more permanent solution is found.
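
For anyone checking whether the switch has taken effect, here is a minimal sketch of the comparison. It assumes you have already collected the nameservers your domain currently reports (for example by pasting the output of `dig NS yourdomain.ca +short`); the function names are hypothetical, and only the two `ark` hostnames come from the update above.

```python
from collections.abc import Iterable

# Nameservers listed in the WHC update for LifeBoat accounts.
EXPECTED_NAMESERVERS = frozenset({"ark1.whc.ca", "ark2.whc.ca"})


def normalize(host: str) -> str:
    """Lowercase a hostname and drop surrounding whitespace and the trailing dot."""
    return host.strip().rstrip(".").lower()


def nameservers_match(
    observed: Iterable[str],
    expected: Iterable[str] = EXPECTED_NAMESERVERS,
) -> bool:
    """Return True when the observed nameserver set equals the expected set."""
    return {normalize(h) for h in observed} == {normalize(h) for h in expected}
```

A result of `False` right after the change does not necessarily mean something is wrong, since different resolvers can keep serving the old records until their caches expire.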

Accounts on servers being restored now

The restoration process is advancing well and we hope to be able to fully recover all accounts on these servers from backups by end of day Monday, and hopefully sooner. Here’s the latest progress:

- Beaubien is at 62%
- Peel is at 45%
- Atwater is at 45%
- Rachel is at 31%
 
Update August 30, 2021 - 9:00AM

The restoration process is advancing for the 4 servers where backups are available:

- Beaubien is at 88%
- Peel is at 65%
- Atwater is at 65%
- Rachel is at 64%

For clients on servers with damaged backups, we recommend you initiate your disaster recovery plan ASAP. We have created hosting accounts in your Client Area called "LifeBoats" to help you get up and running while we explore long-term data recovery options. The impacted servers are:
- Clark
- Drummond
- Acadie
- Bernard
- Bishop
 
Wow…this is a certified mess :eek:
 
The million-dollar question is what happened and whether the company can survive this.

Even the fact that FM has not commented indicates something pretty big.
 
I am thinking I should get my stuff out of Siber, although at this point they seem to be OK.
This is why I was hoping that, in the takeover, WHC wouldn’t try to mess with Siber’s already pretty successful TBR system (and I'm glad they haven’t so far).
 
Major Incident: What happened?

Emil Falcon
August 30, 2021

It’s been a tough weekend here at WHC and by this, I include our clients. I want to start by thanking all the team for coming together and working through the problem constructively and with tremendous heart and energy.

Here’s the situation.

Based on our investigation to date, the morning of August 28 at approximately 6AM, an individual with a third-party service provider used their privileged account access to connect to one of our datacenter’s management portals and without authorization, initiated server reimaging on some of our backup servers, then on some of our production servers.

Within only hours our incident response team had identified the issue and disabled access to the source account, preventing any further damage. The environment was secured, the individual fully locked out, and our disaster recovery plan immediately kicked into action but damage was already done.

The tally of the incident, however, was and still is important: a few major systems, including some production servers and some backup servers were damaged, with a large number of web hosting and reseller hosting accounts affected, resulting in possible permanent data loss.

After nearly 2 days of tedious work and a combination of external datacenter backup restores and system-level storage rebuilding, our team was able to successfully recover (or is in the process of recovering) over 50% of those lost accounts. We can confirm that Cloud, Dedicated, Weebly and Managed WordPress accounts were largely unaffected.

Unfortunately, at the moment, I can tell you that several production servers and their respective backup servers are still in an unrecoverable state and the data recovery experts assisting us believe that the recovery potential is very low. As such, the focus for these accounts has shifted from data recovery to starting fresh on new, clean accounts. In parallel we will continue to attempt to recover any data we can.

For clients impacted by this incident and for which we are unable to recover backups:

- If you have an adequate local backup: contact our support team and we will get you up and running on a new account ASAP
- If you do not have a local backup: You will need to start over from a bare, empty account. To this end, we have activated new, free hosting accounts for each impacted domain, called LifeBoat. They are available in your Client Area now and are intended to serve as a quick, free and immediate way for you to start over. These LifeBoat accounts will remain free of charge until at least January 1, 2022.

On behalf of WHC, I would like to extend our sincere apologies to those affected by this unfortunate situation. With the cooperation of those particular clients affected by the incident, we believe that we can greatly minimise the consequences stemming from this involuntary incident.

We remain committed as ever to providing you with quality and affordable hosting solutions. We understand and regret the impact that this incident may have on your business and operations.

We are also grateful and moved by the outpouring of support we have received as we continue working to tackle this issue.

Sincerely,

Emil Falcon
CEO at Web Hosting Canada
 
theinvestor said:
Based on our investigation to date, the morning of August 28 at approximately 6AM, an individual with a third-party service provider used their privileged account access to connect to one of our datacenter’s management portals and without authorization, initiated server reimaging on some of our backup servers, then on some of our production servers.

Server reimaging - does that not mean trying to initiate a backup from a previous server image?

If that were the case, the servers should have been restored to their state at an earlier date, right?

Not sure this provides any answers: was it malicious intent, or was it an accident?

WHC needs to be fully transparent with this and then state how an event like this can be prevented in the future. Anything else will cause uncertainty and probably an exodus of clients.

In plain English: what happened, and how can it be prevented from ever happening again?
 
