GitLab’s Recent Outage Shows The Importance Of Verified Backups For Disaster Recovery


March 8, 2017 in Disaster Recovery

If you ask a system administrator what their worst nightmare is, you’ll get a variety of answers, but high on the list will be the moment they realise the “rm” command they just ran is deleting the wrong data. That’s essentially what happened to an unfortunate engineer at GitLab, resulting in downtime and the loss of production data.

The problem was compounded by a lack of good backups, which made it impossible for GitLab to recover quickly and meant some of the accidentally deleted data was gone forever.

GitLab is a SaaS version control platform much like GitHub. Although it’s nowhere near as popular, GitLab has some useful features that GitHub lacks, including nicer issue tracking, and those features are driving increasing adoption among open source projects. As you can imagine, losing data is about the worst thing that can happen to a version control platform.

You can read the full incident post mortem on GitLab’s blog. In a nutshell: under heavy load from suspected spammers, replication of GitLab’s primary database to a redundant secondary was failing. Engineers attempted to run the replication process manually, but it repeatedly failed. Suspecting that stale files on the secondary server were the cause, one engineer decided to delete those files. Unfortunately, the “rm” command was run on the primary server, and many gigabytes of data were lost before the engineer realised his mistake.

It’s worth mentioning that GitLab’s handling of the incident was a paragon of transparency and openness. The company immediately began communicating with users about the cause of the downtime and kept communicating throughout the incident.

Less impressive was the disaster recovery plan that GitLab had in place. Deleting a massive chunk of production data is a very bad thing, but it needn’t be catastrophic. With proper backups in place, it shouldn’t take more than a few minutes to sync the deleted files back to the production server. In this case, when GitLab’s engineers looked in the S3 bucket that was supposed to contain their database backups, the cupboard was bare. The backup scripts relied on a database dump tool that didn’t support the database version GitLab was running, so every backup failed silently.
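A silent failure like this is usually avoidable if the backup job treats a non-zero exit status or an empty dump as an error. The following is a minimal sketch of that idea, not GitLab's actual script: `run_backup`, `BackupError`, and the `min_size_bytes` threshold are hypothetical names chosen for illustration, and the dump command is whatever tool your database provides.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


class BackupError(RuntimeError):
    """Raised when a backup job fails or produces no usable output."""


def run_backup(command, output_path, min_size_bytes=1):
    """Run a dump command, write its stdout to output_path, and fail loudly.

    `command` is an argument list for the dump tool (hypothetical here).
    Returns the size of the dump in bytes.
    """
    output_path = Path(output_path)
    with output_path.open("wb") as out:
        result = subprocess.run(command, stdout=out, stderr=subprocess.PIPE)

    # A non-zero exit status means the tool itself reported a failure.
    if result.returncode != 0:
        message = result.stderr.decode(errors="replace").strip()
        raise BackupError(
            f"dump exited with status {result.returncode}: {message}"
        )

    # A "successful" run that wrote nothing is still a failed backup.
    size = output_path.stat().st_size
    if size < min_size_bytes:
        raise BackupError(
            f"dump is only {size} bytes, expected at least {min_size_bytes}"
        )
    return size


if __name__ == "__main__":
    # Stand-in for a real dump tool: a child Python process that prints a line.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "dump.sql"
        size = run_backup(
            [sys.executable, "-c", "print('dump contents')"], target
        )
        print(f"wrote {size} bytes")
```

The raised exception is only useful if something notices it, so in practice the job would be wired to whatever alerting the team already watches.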

We all make mistakes. I doubt there’s a system administrator or IT professional reading this who hasn’t accidentally deleted the wrong thing. It happens. But because we know it happens, we should put processes in place to ensure that recovery is straightforward. It’s great that GitLab had backup processes, but an unverified backup is worthless. Regular backup verification should be part of every company’s disaster recovery and business continuity plan.
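As a rough illustration of what a scheduled verification step might look like, the sketch below checks that a backup file exists, is recent, and is not suspiciously small. The function name `verify_backup` and its thresholds are assumptions made for this example. It is only a first line of defence: it would have caught an empty backup location, but real verification also means periodically restoring a backup and confirming the data is usable.

```python
import os
import tempfile
import time
from pathlib import Path


def verify_backup(path, max_age_seconds, min_size_bytes, now=None):
    """Return a list of problems with the backup file; empty means it passed.

    `now` can be supplied (as a Unix timestamp) to make the check testable.
    """
    path = Path(path)
    if not path.is_file():
        return [f"{path} does not exist"]

    now = time.time() if now is None else now
    stat = path.stat()
    problems = []

    # An empty or tiny file usually means the dump tool failed part-way.
    if stat.st_size < min_size_bytes:
        problems.append(
            f"{path} is {stat.st_size} bytes, expected at least {min_size_bytes}"
        )

    # A stale file means the backup job has stopped running or stopped writing.
    age = now - stat.st_mtime
    if age > max_age_seconds:
        problems.append(
            f"{path} is {age:.0f} seconds old, limit is {max_age_seconds}"
        )
    return problems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        backup = os.path.join(tmp, "db.dump")
        # The file is missing at this point, so one problem is reported.
        print(verify_backup(backup, max_age_seconds=86400, min_size_bytes=1024))
```

Returning a list of problems rather than a single boolean lets the caller report everything that is wrong in one alert.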
