Well, here's my story; it might be useful to others too.
I have a home server with a 6 TB RAID1 array (OS on a dedicated NVMe drive). I was playing with a BIOS update and adding more RAM, and out of the blue, after the last reboot, my RAID had shut down uncleanly and needed a fix. I probably unplugged the power cord too soon while the system was still shutting down containers.
Well, no biggie, I'd just run fsck and mount it. So there it goes:
"mkfs.ext4 /dev/md0"
Then I quickly hit "y" when it said "the partition contains an ext4 signature blah blah". I was in a hurry, so...
Guess what? Read that command again, carefully.
I hit Ctrl+C, but it was already too late. I could recover some of the files, but many were corrupted anyway.
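For anyone who hasn't spotted it yet: fsck checks and repairs an existing filesystem, while mkfs creates a brand-new one, wiping whatever was there. A read-only dry run first would have made the difference obvious (a sketch using the device name from the story; the flags are standard e2fsck options):

```shell
# What was intended -- check the filesystem, read-only first:
fsck.ext4 -n /dev/md0    # -n: answer "no" to every repair prompt, just report

# If errors are reported, then actually repair:
fsck.ext4 -p /dev/md0    # -p: automatically fix problems that are safe to fix

# What was actually typed -- creates a NEW filesystem over the old one:
# mkfs.ext4 /dev/md0     # destructive; never run on a disk holding data you want
```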
Lucky for me, I was able to recover 85% of everything from my backups (restic+backrest to the rescue!), recreate another 5% (mostly docker compose files located in odd, non-backed-up folders), and recover the last 10% from the old 4 TB drive I had replaced to increase space some time ago. Luckily, that was never-changing old personal stuff that I would have regretted losing, but hadn't considered critical enough to back up.
The cold shivers I had before I checked my restic backup and discovered that I hadn't actually postponed backing up those additional folders...
Today I will add another layer of backup in the form of an external USB drive to store never-changing data like... my ISOs...
This was my backup strategy up to yesterday.
I have backrest automating restic:
a local backup of the important stuff (mostly personal data)
a second copy of the important stuff on a USB drive connected to an OpenWrt router on the other side of the house
a third copy of the important stuff on a remote VPS
And since this morning I have added:
a few git repos (pushed, and backed up with the important stuff) with all my docker compose files, keys and such (the 5%)
an additional local USB drive where I will back up ALL files, even that 10% which never changes and isn't "important", but which I would miss if I lost it.
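For anyone wanting to set up something similar, the restic commands behind a strategy like this look roughly as follows (repository paths are made-up placeholders, not my actual layout; backrest just drives these same operations from a web UI):

```shell
# Initialize one repository per destination (local disk, USB drive, remote VPS):
restic -r /mnt/backup-usb/restic init

# Back up the important stuff; run the same against each destination:
restic -r /mnt/backup-usb/restic backup /home/user/important

# Periodically list snapshots and verify repository integrity:
restic -r /mnt/backup-usb/restic snapshots
restic -r /mnt/backup-usb/restic check

# Restoring is just as simple -- and worth rehearsing before you need it:
restic -r /mnt/backup-usb/restic restore latest --target /tmp/restore-test
```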
Tools like restic and Borg are so critical that you will regret not having had them sooner.
Set up your backups, like, yesterday. If you haven't already, do it now.
In 2010 I self-hosted a Xen hypervisor and used it for everything I could. It was fun!
I had a drive failure in my main RAID 5 array, so I bought a new disk. When it came time to swap it out, I pulled the wrong disk.
It was hardware RAID on an Adaptec card, and the card threw the array out and refused to bring it back together. I couldn't afford professional recovery. I remember I just sat there, dumbfounded, in disbelief. I went through all the stages of grief.
I was in the middle of a migration from OVH to Hetzner, and it happened at a time when I had yet to reconfigure remote backups.
I lost all my photos of our first child. Luckily some of them were digitised from developed physical media which we still had. But a lot was lost.
This was my lesson.
I now have Veeam, Proxmox Backup Server, BackupPC and Borg. Backups are hosted in two online locations plus a separate physical server elsewhere in the house.
Backups are super, SUPER important. I've lost count of how many times I've logged into BackupPC to restore a file or folder because I did a silly. And it's always reassuring how easy it is to restore.
I have all my photos on my NAS, with backups to OVH storage for recent data and AWS Glacier for folders of pictures older than two years. I can afford to lose all my ISOs, but not all the pictures of my child. I've never tried to recover data from AWS Glacier, but I trust them.
For 280 GB on Glacier: around 1 USD per month.
For 400 GB of hot storage on OVH Public Cloud: around 5 EUR per month.
In my process I have to sort pictures and video before sending them to cold storage, because I don't want to cold-store all the failed footage; obviously I have some delay here. That's something I usually do during the long winter evenings 😊
Tools like restic and Borg are so critical that you will regret not having had them sooner.
100000%
I just experienced this when a little mini PC I bought less than two years ago cooked its NVMe and died this month. Guess who had been meaning to set up backups on that guy for months?
Unfortunately, that NVMe is D. E. D. Even more unfortunately, it held a few critical systems for the local network (like my network controller/DHCP). Thankfully it was mostly docker containers, so the services came back up pretty easily, but I lost my DBs, so configs and data need to be recreated :(
The first task on the new box was figuring out and automating Borg so this doesn't happen again. I also set up backups via my new Proxmox server, so my VMs won't have that problem either.
Now to do the whole 'actually testing the backups' thing.
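For the "actually testing the backups" part, a Borg routine could look roughly like this (repo path and archive naming are illustrative assumptions, not my exact setup; passphrase handling is omitted):

```shell
# One-time: create an encrypted repository
borg init --encryption=repokey /mnt/backup/borg

# Nightly (cron or systemd timer): create a date-stamped archive
borg create --stats /mnt/backup/borg::'{hostname}-{now}' /srv/docker /etc

# Prune old archives to keep the repository bounded
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=6 /mnt/backup/borg

# The actual test: verify repository consistency, then do a trial restore
borg check /mnt/backup/borg
borg extract --dry-run \
    /mnt/backup/borg::$(borg list --last 1 --format '{archive}' /mnt/backup/borg)
```

A `--dry-run` extract of the latest archive at least proves the data is readable; restoring a real file somewhere disposable now and then is even better.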
Plus, even if you manage to never, ever have a drive fail, accidentally delete something you wanted to keep, inadvertently screw up a filesystem, crash into a corruption bug, have malware destroy stuff, make an error writing a script that causes it to wipe data, realize too late that an old version of something you overwrote was still something you wanted, or run into any of the other ways in which you could lose data...
You gain the peace of mind of knowing that your data isn't a single point of failure away from being gone. I remember some pucker-inducing moments before I ran backups. Even aside from not losing data on a number of occasions, I could sleep a lot more comfortably at the times that weren't those occasions.
I think these kinds of situations are where ZFS snapshots shine: you're back in a matter of seconds with no data loss (assuming you have a recent snapshot from before the mistake).
Edit: yeah no, if you operate at the disk level directly, no local ZFS snapshot could save you...
I'm not sure; I read that ZFS can help in the case of ransomware, so I assumed it would extend to accidental formatting, but maybe there's a key difference.
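The key difference, as I understand it: ransomware encrypts files through the filesystem, and read-only snapshots sit below the file layer, so they survive; running mkfs on the raw device rewrites the blocks the pool itself lives on, taking datasets and snapshots with it. A rough sketch of both cases (pool and dataset names are made up):

```shell
# Snapshots protect against mistakes made THROUGH the filesystem:
zfs snapshot tank/data@before-upgrade
rm -rf /tank/data/photos              # oops (or ransomware encrypting files)
zfs rollback tank/data@before-upgrade # back in seconds, no data loss

# But snapshots live on the same disks as the pool. Anything that writes
# to the raw device underneath -- mkfs.ext4 /dev/sdX, dd, a dying
# controller -- destroys pool, datasets and snapshots together.
# Only a copy on separate hardware survives that:
zfs send tank/data@before-upgrade | ssh backuphost zfs recv pool/data
```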