this post was submitted on 26 Feb 2026

How do you effectively backup your high capacity (20+ TB) local NAS? (lemmy.world)

submitted 22 hours ago* (last edited 22 hours ago) by NekoKoneko@lemmy.world to c/selfhosted@lemmy.world

I have a 56 TB local Unraid NAS that is parity protected against single drive failure, and while I think a single drive failing and being parity recovered covers data loss 95% of the time, I'm always concerned about two drives failing or a site-/system-wide disaster that takes out the whole NAS.

For other larger local hosters who are smarter and more prepared, what do you do? Do you sync it off site? How do you deal with cost and bandwidth needs if so? What other backup strategies do you use?

(Sorry if this standard scenario has been discussed - searching didn't turn up anything.)

[–] PieMePlenty@lemmy.world 3 points 1 hour ago

Not all data is equal. I back up things I absolutely cannot lose and yolo everything else. My love for this hobby does not extend to buying racks of hard drives.

[–] INeedMana@piefed.zip 1 points 41 minutes ago

I've been following this post since the first comment.

And I have just put together my own RAID1 1TB NAS. And I didn't think that 1TB would serve me forever, more like "a good start".

But the numbers I've been seeing in here... you guys are nuts 😆

[–] randombullet@programming.dev 2 points 4 hours ago

I have 3 main NASes

78TB (52TB usable) hot storage, ZFS RAIDZ1

160TB (120TB usable) warm storage, ZFS RAIDZ2

48TB (24TB usable) off site, ZFS mirror

I rsync every day from hot to off site.

And once a month I turn on my warm storage and sync it.

Warm and hot storage is at the same location.

Off site storage is with a family friend who I trust. Data isn't encrypted aside from in transit. That's something else I'd like to mess with later.

Core vital data is sprinkled around different continents with about 10TB. I have 2 nodes in 2 countries for vital data. These are with family.

I think I have 5 total servers.

Cost is a lot obviously, but pieced together over several years.

The world will end before my data gets destroyed.
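
A daily hot-to-off-site sync like the one described could look roughly like the sketch below. The paths and the `backup@offsite.example` host are hypothetical, the `RSYNC` override exists only in this sketch so the command can be previewed, and the flags are one reasonable choice rather than the exact setup used above.

```bash
#!/bin/sh
# offsite_sync <source_dir/> <user@host:/dest/dir/>
# Copies new and changed files to the off-site box over SSH.
# RSYNC can be overridden (e.g. RSYNC=echo) to preview the command.
offsite_sync() (
    src=$1; dest=$2
    # -a         preserve permissions, times, symlinks, etc.
    # --partial  keep partially transferred files so big files can resume
    # -e ssh     encrypts in transit only; data at rest stays as-is
    # Add --delete only if deletions on the source should be mirrored
    # too; it makes the copy follow accidental deletions as well.
    ${RSYNC:-rsync} -a --partial -e ssh "$src" "$dest"
)
```

Previewing with `RSYNC=echo offsite_sync /mnt/hot/ backup@offsite.example:/tank/hot/` prints the arguments without copying anything.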

[–] dmention7@midwest.social 25 points 12 hours ago (1 children)

Personally I deal with it by prioritizing the data.

I have about the same total size Unraid NAS as you, but the vast majority is downloaded or ripped media that would be annoying to replace, but not disastrous.

My personal photos, videos and other documents, which are irreplaceable, only make up a few TB, which makes it pretty manageable to maintain true local and cloud backups.

Not sure if that helps at all in your situation.

[–] Burninator05@lemmy.world 1 points 8 hours ago

I have data that I actually care about in a RAIDZ1 array with a hot standby and it is synced to the cloud. The rest (the vast majority) is in a RAIDZ5. If I lose it, I "lose" it. It's recoverable if I decide I want it again.

[–] GenderNeutralBro@lemmy.sdf.org 15 points 14 hours ago (2 children)

You'll think I'm crazy, and you're not wrong, but: sneakernet.

Every time I run the numbers on cloud providers, I'm stuck with one conclusion: shit's expensive. Way more expensive than the cost of a few hard drives when calculated over the life expectancy of those drives.

So I use hard drives. I periodically copy everything to external, encrypted drives. Then I put those drives in a safe place off-site.

On top of that, I run much leaner and more frequent backups of more dynamic and important data. I offload those smaller backups to cloud services. Over the years I've picked up a number of lifetime cloud storage subscriptions from not-too-shady companies, mostly from Black Friday sales. I've already gotten my money's worth out of most of them and it doesn't look like they're going to fold anytime soon. There are a lot of shady companies out there so you should be skeptical when you see "lifetime" sales, but every now and then a legit deal pops up.

I will also confess that a lot of my data is not truly backed up at all. If it's something I could realistically recreate or redownload, I don't bother spending much of my own time and money backing it up unless it's, like, really really important to me. Yes, it will be a pain in the ass when shit eventually hits the fan. It's a calculated risk.

I am watching this thread with great interest, hoping to be swayed into something more modern and robust.

[–] irmadlad@lemmy.world 3 points 11 hours ago* (last edited 10 hours ago)

That is old-old-school. It works tho. You have to be a bit scheduled about it, to encompass current and future important data. IIRC AWS created a 100 petabyte drive and a truck to haul it around to basically do the same thing, just in much larger amounts.

[–] MightyLordJason@lemmy.world 3 points 11 hours ago

Sneakernet crew here too. My work offsite backup is in my backpack. A few times per week I do a sync, which takes a few minutes, and take it home again. (The sync archives old versions of files and the drive is encrypted.)

We tried several cloud-based solutions and they were all rather expensive or just plain hard to run to completion or both.

[–] quick_snail@feddit.nl 2 points 9 hours ago

Tape or Backblaze

[–] Shadow@lemmy.ca 78 points 22 hours ago (5 children)

I don't. Of my 120tb, I only care about the 4tb of personal data and I push that to a cloud backup. The rest can just be downloaded again.

[–] NekoKoneko@lemmy.world 12 points 22 hours ago (7 children)

Do you have logs or software that keeps track of what you need to redownload? A big stress for me with that method is remembering or keeping track of what is lost when I and software can't even see the filesystem anymore.

[–] tal@lemmy.today 15 points 20 hours ago* (last edited 19 hours ago) (2 children)

I don't know of a pre-wrapped utility to do that, but assuming that this is a Linux system, here's a simple bash script that'd do it.

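#!/bin/bash

# Set this.  Path to a directory that will retain a copy of a list
# of your files.  You probably don't actually want this in /tmp, or
# it'll be wiped on reboot.

file_list_location=/tmp/storage-history

# Set this.  Path to location with files that you want to monitor.

path_to_monitor=path-to-monitor

# If the file list location doesn't yet exist, create it (including any
# missing parent directories) and turn it into a git repository.
if [[ ! -d "$file_list_location" ]]; then
    mkdir -p "$file_list_location"
    # --initial-branch needs git 2.28 or newer; it keeps the branch name
    # in line with the checkout below whatever git's configured default is.
    git -C "$file_list_location" init --initial-branch=master
fi

# In case someone's checked out things at a different time.  Skipped on
# the very first run, when there is no commit to check out yet.
if git -C "$file_list_location" rev-parse --verify --quiet HEAD >/dev/null; then
    git -C "$file_list_location" checkout --quiet master
fi
find "$path_to_monitor" | sort > "$file_list_location/files.txt"
# Paths passed to git are resolved relative to the -C directory.
git -C "$file_list_location" add files.txt
git -C "$file_list_location" commit -m "Updated file list for $(date)"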

That'll drop a text file at /tmp/storage-history/files.txt with a list of the files at that location, and create a git repo at /tmp/storage-history that will contain a history of that file.

When your drive array kerplodes or something, your files.txt file will probably become empty if the mount goes away, but you'll have a git repository containing a full history of your list of files, so you can go back to a list of the files there as they existed at any historical date.

Run that script nightly out of your crontab or something ($ crontab -e to edit your crontab).

As the script says, you need to choose a file_list_location (not /tmp, since that'll be wiped on reboot), and set path_to_monitor to wherever the tree of files is that you want to keep track of (like, /mnt/file_array or whatever).

You could save a bit of space by adding a line at the end to remove the current files.txt after generating the current git commit if you want. The next run will just regenerate files.txt anyway, and you can use git to regenerate a copy of the file as of any historical day you want. If you're not familiar with git, $ git log to find the hashref for a given day, $ git checkout <hashref> to move to where things were on that day.

EDIT: Moved the git checkout up.
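
Once that history exists, working out what went missing is a matter of comparing an older file list against the newest one. The helper below is only a sketch: the name `list_lost_files` is made up for illustration, and it assumes the repository holds a single `files.txt` as in the script above and that `HEAD` points at the most recent commit.

```bash
#!/bin/sh
# list_lost_files <repo_dir> <older_commit>
# Prints every path that was listed in files.txt at <older_commit>
# but is absent from the most recently committed files.txt.
list_lost_files() (
    repo=$1; older=$2
    tmp=$(mktemp -d) || exit 1
    # Re-sort under LC_ALL=C so comm sees one consistent ordering,
    # whatever locale the original `sort` ran in.
    git -C "$repo" show "$older:files.txt" | LC_ALL=C sort > "$tmp/then"
    git -C "$repo" show "HEAD:files.txt"   | LC_ALL=C sort > "$tmp/now"
    # comm -23 keeps lines unique to the first file: listed then, gone now.
    LC_ALL=C comm -23 "$tmp/then" "$tmp/now"
    rm -rf "$tmp"
)
```

Pick a hashref from `git log` dated before the failure and run `list_lost_files /path/to/storage-history <hashref>`. If the array has died and the nightly job has run since, the newest list is empty, so the output is simply everything that used to be there.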

[–] Sibbo@sopuli.xyz 24 points 22 hours ago (2 children)

If you can't remember what you lost, did you really need it to begin with?

Unless it's personal memories of course.

[–] Onomatopoeia@lemmy.cafe 11 points 21 hours ago* (last edited 21 hours ago) (5 children)

I can't remember the name of an Excel spreadsheet I created years ago, which has continually matured with lots of changes. I often have to search for it among the many I have for different purposes.

Trusting your memory is a naive, amateur approach.

[–] a_non_monotonic_function@lemmy.world 5 points 14 hours ago

If the spreadsheet is important it sounds like it would be part of the 4 TB that was backed up.

[–] ExcessShiv@lemmy.dbzer0.com 6 points 20 hours ago

The key here being that you actually remember the file exists, because it's important. Some other random spreadsheet you don't even remember exists because you haven't needed it since forever is probably not all that important to back up.

If you lose something without ever realizing you lost it, it was not important so there would be no reason to make a backup.

[–] three@lemmy.zip 1 points 15 hours ago

Psst, you missed the point and need to re-read the thread.

[–] kurotora@lemmy.world 16 points 22 hours ago

In my case, for Linux ISOs, I only need to log in to my usual private trackers and re-download my leeched torrents. For more niche content, like old-school TV shows in the local language, I would rely on the community. For even more niche content, like tankoubons only available at the time on DD services, I have a specific job, which also relies on the same backup provider that I'm using for personal data.

Also, as it's important to remind everyone, you must encrypt your backup no matter where you store it.

[–] BakedCatboy@lemmy.ml 9 points 21 hours ago (1 children)

My *arrstack DBs are part of my backed up portion, so they'll remember what I have downloaded in my non-backed up portion.

[–] Cyber@feddit.uk 4 points 14 hours ago

What are your recovery needs?

It's OK to take 6 months to back up to a cloud provider, but do you need all your data to be recovered in a short period of time? If so, cloud isn't the solution; you'd need a duplicate set of drives nearby (but not close enough to be hit by the same flood, fire, etc.).

But if you're OK waiting for the data to download again (and check the storage provider's costs for that specific scenario), then your main factor is how much data changes after that initial upload.
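
To put a number on the upload-time question, here is a back-of-the-envelope sketch. The 40 Mbit/s uplink is a made-up figure and the function name `upload_days` is just for illustration; it assumes decimal terabytes and a link that stays saturated around the clock, which real connections rarely manage.

```bash
#!/bin/sh
# upload_days <size_in_TB> <upload_speed_in_Mbit_per_s>
# Prints the number of days needed to push that much data upstream.
upload_days() {
    # bits to send / bits per second = seconds; 86400 seconds per day
    LC_ALL=C awk -v tb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", (tb * 8e12) / (mbps * 1e6) / 86400 }'
}

upload_days 56 40   # the OP's 56 TB over the hypothetical uplink → 129.6
upload_days 4 40    # a 4 TB "irreplaceable" subset, same link    → 9.3
```

The same function works for the download direction: plug in the downstream speed to see how long a full restore would keep you waiting.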

[–] Mister_Hangman@lemmy.world 1 points 11 hours ago

Definitely following this

[–] billwashere@lemmy.world 5 points 16 hours ago (1 children)

With another large NAS.

[–] Cyber@feddit.uk 3 points 14 hours ago (1 children)

In a different location

[–] billwashere@lemmy.world 1 points 13 hours ago

Well I personally have about 50tb, with one local copy and one remote copy but I’m very lucky to have access to old enterprise storage.

[–] unit327@lemmy.zip 4 points 15 hours ago* (last edited 15 hours ago) (1 children)

I use the AWS S3 Glacier Deep Archive storage class, at about $0.001 per GB per month. But your upload bandwidth really matters in this case. I only have a subset of the most important things backed up this way; otherwise it would take months just to upload a single backup. Using rclone sync instead of uploading the whole thing each time helps, but you still have to get that first upload done somehow...
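
As a rough sketch of what the $0.001 per GB per month figure means in practice, the arithmetic for the storage charge alone is below. `monthly_cost` is an illustrative helper, the 56 TB and 4 TB sizes are taken from elsewhere in this thread, and retrieval, request, and early-deletion charges are deliberately not modelled, so check current pricing before relying on it.

```bash
#!/bin/sh
# monthly_cost <size_in_TB> <price_per_GB_per_month_in_USD>
# Prints the monthly storage charge in USD, using 1 TB = 1000 GB.
monthly_cost() {
    LC_ALL=C awk -v tb="$1" -v per_gb="$2" \
        'BEGIN { printf "%.2f\n", tb * 1000 * per_gb }'
}

monthly_cost 56 0.001   # a full 56 TB NAS          → 56.00
monthly_cost 4 0.001    # a 4 TB must-keep subset   → 4.00
```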

I have a complicated system where:

borgmatic backups happen daily, locally
those backups are stored on a btrfs subvolume
a python script will make a read-only snapshot of that volume once a week
the snapshot is synced to s3 using rclone with --checksum --no-update-modtime
once the upload is complete the btrfs snapshot is deleted

I've also set up encryption in rclone so that all the data is encrypted and unreadable by AWS.
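
The weekly part of that pipeline could be sketched in shell as below. This swaps the Python script for plain shell, the paths and the `s3crypt:` remote name are hypothetical, and `DRY_RUN=1` only prints the commands so nothing here touches a real filesystem.

```bash
#!/bin/sh
# Set DRY_RUN=1 to print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# weekly_upload <subvolume> <snapshot_path> <rclone_remote:path>
weekly_upload() (
    subvol=$1; snap=$2; remote=$3
    # 1. Freeze the backup repository in a read-only snapshot.
    run btrfs subvolume snapshot -r "$subvol" "$snap" || exit 1
    # 2. Upload the frozen copy; the flags are the ones quoted above.
    run rclone sync "$snap" "$remote" --checksum --no-update-modtime || exit 1
    # 3. Drop the snapshot only after a successful upload.
    run btrfs subvolume delete "$snap"
)

# Preview with hypothetical paths and a hypothetical crypt remote:
DRY_RUN=1 weekly_upload /mnt/backups/borg /mnt/backups/.snap-weekly s3crypt:borg
```

Leaving the snapshot in place when the upload fails is a design choice in this sketch: the next attempt can resume from the same frozen state instead of a moving target.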

[–] quick_snail@feddit.nl 1 points 9 hours ago

Don't do this. It's a god damn nightmare to delete

[–] danielquinn@lemmy.ca 5 points 16 hours ago (1 children)

Honestly, I'd buy 6 external 20TB drives and make 2 copies of your data on them (3 drives each) and then leave them somewhere-safe-but-not-at-home. If you have friends or family able to store them, that'd do, but a safety deposit box is also good.

If you want to make frequent updates to your backups, you could patch them into a Raspberry Pi and put it on Tailscale, then just rsync changes regularly. Of course, that means wherever you're storing the backup needs room for such a setup.

I often wonder why there isn't a sort of collective backup sharing thing going on amongst self hosters. A sort of "I'll host your backups if you host mine" sort of thing. Better than paying a cloud provider at any rate.

[–] Joelk111@lemmy.world 4 points 14 hours ago* (last edited 14 hours ago)

That NAS software company Linus (of Linus Tech Tips) funded has a feature for this planned I think.

An open-source standalone implementation would be dope as hell. Sure, it'd mean you'd need to double your NAS capacity (as you'd have to provide as much storage as you use), but that's way easier than building a second NAS and storing/maintaining it somewhere else or constantly paying for and managing a cloud backup.

[–] irmadlad@lemmy.world 11 points 19 hours ago (4 children)

I'm not sure if I qualify as a 'larger local hoster' but I would go through your 20 TB and decide what really is important enough to back up in case the wheels fall off. Linux ISOs can be re-downloaded, although it would take a bit of time. My music collection, which I have been accumulating for decades, converted to FLAC, and meticulously tagged, can't be re-downloaded, so that is one of my priorities to back up. Pictures, business documents, and personal documents can't be re-downloaded either, so they go on the 'must back up' list... and so on. Just cull out what is and isn't replaceable. I would bet that once you do that, your 20 TB will be a bit more slim, and you won't be trying to push 20 TB up the pipe to a cloud backup.

I use Backblaze's Personal unlimited tier for $99 USD per year, which is a pretty sweet deal. One thing to remember about Backblaze is that the drives being backed up must be physically connected to the PC doing the backup/uploading. I get around that because I have a hot swap bay on my main PC, but there are other methods and software that will present your NAS or other storage as a physically connected drive.

[–] cmnybo@discuss.tchncs.de 2 points 17 hours ago (1 children)

Backblaze personal doesn't support Linux or BSD, so it would be useless for a NAS.

[–] kaotic@lemmy.world 4 points 15 hours ago (2 children)

Backblaze offers unlimited data on a single computer, $99/year.

There might be some fine print that excludes your setup but might be worth investigating.

https://www.backblaze.com/cloud-backup/pricing

[–] Mister_Hangman@lemmy.world 1 points 11 hours ago

Oh shit.

[–] unit327@lemmy.zip 2 points 15 hours ago (2 children)

only windows (maybe mac)

[–] Joelk111@lemmy.world 2 points 14 hours ago (1 children)

Yeah, people have done workarounds and stuff to get their entire NAS backed up but those seemed sketchy and bad when I looked into it.

[–] osanna@lemmy.vg 1 points 3 hours ago* (last edited 3 hours ago)

If you break their TOS, you'll likely lose your data. So.... be careful. Mind you, I haven't read their TOS, so I don't know if those workarounds are breaking their TOS.

[–] irmadlad@lemmy.world 1 points 13 hours ago

Wine, or there is a Docker container that runs the Backblaze client.

[–] Brkdncr@lemmy.world 13 points 21 hours ago (1 children)

Back up to a 2nd NAS.

Important stuff gets backed up to cloud storage. Whatever is cheapest.

In my case Synology C2 was cheapest.

[–] raicon@lemmy.world 1 points 6 hours ago

C2 seems expensive, I would go with a Hetzner Storage Box + restic

[–] worhui@lemmy.world 8 points 20 hours ago* (last edited 9 hours ago)

LTO tape. But I only have 15TB.

It quickly becomes cost effective when you actually need the data to be safe. Far easier to have off-site backups. I have never had a problem, but I like to have an offline backup. Most of the time my data is static, so I am only backing up project files and changes for the most part.

If you have 40+ TB of dynamic data I can't help there.

Edit: I buy used drives that are usually 2 generations old, so I got LTO-5 drives when LTO-7 was new. The used drives may be less reliable but used drives can be 1/10th the price of the newest ones.

[–] MentalEdge@sopuli.xyz 9 points 22 hours ago* (last edited 22 hours ago) (9 children)

Recently helped someone get set up with Backblaze B2 using Kopia, which turned out fairly affordable. It compresses and de-duplicates, leading to very little storage use, and it encrypts so that Backblaze can't read the data.

Kopia connects to it directly. To restore, you just install Kopia again and enter the same connection credentials to access the backup repository.

My personal solution is a second NAS off-site, which periodically wakes up and connects to mine via VPN. During that window Kopia is set to update my backups.

Kopia figures out which parts of the filesystem have changed very quickly, and only those changes are transferred over during each update.

[–] tommij@lemmy.world 1 points 13 hours ago

ZFS send. Done.

[–] originalucifer@moist.catsweat.com 8 points 22 hours ago

entire nas (~24TB used) is replicated to another nas in another building (2 actually). i like having 3 copies.

[–] Treczoks@lemmy.world 4 points 20 hours ago

As someone who has experienced double failure twice in my lifetime, I seriously recommend doing backups.

The problem is that the only serious backup solution for this size is another HDD. A robot array for tapes or WORM drives is probably out of budget.

[–] FreedomAdvocate@lemmy.net.au 2 points 17 hours ago

I switched to a DAS for my storage and use backblaze to back up all 50TB+. I couldn’t find a cost effective way to do it with a NAS.

[–] Bishma@discuss.tchncs.de 5 points 21 hours ago* (last edited 21 hours ago)

Like others, I have a 2 tier system.

About 2TB of my (Synology) NAS is critical files. Those get sent via Hyper Backup to cloud storage on at least a weekly basis, some daily. I have them broken up into multiple tasks with staggered schedules so it never has much to do on any given day.

The other 16TB I have gets synced (again with Hyper Backup, but not as a scheduled backup task) to a 20TB external drive roughly once per quarter. Then that drive lives in the closet of a family member.
