How to speed up accessing lots of files on another computer? Some kind of local cache?
Title is TLDR. More info about what I'm trying to do below.
My daily driver computer is Laptop with an SSD. No possibility to expand.
So for storage of lots n lots of files, I have an old, low resource Desktop with a bunch of HDDs plugged in (mostly via USB).
I can access Desktop files via SSH/SFTP on the LAN. But it can be quite slow.
And sometimes (not too often; this isn't a main requirement) I take Laptop to use elsewhere. I do not plan to make Desktop available outside the network so I need to have a copy of required files on Laptop.
Therefore, sometimes I like to move the remote files from Desktop to Laptop to work on them. To make a sort of local cache. This could be individual files or directory trees.
But then I have a mess of duplication. Sometimes I forget to put the files back.
Seems like Laptop could be a lot more clever than I am and help with this. Like could it always fetch a remote file which is being edited and save it locally?
Is there any way to have Laptop fetch files, information about file trees, etc, located on Desktop when needed and smartly put them back after editing?
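To make the "put them back" part concrete, here is a rough sketch of the bookkeeping I have in mind, not a finished tool. It assumes a small JSON manifest on Laptop; the names record_checkout and pending_returns and the manifest layout are made up for illustration. It only remembers what was fetched and reports which local copies have changed since then. The actual transfer would still be done by something like rsync.

```python
import json
import os
import time


def _load(manifest_path):
    """Read the manifest, or return an empty one if it does not exist yet."""
    if not os.path.exists(manifest_path):
        return {}
    with open(manifest_path) as f:
        return json.load(f)


def record_checkout(manifest_path, remote_path, local_path):
    """Remember that local_path is a working copy of remote_path."""
    manifest = _load(manifest_path)
    manifest[local_path] = {
        "remote": remote_path,
        "mtime": os.path.getmtime(local_path),  # modification time at fetch
        "fetched_at": time.time(),
    }
    with open(manifest_path, "w") as f:
        json.dump(manifest, f, indent=2)


def pending_returns(manifest_path):
    """List (local, remote) pairs whose local copy changed since the fetch."""
    pending = []
    for local, info in sorted(_load(manifest_path).items()):
        if os.path.exists(local) and os.path.getmtime(local) != info["mtime"]:
            pending.append((local, info["remote"]))
    return pending
```

Running pending_returns before shutting down or leaving the house would give a list of edited copies that still need to go back to Desktop.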
Or even keep some stuff around. Like lists of files, attributes, thumbnails etc. Even browsing the directory tree on Desktop can be slow sometimes.
I am not sure what this would be called.
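For the "keep lists of files and attributes around" idea, a minimal sketch of what I mean is below. It walks a directory tree once and stores each file's relative path, size and modification time in a local SQLite file, so later lookups never touch the slow disks. The assumptions are mine: the tree is reachable as an ordinary path (for example through a mount, or by running this on Desktop and copying the database over), and build_index and search are hypothetical names.

```python
import os
import sqlite3


def build_index(root, db_path):
    """Rebuild the index of every file under root; return the file count."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS files")
        conn.execute(
            "CREATE TABLE files (path TEXT PRIMARY KEY, size INTEGER, mtime REAL)"
        )
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.stat(full)
                except OSError:
                    continue  # skip files that vanish or cannot be read
                rel = os.path.relpath(full, root)
                conn.execute(
                    "INSERT INTO files VALUES (?, ?, ?)",
                    (rel, st.st_size, st.st_mtime),
                )
    count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
    conn.close()
    return count


def search(db_path, text):
    """Return (path, size) rows whose path contains text."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT path, size FROM files WHERE path LIKE ? ORDER BY path",
        (f"%{text}%",),
    ).fetchall()
    conn.close()
    return rows
```

The index goes stale as soon as files change on Desktop, so it would need to be rebuilt on some schedule. It only speeds up finding things, not opening them.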
Ideas and tools I am already comfortable with:
rsync is the most obvious foundation to work from but I am not sure exactly what would be the best configuration and how to manage it.
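As a starting point, here is a minimal sketch of the two rsync invocations I imagine wrapping: one to pull a tree from Desktop and one to push edits back. The host name "desktop" and all paths are placeholders, and pull_command and push_command are hypothetical helper names. The flags themselves (-a, --partial, --update, --dry-run, -e ssh) are standard rsync options. The functions only build the command; nothing is transferred until it is passed to subprocess.run.

```python
def pull_command(remote, remote_path, local_dir):
    """Build an rsync command that copies remote_path on remote to local_dir."""
    return [
        "rsync",
        "-a",         # archive mode: recurse, keep times and permissions
        "--partial",  # keep partially transferred files for resuming
        "--update",   # skip files that are newer on the receiving side
        "-e", "ssh",  # transfer over SSH
        f"{remote}:{remote_path}",
        local_dir,
    ]


def push_command(local_path, remote, remote_dir, dry_run=True):
    """Build the reverse command; defaults to a dry run as a safety net."""
    cmd = ["rsync", "-a", "--partial", "--update", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change, copy nothing
    cmd += [local_path, f"{remote}:{remote_dir}"]
    return cmd


# Example with placeholder host and paths:
#   import subprocess
#   subprocess.run(pull_command("desktop", "/mnt/hdd1/site/", "/home/me/cache/site/"), check=True)
```

A trailing slash on the source path tells rsync to copy the contents of the directory rather than the directory itself, so the slashes in the placeholder paths matter.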
luckyBackup is my favorite rsync GUI front end; it lets you save profiles, jobs etc., which is sweet.
FreeFileSync is another GUI sync tool I've used, but I am preferring lucky/rsync these days.
I don't think git is a viable solution here because there are already git directories included, there are many non-text files, and some of the directory trees are so large that they would cause git to choke looking at all the files.
syncthing might work. I've been having issues with it lately but I may have gotten these ironed out.
Something a little more transparent than the above would be cool but I am not sure if that exists?
Any help appreciated, even just ideas on what to web search for, because I am stumped even on that.
Hey, I’m replying again directly to your post in the hopes that I can push back against some of the advice you’re getting. My intent is to do an end run around arguing with the people making these suggestions. They’re very smart and made them for good reasons, but their ideas aren’t necessarily good for you, and I don’t want you to have to go through a troublesome recovery like I did and like many people on the internet have.
Do not under any circumstances set up RAID or zpools for your data drives once you get them inside a case and on the PCIe bus somehow.
In these configurations accessing a file typically requires spinning up all the drives in the array or pool. Not only does that put wear and tear on your drives, it increases the temperature of the case and draws much more power. Those conditions lead to drive failure. When your drive fails and you have a spare to use in its place, resilvering (the process of using redundant data, such as parity, to rebuild the contents of the failed drive on the spare one) will put those exact conditions on your remaining drives.
For people like us, who may not have a hot spare, or great cooling, or an offsite backup, an array like that will set us up for failure rather than resilience.
Please consider using mergerfs or something like it and a snapshot parity system like snapraid instead.
There are very good use cases for the RAID and zpool systems that have been brought up, but you aren’t there. I got there at moderate expense and moved away from them.
thanks I appreciate it. I've been around the block enough times to expect maximalist advice in places like this. people who are motivated to be hanging around in a forum just waiting for someone to ask a question about hard drives are coming from a certain perspective. Honestly, it's not my perspective. But the information is helpful in totality even though I'm unlikely to end up doing what any one person suggests.
RAID is something I've seen mentioned over and over again. Every year or two I go reading about them more intentionally and never get the impression it's for me. Too elaborate to solve problems I don't have.