Hi, someone told me that lemmy.world is running a similar setup to mine and that you might have some insight.
For background, I've got my instance running on 3 servers. The load-balanced Lemmy containers have scheduled tasks disabled, and an extra container handles the scheduled tasks instead.
If all of the Lemmy containers started up and tried to run a database migration at the same time, how would you handle it? Is the best option to run just a single process during an upgrade and wait until the migration is finished before starting the others?
I'm not running my own Lemmy, but I do run non-trivial systems elsewhere. Things I'd be looking for would be:
Are concurrent migrations actually a problem? If Lemmy takes out an exclusive lock, checks whether a migration is needed, and only then applies it, several containers might be able to race safely and you wouldn't need to do anything. I'd want to review the schema migration code before trusting this myself, though, since so few deployments rely on it.
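For reference, this is roughly what a safe-to-race migration step can look like with Diesel (which Lemmy uses) plus a Postgres advisory lock. It's a sketch, not Lemmy's actual code: the lock key, the function name, and the `migrations` path are made up, and whether Lemmy guards its startup migrations like this is exactly what I'd be checking in the source.

```rust
use diesel::pg::PgConnection;
use diesel::prelude::*;
use diesel_migrations::{embed_migrations, EmbeddedMigrations, MigrationHarness};

// "migrations" is Diesel's conventional directory name; the exact path in
// the Lemmy repo may differ.
const MIGRATIONS: EmbeddedMigrations = embed_migrations!("migrations");

/// Run pending migrations while holding a Postgres advisory lock, so that
/// several containers starting at once serialise instead of racing.
/// The lock key (an arbitrary i64) is invented for this sketch.
fn run_migrations_exclusively(
    db_url: &str,
) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut conn = PgConnection::establish(db_url)?;

    // Session-level advisory lock: a second container blocks here until the
    // first one finishes (or its connection drops).
    diesel::sql_query("SELECT pg_advisory_lock(732842)").execute(&mut conn)?;

    // Inside the lock, apply whatever is still pending. A container that
    // lost the race finds nothing left to do.
    conn.run_pending_migrations(MIGRATIONS)?;

    diesel::sql_query("SELECT pg_advisory_unlock(732842)").execute(&mut conn)?;
    Ok(())
}
```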
A mode in Lemmy that does nothing other than run migrations and exit. You could then execute it as a one-shot job or run-to-completion container during the upgrade while the other Lemmy containers are down.
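To illustrate the point, a "migrate and exit" mode would be a small amount of code on top of Diesel's embedded migrations. The `--migrate-only` flag here is hypothetical (Lemmy doesn't ship it as far as I know), and the `LEMMY_DATABASE_URL` lookup is an assumption for the sketch.

```rust
use std::env;

use diesel::pg::PgConnection;
use diesel::prelude::*;
use diesel_migrations::{embed_migrations, EmbeddedMigrations, MigrationHarness};

const MIGRATIONS: EmbeddedMigrations = embed_migrations!("migrations");

fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    // Hypothetical flag: apply migrations, then exit instead of serving traffic.
    if env::args().any(|arg| arg == "--migrate-only") {
        let db_url = env::var("LEMMY_DATABASE_URL")?;
        let mut conn = PgConnection::establish(&db_url)?;
        conn.run_pending_migrations(MIGRATIONS)?;
        println!("migrations applied, exiting");
        return Ok(());
    }

    // ...normal startup (HTTP server, federation, scheduled tasks) would go here...
    Ok(())
}
```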
Create my own DB migration script based on the Lemmy schema diffs from release tag to release tag.
Is there a config flag in Lemmy to enable/disable DB migrations? I don't have a link handy, but I recall hearing that lemmy.world runs a Lemmy container just for async jobs, separate from the ones serving requests. It would be nice if you could run that single separate container and let it handle the migrations... but the other Lemmy containers would have to be clever enough to wait until the migration is complete.
As you note, run just one Lemmy container for "a while" when upgrading. The issue here is knowing when the migrations are complete, which feels finicky to automate. Though if you do this manually as part of your upgrade process, I'm sure it can work OK.
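The last two points both come down to detecting when migrations have finished. One way that's less finicky than it sounds is to poll Diesel's bookkeeping table (`__diesel_schema_migrations`) for the newest migration version in the release you're deploying. A sketch of that check follows; the version string is something you'd look up per release, and the table name assumes Lemmy hasn't changed Diesel's default (and that the table already exists, which it will on any instance that has started at least once).

```rust
use std::{thread, time::Duration};

use diesel::pg::PgConnection;
use diesel::prelude::*;
use diesel::sql_types::{Bool, Text};

// Row type for the EXISTS query below.
#[derive(QueryableByName)]
struct Applied {
    #[diesel(sql_type = Bool)]
    applied: bool,
}

/// Block until `expected_version` shows up in Diesel's migration table,
/// i.e. until whichever container is doing the upgrade has finished.
fn wait_for_migration(
    db_url: &str,
    expected_version: &str,
) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let mut conn = PgConnection::establish(db_url)?;
    loop {
        let row: Applied = diesel::sql_query(
            "SELECT EXISTS (SELECT 1 FROM __diesel_schema_migrations \
             WHERE version = $1) AS applied",
        )
        .bind::<Text, _>(expected_version)
        .get_result(&mut conn)?;

        if row.applied {
            return Ok(());
        }
        // Migration still in progress; check again shortly.
        thread::sleep(Duration::from_secs(5));
    }
}
```

You could run a check like this in an upgrade script, or as a startup gate in front of the other containers, so they only begin serving once the single migrating container has done its work.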