Unfortunately I haven't found a way to link them (yet), and usernames might not be identical anyway. The script injects the old data directly into the BeyondCombustion.net Lemmy SQL database, all under a fake/bot account name; alternatively, Redarc displays it as an HTML web page, like in the example link.
It's mostly a way to post the entire back catalog of content, which came up in Google searches for people new to vapes or "vape curious," as one might say. There are some cool DIY projects, health-related stuff, and general neat history in /r/Vaporents that would be a bummer to lose.
For the last couple of years Reddit has threatened to take the sub down because of links going to vendors and people wanting to sell used vapes. This is a way to hedge our bets with that content so it keeps coming up in future Google searches and helps out those who haven't moved "BeyondCombustion" (hehe) yet.
So you would put it under a different, archival community? That doesn't sound like a bad idea. We don't know that it will last on reddit, and it opens up the content to be accessed outside of reddit. It could draw users to FC and to lemmy.
On the other hand, starting fresh can be nice. The archive is inseparable from reddit and its influence. One example: all of the posts in the archive abide by reddit's admin rules, and they may not reflect the opinions of the mods/maintainers/community. So this could also be an opportunity to hit reset.
Either a separate archive community or just right into vaporents on BeyondCombustion + a bot to scrape new posts from the Reddit RSS feed and repost them here (but under a bot account).
I think it would be more engaging for lemmy to have posts pulled in at the beginning at least, until conversation takes place more naturally.
Absolutely going to host it 💯. Just trying to decide exactly what format... If it works well I’d be willing to do the same for other subreddits. I’ll be documenting the steps, but needing PostgreSQL access on the lemmy server is a non-starter for most subreddits/people/mods, unless they have their own infrastructure too.
I like the idea of a separate community but at the same time there is definitely value in the continuity of transitioning the sub (I assume reddit will kill it sooner or later tbh).
I think I get an idea of the scope of the task. I agree it's probably too intensive for most communities, but I'm sure others would be interested. What's involved with getting the archive? Is it scraped, or something you can download as a mod? I can think of a community or two that might appreciate a new home away from their mods...
So far, I've downloaded around 5-6 TB of archives through the links in that post, others on /r/DataHoarder, the-eye.eu, etc.
They go back allll the way to 2005.... It's just text though, no media. Unless the media was a link to Imgur or YouTube or something, then the links are in the posts.
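For anyone wondering what pulling one sub out of dumps like that might look like, here's a rough sketch. It assumes the dump has already been decompressed to newline-delimited JSON (one post per line) and that each post carries a `subreddit` field; both are assumptions about the archive format, and `posts_for_subreddit` is just a name made up for this example.

```python
import json
from typing import Iterable, Iterator


def posts_for_subreddit(lines: Iterable[str], subreddit: str) -> Iterator[dict]:
    """Yield decoded posts whose 'subreddit' field matches, ignoring case.

    Assumes one JSON object per line with a 'subreddit' key (an assumption
    about the dump format, not something confirmed in this thread).
    """
    wanted = subreddit.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            post = json.loads(line)
        except json.JSONDecodeError:
            # Skip corrupt or truncated lines instead of aborting a long run.
            continue
        if not isinstance(post, dict):
            continue
        if str(post.get("subreddit", "")).lower() == wanted:
            yield post
```

Because it streams line by line, you could feed it an open file handle and never hold more than one post in memory, which matters at multi-terabyte sizes.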
There are a couple of bots/scripts that will repost new stuff from RSS feeds going forward.
To inject directly from those backups into a Lemmy PostgreSQL database, I was using this tool, RedditLemmyImporter. It actually looks like it was originally made by the Lemmy developer dessalines to move the r/GenZhou subreddit into lemmy.ml/lemmygrad.ml, but was forked very early to try to obscure that... so I do feel a bit dirty about that, and more so about Lemmy in general from some of the things I've seen from dessalines themselves.
TBH... I think it's important to make a fork of Lemmy itself, and to really, really comb through this code base. Not sure if this is a long-term solution if dessalines is still the head dev, which is one of the reasons I'm setting up other forums on this server as well. Lemmy and non-corp/federated social media is good, but the more I read the stewards' direct words/actions/history and code like this importer, the less I like who's looking after the Lemmy code base.
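Not the actual bots, but the core of an RSS repost bot is roughly this: parse the feed, skip anything already seen, and hand back the rest in posting order. This sketch assumes the feed is Atom-formatted XML and lists newest entries first (both assumptions worth checking against the real feed); `new_entries` is a made-up name, and fetching the feed and posting to Lemmy are left out.

```python
import xml.etree.ElementTree as ET

# Atom namespace prefix in ElementTree's {uri}tag notation.
ATOM = "{http://www.w3.org/2005/Atom}"


def new_entries(feed_xml: str, seen_ids: set) -> list:
    """Return entries not seen before, oldest first, and record their ids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id", default="")
        if not entry_id or entry_id in seen_ids:
            continue
        link = entry.find(ATOM + "link")
        fresh.append({
            "id": entry_id,
            "title": entry.findtext(ATOM + "title", default=""),
            "link": link.get("href", "") if link is not None else "",
        })
        seen_ids.add(entry_id)
    # Assumes the feed is newest-first; reverse so reposts go out in order.
    fresh.reverse()
    return fresh
```

A real bot would persist `seen_ids` between runs (a small file or table) so a restart doesn't repost the whole feed.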
Also, if there are communities out there that you're interested in helping do this, they have to either have their own server or get direct access to the PostgreSQL database on whatever new server they want to set up a community in.
I'm sure there are other ways to import it, like scripting something that literally re-posted everything through API calls to Lemmy or ugh clicking through the web GUI lmao. Without that database access I'd consider it too much work/hassle to be practical.
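For what it's worth, the "re-post everything through the API" route might look something like the sketch below. It's only an outline under assumptions: the archive fields (`title`, `selftext`, `url`, `is_self`, `author`, `created_utc`) and the payload keys (`name`, `community_id`, `body`, `url`) are modeled on what I'd expect from the dumps and from Lemmy's create-post request, so check both against the real data and the API docs for your Lemmy version. The actual HTTP call is passed in as `submit`, a hypothetical callable you'd write yourself, and the credit line is just one way to keep the original username visible when everything is posted by a bot account.

```python
import time
from datetime import datetime, timezone
from typing import Callable, Iterable


def to_payload(post: dict, community_id: int) -> dict:
    """Turn one archived post into a create-post payload (field names assumed)."""
    when = datetime.fromtimestamp(float(post["created_utc"]), tz=timezone.utc)
    author = post.get("author", "[deleted]")
    credit = f"Originally posted by u/{author} on {when:%Y-%m-%d}."
    text = post.get("selftext") or ""
    payload = {
        "name": post["title"],
        "community_id": community_id,
        "body": f"{credit}\n\n{text}".strip(),
    }
    # Link posts keep their outbound URL; text posts have none to carry over.
    if not post.get("is_self") and post.get("url"):
        payload["url"] = post["url"]
    return payload


def replay(posts: Iterable[dict], community_id: int,
           submit: Callable[[dict], None], delay_seconds: float = 1.0) -> int:
    """Submit posts oldest first, pausing between calls; return the count sent."""
    count = 0
    for post in sorted(posts, key=lambda p: float(p["created_utc"])):
        submit(to_payload(post, community_id))
        count += 1
        time.sleep(delay_seconds)  # crude throttle so the server isn't hammered
    return count
```

Even throttled to one post per second, a sub with a few hundred thousand posts would take days this way, and comments would need a second pass. That's a big part of why direct database access is so much more practical.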