I’m trying to set up a personal Lemmy instance, and I’ve got it running but it doesn’t seem to sync very well with posts and comments made before the instance was created. I ran [lemmony](https://github.com/jheidecker/lemmony) to get the /all to work correctly and to start syncing communities, but now when I go to some communities and I look at the posts created before I subscribed to the community, they either don’t show up or don’t have the correct number of upvotes/comments. Also, when I search for communities, next to the community name is only the number of users from my instance subscribed, not the actual number of subscribers to the community. Is there a way to fix this?
You shouldn’t run this script, at least not in its default config. It’s more work, but a much better approach is hitting lemmyverse.net and manually subscribing to a bunch of communities you’re interested in.
Unless you’re very careful managing defederation and blocks, it subscribes you to thousands of communities you’ll never read (including the deep/dark parts of the lemmyverse posting porn/loli, piracy, and hate-speech) that will be cached on your server, re-served to the public internet from your instance, and may have legal repercussions in your jurisdiction.
It also increases the federation load your server generates by 50x or more compared to a “normal” single-user instance that subs to 100 communities or so… which doesn’t make you a very good fediverse citizen at a time when federation is being flaky and overloaded throughout the lemmyverse.
This is expected. To a first approximation, subscribing to a community asks the sending server to forward a copy of every post/comment/vote from now on. There’s no significant historical backfill (although there are a few ways to get your instance to download a particular old post, or a handful of them).
But when you first subscribe, you’d expect to be missing old posts, and then to have posts right on the border have some comments missing depending on when they were made. New posts and comments should generally show up in an orderly fashion, except for the global issues with federated replication that cause many servers to struggle to stay exactly in sync.
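For the "download a particular old post" case, one way I know of is asking your own instance to resolve the remote URL. As far as I know, pasting the post's URL into your instance's search box does the same thing. Here's a minimal sketch that only builds the request URL. It assumes the v3 HTTP API's `resolve_object` endpoint; `lemmy.example` and the post URL are placeholders, and the endpoint wants a logged-in user (how the token is passed differs between Lemmy versions, so I've left that out).

```python
from urllib.parse import urlencode


def build_resolve_url(local_instance: str, remote_object_url: str) -> str:
    """Build the URL that asks local_instance to fetch one remote object.

    Assumption: the instance exposes the v3 API at /api/v3/resolve_object
    and takes the remote URL in the `q` query parameter.
    """
    query = urlencode({"q": remote_object_url})
    return f"https://{local_instance}/api/v3/resolve_object?{query}"


if __name__ == "__main__":
    # Placeholder hostnames and post id, purely for illustration.
    print(build_resolve_url("lemmy.example", "https://lemmy.ml/post/12345"))
```

This pulls one object at a time, so it's a workaround for the handful of posts you care about, not a historical backfill.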
This is expected: subscriber counts are not federated. You need to visit the community’s home instance to see the global sub count.
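If you'd rather script that lookup than click through, here's a rough sketch that asks the community's home instance directly. It assumes the home instance serves the v3 API at `/api/v3/community` without auth, and that the response has the subscriber count under `community_view.counts.subscribers`, which is where I've seen it on 0.18/0.19 but may differ on other versions. The function names are mine, not from any library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def home_instance_url(community: str) -> str:
    """Turn 'name@instance' (optionally prefixed with '!') into an API URL."""
    name, instance = community.lstrip("!").split("@", 1)
    return f"https://{instance}/api/v3/community?{urlencode({'name': name})}"


def subscriber_count(response: dict) -> int:
    """Pull the count out of a decoded response (assumed field layout)."""
    return response["community_view"]["counts"]["subscribers"]


def fetch_subscriber_count(community: str) -> int:
    """Query the community's home instance; needs network access."""
    with urlopen(home_instance_url(community), timeout=10) as resp:
        return subscriber_count(json.load(resp))
```

Calling `fetch_subscriber_count("!selfhosted@lemmy.world")` would then return whatever lemmy.world reports for that community, rather than the number of local subscribers your own instance shows.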
That does make sense, I’ll probably go through and do that. I just wish there was a better way of sorting by all and doing historical syncs.
Yeah, it’s confusing if you’re not real steeped in how ActivityPub works… and admittedly not real intuitive.
If you choose to wipe the existing subs you made with the tool, you might want to delete the user you used to create them. I don’t think wiping your db or server will tell the remote servers to unsub you. But I think deleting the user will clear all their subscriptions (not 100% sure though).
The normal way to do this would be to unsub every community, but I don’t think this tool handles rolling back like that.
It’s also worth noting that there’s an upper limit on the number of communities you choose to federate with, while there doesn’t seem to be an upper limit on the blocked communities.
I understand the goal of the tool, but the defaults are a really bad approach to achieving it and the docs are really bad at identifying the pitfalls. A tool that subscribes to a list of communities provided in a text file would be great. Subscribing to the entire lemmyverse is a solution that creates problems that are worse than the discovery problem.
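To make the text-file idea concrete, here's a rough sketch of the input side of such a tool. Everything in it is hypothetical: the file format (one `name@instance` per line, `#` for comments), the function names, and the default cap of 200 are my own choices, not anything lemmony does. The follow payload assumes the v3 API's `POST /api/v3/community/follow` endpoint, which takes a numeric local `community_id`, so a real tool would first have to resolve each name to an id and handle auth for its Lemmy version.

```python
from typing import Iterable


def parse_community_list(lines: Iterable[str], limit: int = 200) -> list[str]:
    """Parse a curated subscription list, one 'name@instance' per line.

    Blank lines and '#' comments are skipped, duplicates are dropped, and
    the list is rejected if it exceeds `limit` (a hypothetical safety cap
    to discourage oversubscription).
    """
    seen: set[str] = set()
    result: list[str] = []
    for raw in lines:
        entry = raw.split("#", 1)[0].strip().lstrip("!")
        if not entry:
            continue
        name, sep, instance = entry.partition("@")
        if not (name and sep and instance):
            raise ValueError(f"expected name@instance, got {raw!r}")
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    if len(result) > limit:
        raise ValueError(f"{len(result)} communities exceeds limit of {limit}")
    return result


def follow_payload(community_id: int) -> dict:
    """Body for a follow request (assumed shape of the v3 follow endpoint)."""
    return {"community_id": community_id, "follow": True}
```

The point of failing hard on the cap is that you have to consciously raise it, instead of ending up with thousands of subs by default.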
All of this content seems fairly clearly to me to fall into the category “content that can cause legal liability for the hoster depending on their jurisdiction”. Is that a controversial point of debate?
This all sounds eminently reasonable. 800 subs is a lot, but it’s much more reasonable than the 7k subs this tool leaves you with in its default config, and if you further curate it manually and that’s what it takes for your feed to feel lively… then go for it.
Maybe consider releasing it? I totally agree that community discovery is rough all over, and moreso on tiny instances. A tool to help folks bootstrap 50-200 communities and that did a good job documenting the tradeoffs of oversubscription and helped folks identify/avoid legal risk would be a huge step up from the “subscribe all” approach.
Content is NOT served from the original instance. https://lemmy.world/post/1191149 shows a post that was made to a community on lemmy.ml. Because there are subscribers on lemmy.world, that post is replicated there. Any unauthenticated user on the internet can view that post, the content was pulled out of the db on lemmy.world and sent out from lemmy.world’s ip and over its internet connection. By every legal definition I’ve encountered, lemmy.world is serving that post and subject to any legal complications that entails. The only exception I’m aware of are full-size images, which don’t replicate. Thumbnails do though, so that provides no protection. You host the image content, just at reduced quality.
This is also not true. In the US, you have to register a copyright agent to receive the kinds of protection typically associated with commercial hosts. If you fail to do so, I believe that you run the risk of just getting sued out of the gate for copyright issues. There are also almost certainly jurisdictions where hosting gay porn or certain political speech is a “straight to jail” kind of maneuver.
Of course, I have no evidence that OP is in a particularly dangerous jurisdiction. But my broader point is that new users of single-user instances often don’t consider that they may be signing up to host legally risky content that they themselves didn’t create, view, or want. If one curates their list of subs, they can gauge for themselves what communities they consider to be risky. If they “subscribe all”, they WILL be serving to the unauthenticated public internet the worst of the lemmyverse without realizing it… which is an entirely avoidable situation.
They offered an option to limit the sub count, but the default is still unlimited. They seem aggressively against more sensible defaults in other posts.
OP didn’t expect to be missing old posts, hence his question. I had the same surprising discovery. Not sure how the UX could be improved to convey to the user what is actually happening.