Like Reddit is? E.g. by Google or Bing (shudders), you know, search engines. One of the ways many people around the world interacted with Reddit was by looking up solutions, discussions, or similar from a search engine and NOT on Reddit itself. Is that possible in this part of the fediverse?

  • sethboy66@kbin.social · 20 upvotes · 1 year ago

    It’s certainly archivable; all one must do is look at the ‘robots.txt’ (a file that websites use to let nice search engines know which pages they shouldn’t index) associated with the domain to find out what it permits to be indexed. Lemmy.world’s robots.txt only disallows pages associated with instance/account creation, user settings, and administrator/authorized interaction.
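For anyone who wants to check this themselves, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `SAMPLE_ROBOTS_TXT` are an assumption, loosely modelled on the description above; they are not a copy of Lemmy.world's actual file, and the helper names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only. The real robots.txt of any
# given instance may list different paths.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /login
Disallow: /signup
Disallow: /settings
Disallow: /admin
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_indexable(parser: RobotFileParser, url: str, user_agent: str = "*") -> bool:
    """Return True if a well-behaved crawler is allowed to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A post page is not covered by any Disallow rule in the sample.
print(is_indexable(parser, "https://lemmy.world/post/189226"))

# A settings page is covered by "Disallow: /settings".
print(is_indexable(parser, "https://lemmy.world/settings"))
```

To test a live site instead of the sample text, `RobotFileParser` also offers `set_url()` and `read()`, which download the file before `can_fetch()` is called.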

    So everything relevant to how reddit appears on Google is possible for Lemmy. The only difference is that Lemmy’s associated PageRank (and other ranking scores) is considerably lower than reddit’s. This should change with time, especially when more niche and specialized communities take hold.

    • MattMist@kbin.social · 11 upvotes · edited · 1 year ago

      That’s true, but aren’t federated pages at a disadvantage? You can view them from any instance, which spreads the incoming links across many copies and decreases the number of links to any one specific post (and incoming links are what PageRank counts). So instead of one post on page 1 you’d have 10 from different instances on page 3. I’m thinking this could be fixed if all posts had a link to the post on the original instance, which is where the ranking scores would then be more likely to aggregate.
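The dilution argument above can be sketched with a toy PageRank. This is a simplified power iteration over two made-up link graphs, not how any real search engine ranks pages; the node names (`ref1`, `copy_a`, `original`, and so on), the damping factor, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Toy PageRank by power iteration over a dict of page -> outgoing links."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Pages without outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Six outside pages link to the same post, but via three different instances.
split_graph = {
    "ref1": ["original"], "ref2": ["original"],
    "ref3": ["copy_a"], "ref4": ["copy_a"],
    "ref5": ["copy_b"], "ref6": ["copy_b"],
    "original": [], "copy_a": [], "copy_b": [],
}

# Same graph, except each federated copy links back to the original post.
linked_graph = dict(split_graph, copy_a=["original"], copy_b=["original"])

split_rank = pagerank(split_graph)
linked_rank = pagerank(linked_graph)

print("original, copies not linked back:", round(split_rank["original"], 4))
print("original, copies linked back:    ", round(linked_rank["original"], 4))
```

In the first graph the three copies end up with equal scores; in the second, the back-links pass most of each copy's score to the original, so it ranks higher than it did before.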

      • sethboy66@kbin.social · 7 upvotes · edited · 1 year ago

        That’s a good point, and it would certainly be a problem for PageRank and similar ranking algorithms. Still, I wouldn’t be entirely surprised if Google and other SEs have intelligently crafted a pre-processor that translates links like “kbin.social/m/[email protected]/t/34817/Is-Lemmy-Indexable” to the Original-Instance-Link (OIL, lurking Google devs feel free to steal this acronym) “https://lemmy.world/post/189226” so that relevant algorithms properly reflect the ‘true’ ranking of the information itself rather than the particular instance’s… instance of it.

        OStatus and Pump.io have been around for a while, so SEs may (should) have already identified this problem and addressed it, unless they’ve decided it’s not important, not in line with how their rankings are intended to work, or simply not easily solvable in some cases, as I previously assumed. As Bjarne Stroustrup would say, “If you think it’s simple, then you have misunderstood the problem.”

        • silas@programming.dev · 4 upvotes · 1 year ago

          There are <meta> HTML tags, as well as <link rel="canonical" href="https://example.com/sample-page/"> tags, that point to the original copy of a page. If this isn’t implemented yet it would be super easy to add, but I’m on my phone at the moment so I can’t check the source code.
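As a rough sketch of how a crawler might read that tag, the snippet below pulls the `rel="canonical"` URL out of a page with Python's standard `html.parser`. `SAMPLE_PAGE` is a made-up federated copy of the post; whether kbin or Lemmy actually emit this tag is not confirmed here, and `CanonicalLinkParser` and `find_canonical_url` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, e.g. "canonical nofollow".
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical_url(html: str) -> Optional[str]:
    """Return the canonical URL declared in the page, or None if there is none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


# Hypothetical markup for a copy of the post served by another instance.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Is Lemmy Indexable?</title>
    <link rel="canonical" href="https://lemmy.world/post/189226">
  </head>
  <body>...</body>
</html>
"""

print(find_canonical_url(SAMPLE_PAGE))
```

A search engine that honours the tag can then credit links pointing at any copy to the one canonical URL.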

      • jcg@halubilo.social · 4 upvotes · 1 year ago

        All posts and comments do have a link to the instance they originated from. That’s what that weird-looking multicolour star is (the fediverse logo).