• toeblast96@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    8
    ·
    3 hours ago

    tbh I somehow didn’t even realize that Wikipedia is one of the few super popular sites not trying to shove AI down my throat every 5 seconds

    i’m grateful now

  • lens0021@lemmy.ml
    link
    fedilink
    English
    arrow-up
    10
    ·
    6 hours ago

    He is nobody to Wikipedia now. He also failed to create a news site and a micro SNS.

  • deathbird@mander.xyz
    link
    fedilink
    English
    arrow-up
    2
    arrow-down
    2
    ·
    4 hours ago

    Sit down, Jimmy. Wikipedia has enough problems already; it doesn’t need AI adding more.

    • vacuumflower@lemmy.sdf.org
      link
      fedilink
      English
      arrow-up
      4
      arrow-down
      1
      ·
      edit-2
      10 hours ago

      What’s funny is that for enormous systems with network effects we are trying to use mechanisms intended for smaller businesses, like a hot dog kiosk.

      IRL we have a thing for those, it’s called democracy.

      On the Internet it’s either anarchy or monarchy, sometimes bureaucratic dictatorship, but in that area even Soviet-style collegial rule is something not yet present.

      I recently read that McPherson article about Unix and racism, and how our whole perception of correct computing (modularity, encapsulation, object-orientation, even all the KISS philosophy) is based on that era’s changes in society and the reaction to them. I mean, the real world is continuous, and you can quantize it into discrete elements in many ways. Some unfit for your task. All unfit for some task.

      So - first, I like the Usenet model.

      Second, cryptography is good.

      Third, cryptographic ownership of a limited resource is … fine; blockchains are maybe not so stupid. But it’s not really necessary, because one can choose between a few retrieved versions of the same article based on a web of trust or whatever else. No need to have only one right version.

      Fourth, we already have a way to turn a sequence of interdependent actions into state information; it’s called a filesystem.

      Fifth, Unix with its hierarchies is really not the only thing in existence, there’s BTRON, and even BeOS had a tagged filesystem.

      Sixth, interop and transparency are possible with cryptography.

      Seventh, all these also apply to a hypothetical service over global network.

      Eighth, of course, is that the global network doesn’t have to be globally visible/addressable to operate globally for spreading data, so even the Internet itself is not as necessary as the actual connectivity over which those change messages will propagate where needed and synchronize.

      Ninth, for Wikipedia you don’t need as much storage as for, say, Internet Archive.

      And tenth - with all these, one can make a Wikipedia-like decentralized system with democratic governance, based on rather primitive principles, aside from, of course, the cryptography involved.

      (Yes, Briar impressed me.)

      EDIT: Oh, about democracy - I mean technical democracy. That an event (making any change) wouldn’t be valid if not processed correctly: signed by people eligible to sign it, who are made eligible by a signed appointment, whose signers are in turn made eligible by a democratic process (signed by a majority of some body, itself signed in turn). That’s the blockchain democracy people dreamed of at some point. Maybe that’s not a scam. It just hasn’t been done yet.
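A minimal sketch of that signed-eligibility idea. Everything here is invented for illustration (the `Registry` class, the member names), and the "signature" is just the signer's name so the governance logic stays visible; a real system would use actual digital signatures (e.g. Ed25519) over the appointment records.

```python
# Sketch: an action is valid only if its signer's eligibility traces back,
# through majority-approved admissions, to the founding members signed by
# the initial creator. Signer names stand in for real signatures here.
from dataclasses import dataclass


@dataclass
class Registry:
    members: set  # currently eligible signers

    def majority(self, signers):
        """True if more than half of the current body signed."""
        return len(set(signers) & self.members) > len(self.members) / 2

    def admit(self, candidate, signers):
        """A new member becomes eligible only via a majority-signed admission."""
        if self.majority(signers):
            self.members.add(candidate)
            return True
        return False

    def valid_action(self, signer):
        """An event (e.g. an edit) counts only if signed by an eligible member."""
        return signer in self.members


reg = Registry(members={"alice", "bob", "carol"})  # founders, signed by the creator
assert reg.admit("dave", ["alice", "bob"])         # 2 of 3: majority, admitted
assert not reg.admit("eve", ["dave"])              # 1 of 4: no majority
assert reg.valid_action("dave")
assert not reg.valid_action("eve")
```

This also illustrates the Sybil-resistance point made below in the thread: fake identities get no say unless a majority of already-authorized voters signs them in.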

        • vacuumflower@lemmy.sdf.org
          link
          fedilink
          English
          arrow-up
          1
          ·
          7 hours ago

          How do you mount a Sybil attack on a system where the initial creator signs the initial voters, and they then collectively sign elections, the acceptance of new members, and all such stuff?

          Doesn’t seem to be a problem for a system with authorized voters.

            • vacuumflower@lemmy.sdf.org
              link
              fedilink
              English
              arrow-up
              1
              ·
              6 hours ago

              So why would they accept said AI-generated applicants?

              If we are making a global system, then confirmation using some nation’s ID can be done, with fakes removed when found out later. Like with IRL nation states. Or “bring a friend and be responsible if they turn out to be fake.” Or both at the same time.

    • Corn@lemmy.ml
      link
      fedilink
      English
      arrow-up
      9
      ·
      15 hours ago

      Wikipedia already has a decade’s worth of operating costs in savings.

    • hr_@lemmy.world
      link
      fedilink
      English
      arrow-up
      19
      ·
      15 hours ago

      I mean, the Wikipedia page does say it was sold in 2018. Not sure how it was before but it’s not surprising that it enshittified by now.

      • OboTheHobo@ttrpg.network
        link
        fedilink
        English
        arrow-up
        4
        ·
        7 hours ago

        I guess in his defense it wasn’t too bad before 2018, as far as I can remember. Most of the enshittification of fandom I can remember has happened since.

        • Ganbat@lemmy.dbzer0.com
          link
          fedilink
          English
          arrow-up
          50
          arrow-down
          1
          ·
          18 hours ago

          Fandom (previously Wikia) is an extremely shitty service with low-quality wikis mostly consisting of content copied from independent wikis and a terrible layout that only exists to amplify their overwhelming advertising.

          • Tortellinius@lemmy.world
            link
            fedilink
            English
            arrow-up
            15
            arrow-down
            1
            ·
            15 hours ago

            While this is true, the majority of the wikis are not at all low quality. Some are the only ones that exist for a topic. The wikis are community-based, after all.

            But it’s easy to vandalize, and the platform is highly profit-driven. The Fandom wikis are filled with ads that absolutely destroy navigation. Infamous is the video ad that, once it finishes, automatically scrolls you back up in the middle of reading; you have to pause it to read the article without interruption.

        • lime!@feddit.nu
          link
          fedilink
          English
          arrow-up
          17
          ·
          18 hours ago

          they captured the “niche wiki” market as wikia, then rebranded and started serving shittons of ads. the vim wiki is unusable these days because it runs like ass and looks like a gamer rgb nightmare

        • Rose@slrpnk.net
          link
          fedilink
          English
          arrow-up
          3
          ·
          8 hours ago

          Yup, Fallout Wiki has a pretty crazy history. I don’t remember if they were originally a Fandom wiki, but at some point they definitely went “well, we don’t want to go with Fandom, we’ll go with the Curse wiki host instead.” Then Fandom bought Curse’s wikis and put all of them under the Fandom banner anyway.

          The independent Fallout Wiki is basically where the actual community is right now; the Fandom wiki is just there to confuse passers-by with its high search engine rank. Fandom has a policy that the community can fork a wiki and go elsewhere, but it will not close down the Fandom wiki, so good luck with your search rankings.

          • Soggy@lemmy.world
            link
            fedilink
            English
            arrow-up
            1
            ·
            5 hours ago

            Many game communities have opted for the “unbridled vandalism” strategy to push people away from fandom. Just replace all the articles with plausible lies.

        • interdimensionalmeme@lemmy.ml
          link
          fedilink
          English
          arrow-up
          1
          arrow-down
          3
          ·
          edit-2
          8 hours ago

          The “fandom” one is much more complete? I mean, they’re both pretty great. Coming from the search engine, if I wanted to know about an in-game faction, I’d just pick whichever appeared first, and it’d be fine either way.

          So why would “Chloé 🥕@lemmy.blahaj.zone” think they could just point at it and imagine that any random person would even know what “who that guy is” means, just because he’s associated with that wiki?

          And why would my innocuous comment trigger the nerds into such a unanimously negative response?

  • brucethemoose@lemmy.world
    link
    fedilink
    English
    arrow-up
    172
    arrow-down
    5
    ·
    edit-2
    1 day ago

    Wales’s quote isn’t nearly as bad as the byline makes it out to be:

    Wales explains that the article was originally rejected several years ago, then someone tried to improve it, resubmitted it, and got the same exact template rejection again.

    “It’s a form letter response that might as well be ‘Computer says no’ (that article’s worth a read if you don’t know the expression),” Wales said. “It wasn’t a computer who says no, but a human using AFCH, a helper script […] In order to try to help, I personally felt at a loss. I am not sure what the rejection referred to specifically. So I fed the page to ChatGPT to ask for advice. And I got what seems to me to be pretty good. And so I’m wondering if we might start to think about how a tool like AFCH might be improved so that instead of a generic template, a new editor gets actual advice. It would be better, obviously, if we had lovingly crafted human responses to every situation like this, but we all know that the volunteers who are dealing with a high volume of various situations can’t reasonably have time to do it. The templates are helpful - an AI-written note could be even more helpful.”

    That being said, it still reeks of “CEO Speak.” And trying to find a place to shove AI in.

    More NLP could absolutely be useful to Wikipedia, especially for flagging spam and malicious edits for human editors to review. This is an excellent task for dirt-cheap, small, open models, where the error rate isn’t super important; cost, volume, and reducing stress on precious human editors are. It’s an existential issue that needs work.

    …Using an expensive, proprietary API to give error prone yet “pretty good” sounding suggestions to new editors is not.

    Wasting dev time trying to make it work is not.

    This is the problem. Not natural language processing itself, but the seemingly contagious compulsion among executives to find some place to shove it when the technical extent of their knowledge is occasionally typing something into ChatGPT.

    It’s okay for them to not really understand it.

    It’s not okay to push it differently than other technology because “AI” is somehow super special and trendy.

    • Frezik@lemmy.blahaj.zone
      link
      fedilink
      English
      arrow-up
      43
      arrow-down
      1
      ·
      1 day ago

      This is another reason why I hate bubbles. There is something potentially useful in here. It needs to be considered very carefully. However, it gets to a point where everyone’s kneejerk reaction is that it’s bad.

      I can’t even say that people are wrong for feeling that way. The AI bubble has affected our economy and lives in a multitude of ways that go far beyond any reasonable use. I don’t blame anyone for saying “everything under this is bad, period”. The reasonable uses of it are so buried in shit that I don’t expect people to even bother trying to reach into that muck to clean it off.

      • brucethemoose@lemmy.world
        link
        fedilink
        English
        arrow-up
        23
        arrow-down
        2
        ·
        edit-2
        1 day ago

        This bubble’s hate is pretty front-loaded though.

        Dotcom was, well, a useful thing. I guess valuations were nuts, but it looks like most of the hate came in the enshittified aftermath.

        Crypto is a series of bubbles trying to prop up flavored pyramid schemes for a neat niche concept, but people largely figured that out after they popped. And it’s not as attention grabbing as AI.

        Machine learning is a long-running, useful field, but ever since ChatGPT caught investors’ eyes, the cart has felt so far ahead of the horse. The hate started, and got polarized, waaay before any bubble popped.

        …In other words, AI hate almost feels more political than bubble-fueled, if that makes any sense. It is a bubble, but the extreme hate would still be there even if it weren’t.

        • stankmut@lemmy.world
          link
          fedilink
          English
          arrow-up
          20
          ·
          1 day ago

          Crypto was an annoying bubble. If you were in the tech industry, you had a couple of years of people asking if you could add blockchain to whatever your project was, and then a few more years of hearing about NFTs. And GPUs shot up in price. Crypto people promised to revolutionize banking and then pivoted to get-rich-quick schemes. It took time for the hype to die down, for people to realize that the tech wasn’t useful and that the costs of running it weren’t worth it.

          The AI bubble is different. The proponents are gleeful while they explain how AI will let you fire all your copywriters, your graphics designers, your programmers, your customer support, etc. Every company is trying to figure out how to shoehorn AI into their products. While AI is a useful tool, the bubble around it has hurt a lot of people.

          That’s the bubble side. It also gets a lot of baggage because of the slop generated by it, the way it’s trained, the power usage, the way people just turn off their brains and regurgitate whatever it says, etc. It’s harder to avoid than crypto.

          • Baggie@lemmy.zip
            link
            fedilink
            English
            arrow-up
            3
            ·
            16 hours ago

            God I had coworkers that had never used a vr headset claiming the metaverse was going to be the next big thing. I wish common sense was common.

            • Knock_Knock_Lemmy_In@lemmy.world
              link
              fedilink
              English
              arrow-up
              2
              ·
              15 hours ago

              “The metaverse” changed its definition depending on who you talked to. Some definitions didn’t even include VR.

              “AI” also changes its definition depending on who you talk to.

              Vague definitions = hype

          • brucethemoose@lemmy.world
            link
            fedilink
            English
            arrow-up
            6
            arrow-down
            1
            ·
            edit-2
            1 day ago

            Yeah, you’re right. My thoughts were kinda uncollected.

              Though I will argue that some of the negatives (like inference power usage) are massively overstated, and even where they aren’t, they’re more the result of corporate enshittification than of the AI bubble itself.

            Even the large scale training is apparently largely useless: https://old.reddit.com/r/LocalLLaMA/comments/1mw2lme/frontier_ai_labs_publicized_100kh100_training/

            • badgermurphy@lemmy.world
              link
              fedilink
              English
              arrow-up
              5
              ·
              1 day ago

              I believe that the bad behavior of corporate interests is often one of the key contributors to these financial bubbles in every sector where they appear.

              To say that some of the bad things about this particular financial bubble are because of a bunch of companies being irresponsible and/or unethical seems not to acknowledge that one is primarily caused by the other.

      • peoplebeproblems@midwest.social
        link
        fedilink
        English
        arrow-up
        10
        ·
        1 day ago

        So… I actually proposed a use case for NLP and LLMs in 2017. I don’t actually know if it was used.

        But the use case was generating large sets of fake data that looked real enough for performance-testing enterprise-sized data transformations. That way we could skip a large portion of the risk associated with using actual customer data. We wouldn’t have to generate the data beforehand, we could validate logic with it, and we could just plop it into the replica non-production environment.

        At the time we didn’t have any LLMs. So it didn’t go anywhere. But it’s always funny when I see all this “LLMs can do x” because I always think about how my proposal was to use it… For fake data.
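The fake-data idea above can be sketched even without an LLM. This is a hypothetical, minimal stdlib version; the field names, value ranges, and the CSV shape are all invented for illustration, and an LLM (or a library like Faker) would simply produce richer, more realistic values.

```python
# Sketch: synthesize realistic-looking customer rows for performance testing,
# so no real customer data ever touches the test environment.
import csv
import io
import random

random.seed(42)  # reproducible test data

FIRST = ["Ana", "Ben", "Chi", "Dev", "Eli"]
LAST = ["Ng", "Ortiz", "Patel", "Smith", "Weber"]


def fake_customers(n):
    """Yield n synthetic customer rows shaped like the production schema."""
    for i in range(n):
        yield {
            "id": 100_000 + i,  # stable surrogate key, independent of randomness
            "name": f"{random.choice(FIRST)} {random.choice(LAST)}",
            "balance_cents": random.randint(0, 5_000_000),
        }


def dump_csv(rows, fieldnames=("id", "name", "balance_cents")):
    """Serialize rows to CSV, the format a (hypothetical) ETL job would ingest."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(fieldnames))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


blob = dump_csv(fake_customers(1000))  # header + 1000 data rows
```

The seed makes each performance run deterministic, so timing differences between runs come from the pipeline under test rather than from the data.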

    • Pringles@sopuli.xyz
      link
      fedilink
      English
      arrow-up
      62
      ·
      1 day ago

      That being said, it still wreaks of “CEO Speak.”

      I think you mean reeks, which means to stink, having a foul odor.

    • FaceDeer@fedia.io
      link
      fedilink
      arrow-up
      12
      arrow-down
      2
      ·
      1 day ago

      That being said, it still wreaks of “CEO Speak.” And trying to find a place to shove AI in.

      I don’t see how this is “shoved in.” Wales identified a situation where Wikipedia’s existing non-AI process doesn’t work well and then realized that adding AI assistance could improve it.

      • brucethemoose@lemmy.world
        link
        fedilink
        English
        arrow-up
        14
        arrow-down
        2
        ·
        edit-2
        1 day ago

        Neither did Wales. Hence, the next part of the article:

        For example, the response suggested the article cite a source that isn’t included in the draft article, and rely on Harvard Business School press releases for other citations, despite Wikipedia policies explicitly defining press releases as non-independent sources that cannot help prove notability, a basic requirement for Wikipedia articles.

        Editors also found that the ChatGPT-generated response Wales shared “has no idea what the difference between” some of these basic Wikipedia policies, like notability (WP:N), verifiability (WP:V), and properly representing minority and more widely held views on subjects in an article (WP:WEIGHT).

        “Something to take into consideration is how newcomers will interpret those answers. If they believe the LLM advice accurately reflects our policies, and it is wrong/inaccurate even 5% of the time, they will learn a skewed version of our policies and might reproduce the unhelpful advice on other pages,” one editor said.

        It doesn’t mean the original process isn’t problematic, or that it can’t be helpfully augmented with some kind of LLM-generated supplement. But this is like a poster child for a troublesome AI implementation: a general-purpose LLM needs an understanding of context it isn’t given (but the reader assumes it has), hallucinations have knock-on effects, and even the founder/CEO of Wikipedia seemingly missed such errors.

        Don’t mistake me for being blanket anti-AI, clearly it’s a tool Wikipedia can use. But the scope has to be narrow, and the problem specific.

  • iopq@lemmy.world
    link
    fedilink
    English
    arrow-up
    16
    arrow-down
    1
    ·
    22 hours ago

    Honestly, translating the good articles from other languages would improve Wikipedia immensely.

    For example, the Nanjing dialect article is pretty bare in English and very detailed in Mandarin

    • Echo Dot@feddit.uk
      link
      fedilink
      English
      arrow-up
      16
      ·
      edit-2
      15 hours ago

      You can do that, that’s fine. As long as you can verify it is an accurate translation, so you need to know the subject matter and the target language.

      But you could probably also have used Google translate and then just fine tune the output yourself. Anyone could have done that at any point in the last 10 years.

      • lunarul@lemmy.world
        link
        fedilink
        English
        arrow-up
        2
        ·
        8 hours ago

        As long as you can verify it is an accurate translation

        Unless the process has changed in the last decade, article translations are a multi-step process, which includes translators and proof-readers. It’s easier to get volunteer proof-readers than volunteer translators. Adding AI for the translation step, but keeping the proof-reading step should be a great help.

        But you could probably also have used Google translate and then just fine tune the output yourself. Anyone could have done that at any point in the last 10 years.

        Have you ever used Google translate? Putting an entire Wikipedia article through it and then “fine tuning” it would be more work than translating it from scratch. Absolutely no comparison between Google translate and AI translations.

        • Echo Dot@feddit.uk
          link
          fedilink
          English
          arrow-up
          1
          ·
          6 hours ago

          Putting an entire Wikipedia article through it and then “fine tuning” it would be more work than translating it from scratch.

          That depends on whether you can translate the language yourself. If you don’t know the language, then the translator will give you a good start.

      • iopq@lemmy.world
        link
        fedilink
        English
        arrow-up
        3
        arrow-down
        1
        ·
        13 hours ago

        Google translate is horrendously bad at Korean, especially with slang and accidental typos. Like nonsense bad.

        • kazerniel@lemmy.world
          link
          fedilink
          English
          arrow-up
          2
          arrow-down
          1
          ·
          11 hours ago

          Same in Hungarian; machine translation still often gives hilariously bad results. It’s especially prone to mixing up formal and informal ‘you’ within the same paragraph, something humans never do. At least it makes it easy to tell when a website is one of those ‘auto-translated into 30 languages’ content mills.

    • SkunkWorkz@lemmy.world
      link
      fedilink
      English
      arrow-up
      5
      arrow-down
      1
      ·
      16 hours ago

      I recently edited a small wiki page that was obviously written by someone who wasn’t proficient in English. I used AI just to reword what was already written, then edited the output myself. It did a pretty good job. It was a page about some B-list Indonesian actress that I just stumbled upon; I didn’t want to put much time and effort into it, but the page really needed work.

    • graphene@sopuli.xyz
      link
      fedilink
      English
      arrow-up
      1
      ·
      12 hours ago

      Wikipedia’s translation tool for porting articles between languages currently uses Google Translate, so I could see an LLM being an improvement, but LLMs are also way, way costlier than normal translation models like Google Translate. Would it be worth it? And would better LLM translations make editors less likely to reword the translation to improve its tone?

      • iopq@lemmy.world
        link
        fedilink
        English
        arrow-up
        1
        ·
        9 hours ago

        You can use an LLM to reword the translation to make the tone better. That’s literally what LLMs are designed to do.

    • captainastronaut@seattlelunarsociety.org
      link
      fedilink
      English
      arrow-up
      46
      arrow-down
      1
      ·
      1 day ago

      Because this is one of the rare times he sat down at the keyboard to do the real work being done by people in this organization and he realized that it’s hard and he wants a shortcut. He sees his time as more valuable and sees this task as wasting his time, but it is their primary task and one they do as volunteers because they are passionate about it. He’s not going to get a lot of traction with them telling them the thing they do for free because they love it isn’t worth anyone’s time.

      • ronigami@lemmy.world
        link
        fedilink
        English
        arrow-up
        4
        ·
        15 hours ago

        I swear these people have never been around a cathedral and thought about how it was built.

      • Aatube@kbin.melroy.org
        link
        fedilink
        arrow-up
        26
        arrow-down
        2
        ·
        1 day ago

        I think commenters here don’t actually edit Wikipedia. Wales was instrumental in Wikipedia’s principles and organization aside from the first year under Sanger. He handpicked the first administrators to make sure the project would continue its anarchistic organization and to prevent a hierarchy from having a bigger say in content matters.

        I would characterize Wales as a long-retired leader rather than leadership.

    • Storm@slrpnk.net
      link
      fedilink
      English
      arrow-up
      7
      arrow-down
      1
      ·
      1 day ago

      Because that’s what being in a position of power does to a mf

  • Carvex@lemmy.world
    link
    fedilink
    English
    arrow-up
    51
    arrow-down
    4
    ·
    1 day ago

    Remember you can download all of Wikipedia in your language and safely store it on a drive buried in your backyard, for after they rewrite history and eliminate freedom of speech.

    • TommySoda@lemmy.world
      link
      fedilink
      English
      arrow-up
      27
      ·
      1 day ago

      Already got it downloaded. It’s only like 100 - 150 gigabytes or something like that. Got it on my PC, my laptop, and my external hard drive. I don’t trust the powers that be to keep it intact anymore so I’d rather have my own copy, even if outdated.

    • FaceDeer@fedia.io
      link
      fedilink
      arrow-up
      6
      arrow-down
      2
      ·
      1 day ago

      What about any of this remotely connects to “rewriting history and eliminating freedom of speech?”

      • LainTrain@lemmy.dbzer0.com
        link
        fedilink
        English
        arrow-up
        12
        arrow-down
        1
        ·
        1 day ago

        Proprietary AI means corpo involvement, and it’s usually the really actively awful sort of techbros. That involvement gives them some power, and this power is a threat. Whether it materializes or not, living in the world we do now, it’s only right to be wary. I already figured Wikipedia was on its way out a few months ago and downloaded both the Kiwix reader version and the raw XML dump + files for truly apocalyptic situations.

        • FaceDeer@fedia.io
          link
          fedilink
          arrow-up
          5
          arrow-down
          2
          ·
          1 day ago

          There are lots of non-proprietary AI models out there, some of them comparable in quality to ChatGPT. Wikipedia could run it themselves if they wanted, no “corpo involvement.”

  • logicbomb@lemmy.world
    link
    fedilink
    English
    arrow-up
    18
    arrow-down
    2
    ·
    1 day ago

    The problem with LLMs and other generative AI is that they’re not completely useless. People’s jobs are often on the line, so it would really help if they were completely useless, but they’re not. Generative AI is certainly not as good as its proponents claim, and critically, when it fucks up it can be extremely hard for a human to tell, which eats away at a lot of the benefit. But they’re not completely useless. For the most basic example, give an LLM a block of text, ask it how to improve the grammar or make a point clearer, compare the AI-generated result with the original, and take whatever parts you think the AI improved.

    Everybody knows this, but we’re all pretending it’s not the case, because we’re caring people who don’t want the world to be drowned in AI hallucinations, we don’t want the world taken over by confidence tricksters who fake everything with AI, and we don’t want people to lose their jobs. But sometimes we are so busy pretending that AI is completely useless that we forget it actually isn’t. The reason they’re so dangerous is that they’re not completely useless.

    • ag10n@lemmy.world
      link
      fedilink
      English
      arrow-up
      10
      arrow-down
      1
      ·
      1 day ago

      It’s almost as if nuance and context matters.

      How much energy does a human use to write a Wikipedia article? Now also measure the accuracy and completeness of the article.

      Now do the same for AI.

      Objective metrics are what is missing, because much of what we hear is “phd-level inference” and it’s still just a statistical, probabilistic generator.

      https://www.pcmag.com/news/with-gpt-5-openai-promises-access-to-phd-level-ai-expertise

    • snooggums@lemmy.world
      link
      fedilink
      English
      arrow-up
      4
      ·
      1 day ago

      It is completely useless as presented by the major players, who are trying to jam in models that try to do everything at the same time, and that is what we always talk about when discussing AI.

      We aren’t talking about focused implementations that are limited to a certain set of data or designed for specific purposes. That is why we don’t need nuance, although the reminder that we aren’t talking about smaller-scale AI used by humans as tools is nice once in a while.