• Stefan_S_from_H@discuss.tchncs.deOP
    9 days ago

    A few months ago, someone asked Google for countries that start with an H. In German.

    It listed Hungary, which is called “Ungarn” in German. OK, an understandable mistake. But then it gave the additional information that Hungary sometimes gets called Holland.

    Two things upset me:

    1. People believe the answers.
    2. Nobody is talking about all the stupid mistakes AI is making. It should have all stopped after LLM AIs thought blueberry has 3 Bs in it.
    • FishFace@piefed.social
      9 days ago

      Nobody is talking about all the stupid mistakes AI is making. It should have all stopped after LLM AIs thought blueberry has 3 Bs in it.

      I never hear about anything else

    • TheRealKuni@piefed.social
      9 days ago

      LLMs are really bad with letters, and in my limited understanding that’s because they don’t see words as strings of letters, they see them as tokens. It’s all numbers by the time the LLM is processing it.
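To make the token point concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary and the IDs in `TOY_VOCAB` are invented for illustration, and real LLM tokenizers learn much larger vocabularies from data, so treat this as a picture of the idea rather than how any particular model works.

```python
# Toy greedy longest-match tokenizer.
# ASSUMPTION: TOY_VOCAB and its IDs are made up for this example.
TOY_VOCAB = {
    "blue": 101, "berry": 102,
    "b": 1, "l": 2, "u": 3, "e": 4, "r": 5, "y": 6,
}

def tokenize(word, vocab=TOY_VOCAB):
    """Split a word into token IDs, always taking the longest match."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no token for {word[i]!r}")
    return ids

token_ids = tokenize("blueberry")
letter_count = "blueberry".count("b")

print(token_ids)     # [101, 102]
print(letter_count)  # 2
```

In this toy setup the model-side view of "blueberry" is just two numbers, and neither of them says anything about how many times "b" occurs. The count is only easy when you still have the string of letters.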

      • exasperation@lemmy.dbzer0.com
        8 days ago

        We think in terms of tokens, too, but we have the ability to look under the hood at some of how our knowledge is constructed.

        As typical literate English speakers, we seamlessly pronounce certain letter combinations differently from their component parts (like ch, sh, ph, or looking ahead to see whether the syllable ends in an E to decide how to pronounce the vowel in the middle). Then, entire words or phrases have a single meaning that doesn’t get broken apart. Similarly, people who are fluent in multiple languages, including languages that use the same script (e.g., Latin letters), can look at the whole string of text to quickly figure out which language they’re reading, and consult that part of their knowledge base.

        And usually our brains process things completely separately from how we read or write text. Even the question of how many r’s are in “raspberry” requires us to go and count, because the answer isn’t inherent in the knowledge we have at the tip of our tongue. Someone can memorize a speech but not know how many times the word “the” appears in it, even if their knowledge contains all the information necessary to answer the question.
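A small sketch of that "go and count" step: the text holds everything needed to answer, but the answer only appears once you make an explicit pass over it. The sentence in `speech` is a made-up stand-in for a memorized speech.

```python
from collections import Counter

# ASSUMPTION: a made-up sentence standing in for a memorized speech.
speech = "the quick brown fox jumps over the lazy dog while the cat sleeps"

# Knowing the text is not the same as knowing the counts:
# both answers require walking over the letters or words.
r_count = "raspberry".count("r")
the_count = Counter(speech.lower().split())["the"]

print(r_count)    # 3
print(the_count)  # 3
```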

        Even if we are actively thinking in the context of how words are constructed, like doing crosswords, these things tend to be more fun when mixed with other modes of thinking: Wordle’s mix of both logic and spelling, a classic crossword’s clever style of hints, etc.

        Manipulation of letters is simply one mode of thinking. We’re really good at seamlessly switching between modes.