• Thorry@feddit.org
      50 upvotes · 9 hours ago

      Also, just because the code works doesn’t mean it’s good code.

      I had to review code the other day that was clearly created by an LLM. Two classes needed to talk to each other in a somewhat complex way. So I would expect one class to create some kind of request data object and submit it to the other class, which then returns some kind of response data object.

      What the LLM actually did was pretty shocking: it used reflection in one class to reach the private properties holding the required data inside the other class. It then just straight up stole the data and did the work itself (wrongly as well, I might add). I just about fell off my chair when I saw this.
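      To make the contrast concrete, here is a minimal sketch of both approaches. Python is my assumption (the original code's language isn't stated), and `Catalog`, `Checkout`, `PriceRequest` and `PriceResponse` are hypothetical names for illustration. Python's closest equivalent to reflecting into private members is `getattr` on a name-mangled attribute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceRequest:
    """What the caller wants to know (hypothetical example data)."""
    item: str
    quantity: int


@dataclass(frozen=True)
class PriceResponse:
    """What the owner of the data is willing to share."""
    item: str
    total_cents: int


class Catalog:
    """Owns the price data; the double underscore marks it private."""

    def __init__(self) -> None:
        self.__prices_cents = {"widget": 250}

    def quote(self, request: PriceRequest) -> PriceResponse:
        # The owning class does the work and can enforce its own rules.
        if request.quantity <= 0:
            raise ValueError("quantity must be positive")
        unit = self.__prices_cents[request.item]
        return PriceResponse(request.item, unit * request.quantity)


class Checkout:
    def __init__(self, catalog: Catalog) -> None:
        self._catalog = catalog

    def total_via_request(self, item: str, quantity: int) -> int:
        # Intended design: talk to Catalog through its public method.
        return self._catalog.quote(PriceRequest(item, quantity)).total_cents

    def total_via_private_access(self, item: str, quantity: int) -> int:
        # Anti-pattern: reach past the public API into name-mangled state.
        prices = getattr(self._catalog, "_Catalog__prices_cents")
        # Catalog's validation is bypassed, so bad input slips through.
        return prices[item] * quantity
```

      The second method "works" for the happy path, which is exactly why it gets past a quick manual test. It breaks as soon as `Catalog` renames its internals, and it skips whatever rules the owning class enforces.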

      So I asked the dev, he said he didn’t fully understand what the LLM did, he wasn’t familiar with reflection. But since it seemed to work in the few tests he did and the unit tests the LLM generated passed, he thought it would be fine.

      Also the unit tests were wrong, I explained to the dev that usually with humans it’s a bad idea to have the person who wrote the code also (exclusively) write the unit tests. Whenever possible have somebody else write the unit tests, so they don’t have the same assumptions and blind spots. With LLMs this is doubly true, it will just straight up lie in the unit tests. If they aren’t complete nonsense to begin with.

      I swear to the gods, LLMs don’t save time or money, they just give the illusion they do. Some task of a few hours will take 20 min and everyone claps. But then another task takes twice as long and we just don’t look at that. And the quality suffers a lot, without anyone really noticing.

      • Kissaki@feddit.org
        2 upvotes · 47 minutes ago

        > So I asked the dev, he said he didn’t fully understand what the LLM did, he wasn’t familiar with reflection.

        Big baffling facepalm moment.

        If they at least prefixed the changeset description with that, it’d be easier to interpret and assess.

      • criss_cross@lemmy.world
        1 upvote · 50 minutes ago

        They’ve been great for me at optimizing bite-sized annoying tasks. They’re really bad at doing anything beyond that. Like astronomically bad.

        • Kissaki@feddit.org
          3 upvotes · edited · 45 minutes ago

          They did say why they’re doing it:

          > Whenever possible have somebody else write the unit tests, so they don’t have the same assumptions and blind spots.

          Did that not make sense to you?

          I usually wouldn’t do that, because it’s a bigger investment. But it certainly makes logical sense to me and is something teams can weigh and decide on.

      • airgapped@piefed.social
        7 upvotes · 6 hours ago

        Great description of a problem I’ve noticed with most LLM-generated code of any decent complexity. It will look fantastic at first, but you will be truly up shit creek by the time you realise it didn’t generate a paddle.

      • WaitThisIsntReddit@lemmy.world
        7 upvotes · 11 hours ago

        After a couple of agent iterations it will compile. It definitely won’t do what you wanted though, and if it does, it will do it in the dumbest way possible.

        • TORFdot0@lemmy.world
          8 upvotes · 1 downvote · edited · 11 hours ago

          Yeah, you can definitely bully AI into giving you something that will run if you yell at it long enough. I don’t have that kind of patience.

          Edit: typically I see it just silently dump errors to /dev/null if you complain about it not working lol
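          Roughly the in-code version of that pattern, as a hedged sketch: the function names are hypothetical and Python is just for illustration. The first function makes the error disappear, the second one surfaces it.

```python
import logging

logger = logging.getLogger(__name__)


def parse_port_swallowing(raw: str) -> int:
    # Anti-pattern: any failure is discarded and a made-up default is returned.
    try:
        return int(raw)
    except Exception:
        return 0


def parse_port(raw: str) -> int:
    # Catch only the expected error, log it, and re-raise with context.
    try:
        port = int(raw)
    except ValueError as exc:
        logger.error("invalid port %r", raw)
        raise ValueError(f"invalid port: {raw!r}") from exc
    return port
```

          With the first version the program "runs" on bad input and fails somewhere far away later. With the second, the bad input is reported where it happens.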

          • Darkenfolk@sh.itjust.works
            1 upvote · 9 hours ago

            And people say that AI isn’t humanlike. That’s peak human behavior right there, having to bother someone out of procrastination mode.

            The edit makes it even better. Sweeping things under the rug? Hell yeah!