• brucethemoose@lemmy.world · +29/-8 · 7 hours ago (edited)

    If you think of LLMs as extra teammates, there’s no fun in managing them either. Nurturing the personal growth of an LLM is an obvious waste of time. Micromanaging them, watching to preempt slop and derailment, is frustrating and rage-inducing.

    Finetuning LLMs for niche tasks is fun. It’s explorative, creative, cumulative, and scratches a ‘must optimize’ part of my brain. It feels like you’re actually building and personalizing something, and it teaches you how they work and where they fail, like making any good program or tool. It feels like you’re part of a niche ‘old internet’ hacking community, not in the maw of Big Tech.
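    For the curious, a finetune like that can be surprisingly small. Here’s a rough LoRA sketch with Hugging Face peft + trl; the model name, dataset path, and hyperparameters are placeholders, and the exact trl API shifts between versions:

    ```python
    # Rough LoRA finetuning sketch; placeholders throughout, check your peft/trl versions.
    from datasets import load_dataset
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    # Hypothetical JSONL of examples for the niche task (e.g. a "text" column per row).
    dataset = load_dataset("json", data_files="niche_task.jsonl", split="train")

    # Low-rank adapters on the attention projections keep VRAM needs modest.
    peft_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )

    trainer = SFTTrainer(
        model="Qwen/Qwen2.5-7B-Instruct",  # any open-weights base you like
        train_dataset=dataset,
        peft_config=peft_config,
        args=SFTConfig(
            output_dir="finetune-out",
            num_train_epochs=1,
            per_device_train_batch_size=1,
            gradient_accumulation_steps=8,
        ),
    )
    trainer.train()
    ```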

    Using proprietary LLMs over APIs is indeed soul-crushing. IMO this is why devs who have to use LLMs should strive to run finetunable, open-weights models where they work, even if they aren’t as good as Claude Code.
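    Most local servers (llama.cpp’s llama-server, vLLM, and friends) expose an OpenAI-compatible endpoint, so swapping one in can be as small as changing the base URL. A minimal sketch, with the port and model name as assumptions for whatever you run locally:

    ```python
    # Point the standard openai client at a local open-weights server instead of a proprietary API.
    from openai import OpenAI

    # llama-server defaults to port 8080; the API key is ignored by most local servers.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed-locally")

    resp = client.chat.completions.create(
        model="local-model",  # whatever name your server registers
        messages=[{"role": "user", "content": "Summarize what this diff changes."}],
    )
    print(resp.choices[0].message.content)
    ```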

    But I think most don’t know they exist, or they had a bad experience with ollama’s terrible defaults and assume that’s what the open-model ecosystem is like.
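    To be fair to ollama, the worst offenders (like the tiny default context window) can be overridden per request through its REST API; roughly like this, with the model name and option values as illustrations:

    ```python
    # Override ollama's conservative defaults per request via its native API
    # (model name and option values are illustrative).
    import requests

    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "qwen2.5:14b",
            "messages": [{"role": "user", "content": "Explain this stack trace."}],
            "options": {
                "num_ctx": 16384,   # the out-of-the-box context window is far smaller
                "temperature": 0.7,
            },
            "stream": False,
        },
    )
    print(resp.json()["message"]["content"])
    ```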

    • BlameThePeacock@lemmy.ca · +11/-9 · 7 hours ago

      Improving your input, including the system message, can also be part of that. There are plenty of optimizations available for these systems that most people aren’t good at yet.

      It’s like watching Grandma google “Hi, I’d like a new shirt” back in the day and then having her complain that she’s getting absolutely terrible search results.
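      Even something as simple as a real system message plus a structured request goes a long way. A minimal sketch (model and prompt are just placeholders; point it at whatever endpoint you actually use):

      ```python
      # A concrete system message plus a structured request, instead of a bare one-liner
      # (model and prompt are placeholders).
      from openai import OpenAI

      client = OpenAI()  # or base_url pointed at a local open-weights server

      resp = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[
              {
                  "role": "system",
                  "content": "You are a terse senior reviewer. Reply only with a "
                             "bulleted list of concrete issues.",
              },
              {
                  "role": "user",
                  "content": "Review this function for edge cases:\n\n"
                             "def parse_port(s):\n    return int(s)",
              },
          ],
          temperature=0.2,
      )
      print(resp.choices[0].message.content)
      ```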

      • brucethemoose@lemmy.world · +9/-2 · 7 hours ago (edited)

        Mmmmm. Pure “prompt engineering” feels soulless to me. And you have zero control over the endpoint, so changes on their end can break your prompt at any time.

        Messing with logprobs and raw completion syntax was fun, but the US proprietary models took that away. Even sampling is kind of restricted now, and primitive compared to what’s been developed in open source.
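        A local llama.cpp server still exposes all of that. A quick sketch against its native /completion endpoint; parameter names are from its server docs, so double-check your build:

        ```python
        # Raw completion against a local llama.cpp server: open-source samplers (min_p, top_k)
        # plus per-token probabilities. Port and parameter names assume a default recent build.
        import requests

        resp = requests.post(
            "http://localhost:8080/completion",
            json={
                "prompt": "The capital of France is",
                "n_predict": 8,
                "temperature": 0.8,
                "top_k": 40,
                "min_p": 0.05,  # min-p sampling, one of the samplers developed in open source
                "n_probs": 5,   # return the top-5 token probabilities at each step
            },
        )
        data = resp.json()
        print(data["content"])
        for tok in data.get("completion_probabilities", []):
            print(tok)
        ```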