Hi all, I am quite an old fart, so I only recently got excited about self-hosting an AI, some LLM…

What I want to do is:

  • chat with it
  • eventually integrate it into other services, where needed

I read about Ollama, but it’s all unclear to me.

Where do I start, preferably with containers (but “bare metal” is also fine)?

(I already have a Linux server rig with all the good stuff on it, from Immich to Forgejo to the arrs and more, reverse proxy, WireGuard and the works. I am looking for input on AI/LLM, what to self-host and such, not general self-hosting hints.)

  • ragingHungryPanda@piefed.keyboardvagabond.com
    1 day ago

    Not for LLMs. I have a 16GB GPU, and even what I can fit in there just isn’t really enough to be useful. It can still do things, and quickly enough, but I can’t fit models that are large enough to be useful.

    I also don’t know if your GPU is compatible with ROCm or not.

    • eleitl@lemmy.zip
      12 hours ago

      The GPU used to be, but they dropped ROCm support for Radeon V and VII some time ago. I'll have to look at that Strix Halo/AI Max thing, I guess.