Off-and-on trying out an account over at @[email protected] due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 36 Posts
  • 4.53K Comments
Joined 2 years ago
Cake day: October 4th, 2023

  • I believe that “older” mods can remove other mods, same as on Reddit, though I’ve never tried. That is, mods who show up higher on the list of mods in the right-hand sidebar in the Lemmy Web UI for the community.

    Or instance admins on the instance where the community lives. They probably won’t get involved unless the mod is violating the rules they’ve set for their instance. Your idea of what constitutes acceptable behavior for the mod and their idea may or may not be the same.

    You’d have to talk to either those “more senior” mods or the instance admins and convince one of them that the mod shouldn’t be a mod for that community.

    The only alternative is to go create a competing community elsewhere and draw users away.



  • Unless you have some really serious hardware, 24 billion parameters is probably the maximum that would be practical for self-hosting on a reasonable hobbyist set-up.

    Eh…I don’t know if you’d call it “really serious hardware”, but when I picked up my 128GB Framework Desktop, it was $2k (without storage), and that box is often described as being aimed at the hobbyist AI market. That’s pricier than most video cards, but an AMD Radeon RX 7900 XTX GPU was north of $1k, an Nvidia RTX 4090 was about $2k, and it looks like the Nvidia RTX 5090 is presently something over $3k (and rising) on eBay, well over MSRP. None of those GPUs are dedicated hardware aimed at AI compute, just high-end cards aimed at playing games that people have used for AI stuff.

    I think that the largest LLM I’ve run on the Framework Desktop was a 106-billion-parameter GLM model at Q4_K_M quantization. It was certainly usable, and I wasn’t trying to squeeze the largest possible model onto the thing; I’m sure that one could run substantially-larger models. (Rough arithmetic on what that model occupies is in the sketch below.)

    EDIT: Also, some of the newer LLMs are MoE-based, and for those, it’s not necessarily unreasonable to offload expert layers to main memory. If a particular expert isn’t being used, it doesn’t need to live in VRAM. That relaxes some of the hardware requirements, from needing a ton of VRAM to just needing a fair bit of VRAM plus a ton of main memory.
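
    For scale, a back-of-envelope sketch of that 106B-at-Q4_K_M figure (my own rough numbers, not anything authoritative: Q4_K_M works out to roughly 4.85 bits per weight, and this ignores KV cache and runtime overhead):

       /* Rough footprint of quantized model weights: params * bits-per-weight / 8.
        * The 4.85 bits/weight figure for Q4_K_M is an approximation. */
       #include <stdio.h>

       int main(void) {
           const double params = 106e9; /* 106-billion-parameter GLM model */
           const double bpw    = 4.85;  /* approximate bits/weight at Q4_K_M */
           double gib = params * bpw / 8.0 / (1024.0 * 1024.0 * 1024.0);
           printf("Approximate weight footprint: %.0f GiB\n", gib); /* ~60 GiB */
           return 0;
       }

    Call it roughly 60 GiB of weights: comfortable in 128GB of unified memory, but a stretch for any single consumer GPU.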


  • Are Motorola ok?

    Depends on what you value in a phone. Like, I like a vanilla OS, a lot of memory, large battery, and a SIM slot. I don’t care much about the camera quality and don’t care at all about size and weight (in fact, if someone made a tablet-sized phone, I’d probably switch to that). That’s almost certainly not the mix that some other people want.

    There’s some phone-comparison website I was using a while back that has a big database of phones and lets you compare them and search by specification.

    goes looking

    This one:

    https://www.phonearena.com/phones



  • That’s why they have the “Copilot+ PC” hardware requirement: they’re using an NPU on the local machine.

    searches

    https://learn.microsoft.com/en-us/windows/ai/npu-devices/

    Copilot+ PCs are a new class of Windows 11 hardware powered by a high-performance Neural Processing Unit (NPU) — a specialized computer chip for AI-intensive processes like real-time translations and image generation — that can perform more than 40 trillion operations per second (TOPS).

    It’s not…terribly beefy. Like, I have a Framework Desktop with an APU and 128GB of memory that schlorps down 120W or something, and it substantially outdoes what you’re going to do on a laptop. And that in turn is weaker computationally than something like the big Nvidia hardware going into datacenters. (Rough numbers below.)

    But it is doing local computation.
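
    For scale, here’s a compute-bound sketch of what that 40 TOPS floor means (my own toy numbers: a dense transformer needs on the order of 2 operations per parameter per generated token, and the 7B model size is just a hypothetical):

       /* Back-of-envelope: best-case time per generated token on a 40-TOPS NPU,
        * assuming ~2 ops per parameter per token for a dense model. Ignores
        * memory bandwidth, which is usually the real bottleneck. */
       #include <stdio.h>

       int main(void) {
           const double params = 7e9;   /* hypothetical 7B-parameter dense model */
           const double tops   = 40e12; /* Copilot+ PC NPU floor: 40 TOPS */
           double ms_per_token = 2.0 * params / tops * 1000.0;
           printf("Compute-bound floor: %.2f ms/token\n", ms_per_token); /* ~0.35 ms */
           return 0;
       }

    That floor sounds fast, but token generation is normally memory-bandwidth-bound, and a thin laptop has nothing like the memory bandwidth of a 128GB APU box, let alone datacenter hardware. Hence “not terribly beefy”.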


  • I’m kind of more-sympathetic to Microsoft than to some of the other companies involved.

    Microsoft is trying to leverage the Windows platform that they control to push local LLM use. I’m not at all sure that there’s actually enough memory out there to do that, or that it’s cost-effective to put a ton of memory and compute capacity in everyone’s home rather than time-sharing hardware in datacenters. Nor am I sold that laptops — which many “Copilot+ PCs” are — are a fantastic place to be doing a lot of heavyweight parallel compute.

    But…from a privacy standpoint, I kind of would like local LLMs to at least be available, even if they aren’t as affordable as cloud-based stuff. And Microsoft is at least supporting that route. A lot of companies are going to be oriented towards just doing AI stuff in the cloud.


  • You only need one piece of (timeless) advice regarding what to look for, really: if it looks too good to be true, it almost certainly is. Caveat emptor.

    I mean…normally, yes, but the situation has been changing so radically in such a short period of time that the market hasn’t stabilized yet, and it probably is possible to get some bonkers deals in various niches.

    Like, a month and a half back, in early December, when prices had only been going up like crazy for a little while, I was posting links to tiny retailers that still had RAM in stock at pre-price-increase rates, which I could find on Google Shopping. IIRC the University of Virginia bookstore was one, since they didn’t check that purchasers were actually students. I warned that they’d probably be cleaned out as soon as scalpers got to them, and that anyone who wanted memory should probably grab it ASAP. A few days before that, a small PC-parts store in Hawaii had some too (though it was out of stock by the time I mentioned the bookstore).

    That’s not to disagree with @[email protected]’s point that this was an awfully sketchy source, or with your point that scavenging components off even a non-scam piece of secondhand non-functional hardware is risky. But in times of rapid change, it’s not impossible to find deals. In fact, it’s various parties doing exactly that which causes prices to stabilize — anyone selling memory for way below market price is going to have scalpers grab it.


  • I don’t think that memory manufacturers are in some plot to promote SaaS. It’s just that they can make a ton of money off the current demand from the AI buildout, and they’re trying to make as much as they can in the limited window that they have. All kinds of industries are going to be collateral damage for a while. It doesn’t require a more complicated explanation.

    Michael Crichton had a way of putting “it’s not about you” in Sphere that I remember liking.

    searches

    “I’m afraid that’s true,” Norman said. “The sphere was built to test whatever intelligent life might pick it up, and we simply failed that test.”

    “Is that what you think the sphere was made for?” Harry said. “I don’t.”

    “Then what?” Norman said.

    “Well,” Harry said, “look at it this way: Suppose you were an intelligent bacterium floating in space, and you came upon one of our communication satellites, in orbit around the Earth. You would think, What a strange, alien object this is, let’s explore it. Suppose you opened it up and crawled inside. You would find it very interesting in there, with lots of huge things to puzzle over. But eventually you might climb into one of the fuel cells, and the hydrogen would kill you. And your last thought would be: This alien device was obviously made to test bacterial intelligence and to kill us if we make a false step.

    “Now, that would be correct from the standpoint of the dying bacterium. But that wouldn’t be correct at all from the standpoint of the beings who made the satellite. From our point of view, the communications satellite has nothing to do with intelligent bacteria. We don’t even know that there are intelligent bacteria out there. We’re just trying to communicate, and we’ve made what we consider a quite ordinary device to do it.”

    Like, two years back, there was a glut of memory in the market. Samsung was losing a lot of money. They weren’t losing money back then because they were trying to promote personal computer ownership any more than they’re trying to deter personal computer ownership in 2026. It’s just that demand can gyrate more-rapidly than production capacity can adjust.


  • I’m not really a hardware person, but purely in terms of logic gates, making a memory circuit isn’t going to be hard. I mean, a lot of chips contain internal memory, and I’m sure that anyone who can fabricate a chip can fabricate a design that includes some amount of memory. (There’s a toy illustration at the end of this comment.)

    For PC use, there’s also going to be some interface hardware. Dunno how much sophistication is present there.

    I’m assuming that the catch is that it’s not trivial to make something competitive with what the established PC-memory manufacturers offer in price, density, and speed. Like, I don’t think that getting a microcontroller with 32 kB of onboard memory is going to be a problem. But that doesn’t really replace the kind of stuff these guys are making.

    EDIT: The other big thing to keep in mind is that this is a short-term problem, even if it’s a big problem. I mean, the problem isn’t the supply of memory over the long term. The problem is the supply of memory over the next couple of years. You can’t just build a factory and hire a workforce and get production going the moment that someone decides that they want several times more memory than the world has been producing to date.

    So what’s interesting is really going to be solutions that can produce memory in the near term. Like, I have no doubt that given years of time, someone could set up a new memory manufacturer and facilities. But to get (scaled-up) production in a year, say? Fewer options there.
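
    On the “memory out of logic gates” point above, a toy illustration (mine, and purely illustrative): the textbook one-bit memory cell is an SR latch, just two cross-coupled NOR gates.

       /* Toy one-bit memory cell: an SR latch built from two cross-coupled NOR
        * gates. The "hold" steps are the memory: with both inputs low, the
        * cross-coupled gates keep re-asserting whatever was last written. */
       #include <stdio.h>

       static int nor(int a, int b) { return !(a || b); }

       int main(void) {
           int q = 0, qn = 1;        /* latch output and its complement */
           int inputs[4][2] = {      /* {S, R} pairs driving the latch */
               {1, 0},               /* set   -> Q becomes 1 */
               {0, 0},               /* hold  -> Q stays 1 */
               {0, 1},               /* reset -> Q becomes 0 */
               {0, 0},               /* hold  -> Q stays 0 */
           };
           for (int i = 0; i < 4; i++) {
               int s = inputs[i][0], r = inputs[i][1];
               for (int pass = 0; pass < 2; pass++) { /* let the gates settle */
                   q  = nor(r, qn);
                   qn = nor(s, q);
               }
               printf("S=%d R=%d -> Q=%d\n", s, r, q);
           }
           return 0;
       }

    Real DRAM stores bits differently (one capacitor per cell), but the point stands: storage falls out of very ordinary circuitry, and the hard part is density, speed, and cost at scale.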


  • https://stackoverflow.com/questions/30869297/difference-between-memfree-and-memavailable

    Rik van Riel’s comments when adding MemAvailable to /proc/meminfo:

    /proc/meminfo: MemAvailable: provide estimated available memory

    Many load balancing and workload placing programs check /proc/meminfo to estimate how much free memory is available. They generally do this by adding up “free” and “cached”, which was fine ten years ago, but is pretty much guaranteed to be wrong today.

    It is wrong because Cached includes memory that is not freeable as page cache, for example shared memory segments, tmpfs, and ramfs, and it does not include reclaimable slab memory, which can take up a large fraction of system memory on mostly idle systems with lots of files.

    Currently, the amount of memory that is available for a new workload, without pushing the system into swap, can be estimated from MemFree, Active(file), Inactive(file), and SReclaimable, as well as the “low” watermarks from /proc/zoneinfo.

    However, this may change in the future, and user space really should not be expected to know kernel internals to come up with an estimate for the amount of free memory.

    It is more convenient to provide such an estimate in /proc/meminfo. If things change in the future, we only have to change it in one place.

    Looking at the htop source:

    https://github.com/htop-dev/htop/blob/main/MemoryMeter.c

       /* we actually want to show "used + shared + compressed" */
       double used = this->values[MEMORY_METER_USED];
       if (isPositive(this->values[MEMORY_METER_SHARED]))
          used += this->values[MEMORY_METER_SHARED];
       if (isPositive(this->values[MEMORY_METER_COMPRESSED]))
          used += this->values[MEMORY_METER_COMPRESSED];
    
       written = Meter_humanUnit(buffer, used, size);
    

    It’s adding used, shared, and compressed memory to get the amount actually tied up, but it disregards cached memory entirely, which, per the kernel comment above, is problematic: some of that cache may not actually be reclaimable.

    top (and free, from the same procps-ng project), on the other hand, use the kernel’s MemAvailable directly; here’s the relevant line from free.c:

    https://gitlab.com/procps-ng/procps/-/blob/master/src/free.c

    	printf(" %11s", scale_size(MEMINFO_GET(mem_info, MEMINFO_MEM_AVAILABLE, ul_int), args.exponent, flags & FREE_SI, flags & FREE_HUMANREADABLE));
    

    In short: you probably want to trust /proc/meminfo’s MemAvailable (which is what top will show), and htop’s used figure is probably misleadingly low.
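
    If you want that number programmatically, a minimal sketch (assuming Linux; just scrape the MemAvailable line rather than reconstructing the estimate from its parts):

       /* Print the kernel's own availability estimate from /proc/meminfo,
        * per the rationale in the commit message quoted above. */
       #include <stdio.h>

       int main(void) {
           FILE *f = fopen("/proc/meminfo", "r");
           if (!f) { perror("/proc/meminfo"); return 1; }

           char line[256];
           unsigned long long kb;
           while (fgets(line, sizeof line, f)) {
               if (sscanf(line, "MemAvailable: %llu kB", &kb) == 1) {
                   printf("MemAvailable: %llu kB (%.1f GiB)\n",
                          kb, kb / (1024.0 * 1024.0));
                   break;
               }
           }
           fclose(f);
           return 0;
       }

    If the kernel’s internal accounting changes again, this keeps working, which is exactly the point of exposing the estimate in one place.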