Off-and-on trying out an account over at @[email protected] due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 29 Posts
  • 4.22K Comments
Joined 2 years ago
Cake day: October 4th, 2023

  • The malware continuously monitors its access to GitHub (for exfiltration) and npm (for propagation). If an infected system loses access to both channels simultaneously, it triggers immediate data destruction on the compromised machine. On Windows, it attempts to delete all user files and overwrite disk sectors. On Unix systems, it uses shred to overwrite files before deletion, making recovery nearly impossible.

    shred is intended to destroy the actual on-disk contents by overwriting a file’s data in place before unlinking it. However, shred isn’t as effective on journalled or copy-on-write filesystems, because writes there aren’t guaranteed to land on the same on-disk blocks as the original data. Normally, ext3, ext4, and btrfs fall into that category. Most people are not running ext2 in 2025, save maybe on their /boot partition, if they have that as a separate partition.
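    For illustration, here’s a minimal C sketch of the “overwrite in place, then unlink” idea that shred implements. This is a simplified stand-in, not shred’s actual code: a single pass of zeros rather than shred’s multiple random passes. On the filesystems above, the kernel may still redirect these writes to fresh blocks, which is exactly the caveat.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* One zero-fill pass over the file’s current contents, then unlink. */
    static int overwrite_and_unlink(const char *path)
    {
        int fd = open(path, O_WRONLY);
        if (fd < 0)
            return -1;

        struct stat st;
        if (fstat(fd, &st) < 0) {
            close(fd);
            return -1;
        }

        char buf[4096];
        memset(buf, 0, sizeof buf);

        for (off_t done = 0; done < st.st_size; ) {
            size_t n = sizeof buf;
            if (st.st_size - done < (off_t)n)
                n = (size_t)(st.st_size - done);
            ssize_t w = write(fd, buf, n);
            if (w <= 0) {
                close(fd);
                return -1;
            }
            done += w;
        }

        fsync(fd);  /* force the overwrite down to storage before unlinking */
        close(fd);
        return unlink(path);
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s FILE\n", argv[0]);
            return 1;
        }
        return overwrite_and_unlink(argv[1]) == 0 ? 0 : 1;
    }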



  • If you mean distributing inference across many machines, none of which could individually handle a large model, then with today’s models that’s not viable with reasonable performance. The problem is that you need a lot of bandwidth between layers; a lot of data moves, and each token has to traverse the layers in order (see the sketch below). When you cluster current systems, you tend to use specialized, high-bandwidth links.

    It might theoretically be possible to build models more amenable to this sort of thing, with small parts of the model running on nodes that need little data interchange between them. But until they’re built, it’s hard to say.

    I’d also be a little leery of how energy-efficient such a thing would be, especially if you want to use CPUs, which are probably more amenable to being run in a shared fashion than GPUs. Just using CPU time “in the background” on a system running other tasks also probably won’t work well, because the limiting factor isn’t heavy crunching on a small amount of data (where a processor can make use of idle cores without much impact on other tasks) but bandwidth to memory, which is a bottleneck for the whole system. There are also some fairly substantial memory demands, unless you can also get model size way down.
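    As a rough back-of-the-envelope (all numbers here are assumptions for illustration, loosely 70B-class, not measurements): a pipeline-style split ships one hidden-state vector per generated token across every layer boundary, and because generation is autoregressive, those hops happen in series.

    #include <stdio.h>

    int main(void)
    {
        /* Assumed, illustrative model dimensions. */
        double hidden_dim  = 8192;   /* model width */
        double bytes_elem  = 2;      /* fp16 activations */
        double boundaries  = 79;     /* splits between 80 layers */
        double hop_latency = 0.001;  /* 1 ms per network hop: optimistic LAN */

        /* Data shipped across one boundary per generated token. */
        double per_hop = hidden_dim * bytes_elem;

        /* The hops are serial per token, so latency alone caps token rate. */
        double floor_s = boundaries * hop_latency;

        printf("per hop: %.0f KiB/token\n", per_hop / 1024.0);
        printf("latency floor: %.0f ms/token, i.e. at most %.1f tokens/s\n",
               floor_s * 1000.0, 1.0 / floor_s);
        return 0;
    }

    Even on an optimistic 1 ms LAN, that’s a hard cap of about 12 tokens/s before any compute happens at all; over internet-grade links with tens of milliseconds per hop, it drops to well under a token per second, which is why clustered setups use specialized high-bandwidth, low-latency interconnects.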



  • I wonder how much exact duplication each process has?

    https://www.kernel.org/doc/html/latest/admin-guide/mm/ksm.html

    Kernel Samepage Merging

    KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y, added to the Linux kernel in 2.6.32. See mm/ksm.c for its implementation, and http://lwn.net/Articles/306704/ and https://lwn.net/Articles/330589/

    KSM was originally developed for use with KVM (where it was known as Kernel Shared Memory), to fit more virtual machines into physical memory, by sharing the data common between them. But it can be useful to any application which generates many instances of the same data.

    The KSM daemon ksmd periodically scans those areas of user memory which have been registered with it, looking for pages of identical content which can be replaced by a single write-protected page (which is automatically copied if a process later wants to update its content). The number of pages that the KSM daemon scans in a single pass and the time between passes are configured via the sysfs interface under /sys/kernel/mm/ksm/.

    KSM only operates on those areas of address space which an application has advised to be likely candidates for merging, by using the madvise(2) system call:

    int madvise(void *addr, size_t length, int advice);  /* advice = MADV_MERGEABLE */
    

    One imagines that one could maybe make a library interposer to induce use of that.
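    A sketch of that idea (hypothetical and untested; assumes Linux plus glibc, and only interposes mmap, which is where large allocations, including big malloc requests, end up): an LD_PRELOAD shim that marks anonymous private mappings as mergeable.

    /* ksm_preload.c (hypothetical name): mark anonymous private mappings
       MADV_MERGEABLE so that ksmd will consider them for merging. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <sys/mman.h>
    #include <sys/types.h>

    typedef void *(*mmap_fn)(void *, size_t, int, int, int, off_t);
    static mmap_fn real_mmap;

    void *mmap(void *addr, size_t length, int prot, int flags,
               int fd, off_t offset)
    {
        if (!real_mmap)
            real_mmap = (mmap_fn)dlsym(RTLD_NEXT, "mmap");

        void *p = real_mmap(addr, length, prot, flags, fd, offset);

        /* KSM merges only anonymous pages, so only advise those. */
        if (p != MAP_FAILED && (flags & MAP_ANONYMOUS) && (flags & MAP_PRIVATE))
            madvise(p, length, MADV_MERGEABLE);

        return p;
    }

    Build and use it along these lines, remembering that ksmd itself must be enabled (echo 1 > /sys/kernel/mm/ksm/run) before any merging happens:

    gcc -shared -fPIC ksm_preload.c -o ksm_preload.so -ldl
    LD_PRELOAD=./ksm_preload.so ./some_program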



  • I’ve also noticed that if you want a chest smaller than DDD, it’s almost impossible with some models — unless you specify that they are a gymnast.

    That points at another present weakness of generative image AI: humans have an intuitive understanding of relative terms and can iterate on them.

    So, it’s pretty easy for me to point at an image and ask a human artist to “make the character’s breasts larger” or “make the character’s breasts smaller”. A human artist can look at an image, form a mental model of the image, and produce a new image in their head relative to the existing one by using my relative terms “larger” and “smaller”. They can then go create that new image. Humans, with their sophisticated mental model of the world, are good at that.

    But we haven’t trained an understanding of relative relationships into diffusion models today, and doing so would probably require a more sophisticated — maybe vastly more sophisticated — type of AI. “Larger” and “smaller” aren’t really usable as things stand today. Because breast size is something that people often want to muck with, people have trained models on a static list of danbooru tags for breast sizes, and models trained on those can use them as inputs, but even then, it’s a relatively-limited capability. And for most other properties of a character or thing, even that’s not available.

    For models which support it, prompt term weighting can sometimes provide a very limited analog to this. Instead of saying “make the image less scary”, maybe I “decrease the weight of the token ‘scary’ by 0.1”. But that doesn’t work with all relationships, and the outcome isn’t always fantastic even then.


  • There are also things that present-day generative AI is not very good at in existing fields, and I’m not sure how easy it will be to address some of those. So, take the furry artist. It looks like she made a single digitally-painted portrait of a tiger in a suit, a character that she invented. That’s something that probably isn’t all that hard to do with present-day generative AI. But try using existing generative AI to create several different views of the same invented character, presented consistently, and that’s a weak point. That may require very deep and difficult changes on the technology front to try to address.

    I don’t feel that a lot of this has been hashed out, partly because a lot of people, even in the fields affected, don’t have a great handle on what the weaknesses are, which of them might viably be remedied, and how, on the AI front. It would be interesting to run some competitions in various areas and see what a competent person in a field could do versus someone competent in using generative AI. It’ll probably change over time, and techniques will evolve.

    There are areas where generative AI for images has both surpassed what I expected and underperformed. I was pretty impressed with its ability to capture the elements of what creates a “mood”, say, and make an image sad or cheerful. I was very surprised at how effective current image generation models were, given their limited understanding of the world, at creating things “made out of ice”. But I was surprised at how hard it was to get any generative AI model I’ve tried to generate drawings containing crosshatching, which is something that plenty of human artists do just fine. Is it easy to address that? Maybe. I think I could give some pretty reasonable explanations as to why consistent characters are hard, but I don’t really feel like I could offer a convincing argument about why crosshatching is, don’t really understand why models do poorly with it, and thus, I’ve no idea how hard it might be to remedy that.

    Some fantastic images are really easy to create with generative image AI. Some are surprisingly difficult. To name two things that I recall [email protected] regulars have run into over the past couple years: trying to create colored car treads (it looks like black treads are closely associated with the “tire” token) and trying to create centaurs (generative AI models want to do horses or people, not hybrids). The weaknesses may be easy to remedy or hard, but they won’t be the same weaknesses that humans have; these are things that are easy for a human. Ditto for strengths — it’s relatively easy for generative AI to create extremely detailed images (“maximalist” was a popular token that I recall seeing in many early prompts) or to replicate the look of natural media that are very difficult or time-consuming to work with in the real world, and those are areas that aren’t easy for human artists.




  • 4chan’s position is that they aren’t doing business in the UK, which is why they’re disregarding the UK regulator’s fines. The UK regulator might be able to block them in the UK if the UK rolls out a Great Firewall of the UK, say, a la China, but it probably can’t get the US to enforce rulings against them. And, I’d add, such a Great British Firewall is going to have limited impact unless the Brits also ban VPNs in the UK that don’t do such blocking internal to the VPN, and additionally block external VPNs, a la Russia.

    “In the same way, lemmy.today is doing business in the EU.”

    Very unlikely, in the eyes of the US court system. They have no EU physical presence, and they aren’t running advertising targeted at EU people.

    “Facebook”

    Yeah, now they might be affected, but they do operate in the EU.

    EDIT: For context, last year, this happened:

    https://www.nbcnews.com/news/world/russia-fines-google-20-decillion-world-gdp-youtube-kremlin-war-ukraine-rcna178172

    Russia fines Google more than the world’s entire GDP

    Russian courts can hand down whatever rulings they want, but they don’t really have an effect elsewhere unless other legal systems view them as having jurisdiction.

    Iran has the death penalty for blasphemy. But the US isn’t going to enforce rulings on blasphemy unless it views Iran as having jurisdiction over the person posting said content.



  • I used Reddit for a long time, since the extremely early days of the site, back when most of the content was posted by Reddit staff and there was really just one page.

    While I wasn’t enthralled with the move from old.reddit.com to the new reddit.com, the site was at least still accessible via the old interface, absent a minor quirk here and there in how Markdown was interpreted, and different ways of customizing subreddit appearance. That wasn’t enough to cause me to leave.

    What did it for me: I expected that when they moved from their growth phase to their monetization phase, they’d make some changes I wouldn’t like. But I didn’t expect them to end access for third-party clients, and that was not okay with me.



  • Micron is one of the “Big Three” DRAM manufacturers.

    Crucial is their “sell directly to consumers” brand.

    https://netvaluator.com/en/top-10-ram-manufacturers-by-market-share/

    Micron Technology stands as the third giant, with a market share close to 20%, or about 23 billion USD in DRAM revenue. Unlike Samsung and SK Hynix, Micron is headquartered in the United States, making it a critical supplier for Western markets. Its product portfolio covers both DRAM and NAND, giving it broader exposure to the memory industry.

    The company’s consumer-facing Crucial brand is well recognized among PC builders and gamers worldwide. Micron also plays a vital role in supplying DRAM for servers and AI, competing directly in the HBM space. Its strategy focuses on quality, diversification, and maintaining a stable supply chain for North America and Europe. As the only American giant, Micron is strategically important in the geopolitical landscape of semiconductors.