Off-and-on trying out an account over at @[email protected] due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 27 Posts
  • 3.94K Comments
Joined 2 years ago
Cake day: October 4th, 2023




  • I have, in the past, kind of wished that settings and characters could not be copyrighted. I realize that there’s work that goes into creating each, but I think that we could still live in a world where those weren’t protected and interesting stuff still gets created. If that were to happen, then I agree, it’d be necessary to make it very clear who created what, since the setting and characters alone wouldn’t uniquely identify the source.

    Like, there are things like Greek mythology or the Robin Hood collection of stories, very important works of art from our past, that were created by many different unaffiliated people. They just couldn’t be created today with our modern stories, because the settings and characters would be copyrighted and most rightsholders don’t just offer a blanket grant of rights to use them.

    That’s actually one unusual and notable thing H.P. Lovecraft did — if you’ve ever seen stuff in the Cthulhu Mythos, that’s him. He encouraged anyone who wanted to do so to create stuff using his universe. That’s one reason why we have such a large collection of Lovecraftian stuff.

    But you can’t do that with, say, Star Wars or a lot of other beloved settings.






  • There is a class of products consisting of a hardware box that you push your company’s network traffic through as it moves between business locations, and that tries to accelerate this traffic. F5 is one manufacturer of them. One technique these boxes use is to hold private key material that lets them pretend to be the server at the other end of a TLS connection — TLS carries most of the “encrypted” traffic that you see on the Internet; if you go to an “https” URL in your Web browser, you’re talking TLS, using an encrypted connection. The box can then decode the traffic and use various caching and other modification techniques on the decoded information to reduce the amount of traffic moving across the link, reduce effective latency, avoid transferring duplicate information, etc. Once upon a time, when there was a lot less encrypted traffic in the world, you could just do this by working on cleartext data, but over time, network traffic has increasingly become encrypted, and many such techniques are impossible on encrypted traffic. So the boxes have to be able to break the encryption on the traffic to get at the cleartext material.

    The problem is that to let this box impersonate such a server so that it can get at the unencrypted traffic, they have to have a private key that permits them to impersonate the real server. Having access to this key is also interesting to an attacker, because it would similarly let them impersonate the real server, which would let them view or modify network traffic in transit. If one could push new, malicious software up to control these boxes, one could steal these keys, which would be of interest to attackers in attacking other systems.
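    To make the “why is key theft so bad” point concrete, here’s a toy Python sketch of challenge-response identity proof. An HMAC secret stands in for the private key (real TLS uses asymmetric signatures, not HMAC, and all names here are invented for illustration):

```python
import hashlib
import hmac
import os

# Toy model: a server proves its identity by answering a challenge in a
# way only the key holder can. The appliance has to hold this secret.

def prove_identity(key: bytes, challenge: bytes) -> bytes:
    """Answer a challenge; only someone holding `key` can do this."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

def verify(key: bytes, challenge: bytes, proof: bytes) -> bool:
    """Check a proof against the expected answer for `key`."""
    return hmac.compare_digest(prove_identity(key, challenge), proof)

server_key = os.urandom(32)   # the "private key" stored on the box
challenge = os.urandom(16)

# Anyone who exfiltrates server_key produces exactly the proof the real
# server would, so they can impersonate it to clients.
stolen_proof = prove_identity(server_key, challenge)
```

    The point is that possession of the key material is the whole identity: there’s no difference, from the client’s perspective, between the real server and whoever stole the key.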

    It sounds, to my brief skim, like attackers got control of the portion of F5’s internal network that is involved with building and distributing software updates to these boxes.

    The problem is that if you’re a sysadmin at, say, General Dynamics (an American defense contractor which, from a quick search, apparently uses these products from F5), you may have properly secured your servers internal to the company in all ways…but then the network admin basically let another box, which wasn’t properly secured, into the encrypted communications between your inter-office servers on the network. It could extract information from your encrypted communication streams, or modify the traffic. God only knows what important data you’ve been shoveling across those connections, or what you’ve done with information that you trusted to remain unmodified while crossing such a connection. It’d be a useful tool for an attacker to stick all sorts of new holes into customer networks, holes that are harder to root out.


  • It definitely is bad, but it may not be as bad as I thought above.

    It sounds like they might actually just be relying on certificates pre-issued by a (secured) CA for specific hosts to MITM Web traffic to those hosts, rather than being able to MITM all TLS traffic across the board (i.e. the appliance doesn’t get access to the internal CA’s private key). I’m not sure whether that’s the case — that’s just from a brief skim, and I’m not gonna come up to speed on their whole system for this comment — but if so, an attacker who got into a customer network and was able to see traffic (which compromising the F5 software updates would also potentially permit, unfortunately) could still probably attack a lot of traffic going to theoretically-secured internal servers, but hopefully couldn’t hit, say, their VPN traffic.


  • F5 said a “sophisticated” threat group working for an undisclosed nation-state government had surreptitiously and persistently dwelled in its network over a “long-term.” Security researchers who have responded to similar intrusions in the past took the language to mean the hackers were inside the F5 network for years.

    This could be really bad. F5 produces WAN accelerators, and one feature that those can have is storing X.509 certificates and associated private keys used by corporate internal CAs — things that normally, you’d keep pretty damned secure — to basically “legitimately” perform MITM attacks on traffic internal to corporate networks as part of their normal mode of operation.

    Like, if an attacker could compromise F5 Networks and get a malicious software update pushed out to WAN accelerators in the field to exfiltrate critical private keys from companies, that could be bad. You could potentially MITM their corporate VPNs, and if you got inside a customer’s network, it’d probably let you get past a lot of their internal security.

    kagis

    Yeah, it sounds like that is exactly what they hit. The “BIG-IP” stuff apparently does this:

    During that time, F5 said, the hackers took control of the network segment the company uses to create and distribute updates for BIG IP, a line of server appliances that F5 says is used by 48 of the world’s top 50 corporations

    https://techdocs.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/ltm-implementations-11-5-1/10.html

    MyF5 Home > Knowledge Centers > BIG-IP LTM > BIG-IP Local Traffic Manager: Implementations > Managing Client and Server HTTPS Traffic using a Self-signed Certificate

    One of the ways to configure the BIG-IP system to manage SSL traffic is to enable both client-side and server-side SSL termination:

    • Client-side SSL termination makes it possible for the system to decrypt client requests before sending them on to a server, and encrypt server responses before sending them back to the client. This ensures that client-side HTTPS traffic is encrypted. In this case, you need to install only one SSL key/certificate pair on the BIG-IP system.
    • Server-side SSL termination makes it possible for the system to decrypt and then re-encrypt client requests before sending them on to a server. Server-side SSL termination also decrypts server responses and then re-encrypts them before sending them back to the client. This ensures security for both client- and server-side HTTPS traffic. In this case, you need to install two SSL key/certificate pairs on the BIG-IP system. The system uses the first certificate/key pair to authenticate the client, and uses the second pair to request authentication from the server.

    This implementation uses a self-signed certificate to authenticate HTTPS traffic.

    Well. That…definitely sucks.


  • For example, it’s not only illegal for someone to make and sell known illegal drugs, but it’s additionally illegal to make or sell anything that is not the specifically illegal drug but is analogous to it in terms of effect (and especially facets of chemical structure)

    Hmm. I’m not familiar with that as a legal doctrine.

    kagis

    At least in the US — and this may not be the case everywhere — it sounds like there’s a law that produces this, rather than a doctrine. So I don’t think that there’s a general legal doctrine that would automatically apply here.

    https://en.wikipedia.org/wiki/Federal_Analogue_Act

    The Federal Analogue Act, 21 U.S.C. § 813, is a section of the United States Controlled Substances Act passed in 1986 which allows any chemical “substantially similar” to a controlled substance listed in Schedule I or II to be treated as if it were listed in Schedule I, but only if intended for human consumption. These similar substances are often called designer drugs. The law’s broad reach has been used to successfully prosecute possession of chemicals openly sold as dietary supplements and naturally contained in foods (e.g., the possession of phenethylamine, a compound found in chocolate, has been successfully prosecuted based on its “substantial similarity” to the controlled substance methamphetamine).[1] The law’s constitutionality has been questioned by now Supreme Court Justice Neil Gorsuch[2] on the basis of Vagueness doctrine.

    But I guess it might be possible to pass a similar law for copyright.


  • If this is the cave I’m thinking of, it’s huge.

    kagis

    I don’t know if there is a “Phong Nha cave”, but the one I’m thinking of is in Phong Nha-Kẻ Bàng National Park, so I’m guessing that that’s it.

    https://en.wikipedia.org/wiki/Hang_Sơn_Đoòng

    Sơn Đoòng cave (Vietnamese: hang Sơn Đoòng, IPA: [haːŋ1 ʂɤːn1 ɗɔ̤ŋ2]), in Phong Nha-Kẻ Bàng National Park, Quảng Trị Province, Vietnam, is the world’s largest natural cave.[1]

    I think that the above image is from a sinkhole where the roof opens up partway through it.

    There’s some YouTube video I saw a while back that did flybys of 3D models.



  • So, the “don’t use copyrighted data in a training corpus” crowd probably isn’t going to win the IP argument. And I would be quite surprised if IP law changes to accommodate them.

    However, the “don’t generate and distribute infringing material” is a whole different story. IP holders are on pretty solid ground there. One thing that I am very certain that IP law is not going to permit is just passing copyrighted data into a model and then generating and distributing material that would otherwise be infringing. I understand that anime rightsholders often have something of a tradition of sometimes letting fan-created material slide, but if generative AI massively reduces the bar to creating content, I suspect that that is likely to change.

    Right now, you have generative AI companies saying — maybe legally plausibly — that they aren’t the liable ones if a user generates infringing material with their model.

    And while you can maybe go after someone who is outright generating and selling material that is infringing, something doesn’t have to be commercially sold to be infringing. Like, if LucasArts wants to block for-fun fan art of Luke and Leia and Han, they can do that.

    One issue is attribution. Like, generative AI companies are not lying when they say that there isn’t a great way to just “reverse” an output to determine which training corpus data contributed most to it.

    However, I am also very confident that it is possible to do better than they do today. From a purely black-box standpoint, one possibility would be, for example, to use TinEye-style fuzzy hashing of images, probably with a fuzzier hash than TinEye uses, and reverse-search a generated image to warn a user that they might be generating an image that would be derivative. That won’t solve all cases, especially if you move to 3D, with generative AI producing models (though then you could also maybe do computer vision and a TinEye-equivalent for 3D models).
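    For what it’s worth, the fuzzy-hash idea is simple to sketch. TinEye’s actual algorithm is proprietary; this is just the classic “average hash” over an image that’s assumed to already be downscaled to 8×8 grayscale:

```python
# Toy perceptual hash: each bit of the hash records whether a pixel is
# above the image's mean brightness. Near-duplicates land a few bits
# apart; unrelated images differ in roughly half of the 64 bits.

def average_hash(pixels: list[list[int]]) -> int:
    """Hash an 8x8 grayscale image (rows of 0-255 values) into 64 bits."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests a derivative."""
    return bin(a ^ b).count("1")
```

    A service could compare the hash of a freshly generated image against hashes of known copyrighted works and flag small distances, without ever needing to “reverse” the model itself.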

    Another complicating factor is that copyright only restricts distribution of derivative works. I can make my own, personal art of Leia all I want. What I can’t do is go distribute it. I think — though I don’t absolutely know what case law is like for this, especially internationally — that generating images on hardware at OpenAI or whatever and then having them move to me doesn’t count as distribution. Otherwise, software-as-a-service in general, stuff like Office 365, would have major restrictions on working with IP that locally-running software would not. Point is that I expect that it should be perfectly legal for me to go to an image generator and generate material as long as I do not subsequently redistribute it, even if it would be infringing had I done so. And the AI company involved has no way of knowing what I’m doing with the material that I’m generating. If they block me from making material with Leia, that’s an excessively-broad restriction.

    But IP holders are going to want a practical route to go after either the generative AI company producing the material that gets distributed, or the users generating infringing material and then distributing it. AI companies are probably going to say that it’s the users, and that’s probably correct. Problem is, from a rightsholder standpoint, yeah, they could go after the users before, but if it’s a lot cheaper and easier to create the material now, that presents them with practical problems. If every Tom, Dick, and Harry can go out and generate material, they’ve got a lot more moles to whack in their whack-a-mole game.

    And in that vein, an issue that I haven’t seen come up is what happens if generative AI companies start permitting deterministic generation of content – that is, where if I plug in the same inputs, I get the same outputs. Maybe they already do; I don’t know, since I run my generative AI stuff locally. But supposing you have a scenario like this:

    • I make a game called “Generic RPG”, which I sell.

    • I distribute — or sell — DLC for this game. This uses a remote, generative AI service to generate art for the game using a set of prompts sold as part of the DLC for that game. No art is distributed as part of the game. Let’s say I call that “Adventures A Long Time Ago In A Universe Far, Far Away” or something that doesn’t directly run afoul of LucasArts and creates enough distance. And let’s set aside trademark concerns, for the sake of discussion. And let’s say that the prompts are not, themselves, infringing on copyright (though I could imagine them doing so; let’s say that they’re sufficiently distant to avoid being derivative works).

    • Every user buys the DLC, and then on their computer, reconstitutes the images for the game. At least if done purely locally, this should be legal under case law — the GPL specifically depends on the fact that one can combine material locally to produce a derivative work as long as one does not then distribute it. Mods to (copyrighted) games can distribute just the deltas, producing a derivative work when the mod is applied, and that’s definitely legal.

    • One winds up with someone selling and distributing what is effectively a “Star Wars” game.

    Now, maybe training the model on images of Star Wars content so that it knows what Star Wars looks like isn’t, as a single step, creating an infringing work. Maybe distributing the model that knows about Star Wars isn’t infringement. Maybe the prompts being distributed designed to run against that model are not infringing. Maybe reconstituting the apparently-Star-Wars images in a deterministic fashion using SaaS to hardware that can run the model is not infringing. But if the net effect is equivalent to distributing an infringing work, my suspicion is that courts are going to be willing to create some kind of legal doctrine that restricts it, if they haven’t already.

    Now, this situation is kind of contrived, but I expect that people will do it, sooner or later, absent legal restrictions.
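    The “deterministic generation” premise in that scenario is easy to illustrate with a toy Python stand-in; `generate` and its prompt-as-seed scheme are invented for illustration, not how real image models are parameterized:

```python
import random

# Toy stand-in for deterministic generation: seeding the generator with
# the prompt makes the output a pure function of the inputs, so shipping
# the prompt is effectively shipping the output.

def generate(prompt: str, size: int = 8) -> list[int]:
    """Produce `size` pseudo-random bytes determined entirely by `prompt`."""
    rng = random.Random(prompt)  # same prompt -> same stream, every time
    return [rng.randrange(256) for _ in range(size)]
```

    Every buyer of the hypothetical DLC who runs the same prompt reconstitutes bit-identical output, which is what makes the “distribute the prompts, not the art” arrangement behave like distributing the art.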




  • Well, it’s easy enough to check, but I imagine that it does. Most countries don’t honestly have that many peacetime soldiers (though wartime is going to affect things). If you’re North Korea, maybe, as they have a very large military.

    kagis

    https://en.wikipedia.org/wiki/United_States_Armed_Forces

    In the US, even if you counted every single active-duty person in all of the armed forces as a “soldier” — which I’m sure is not actually the case — you’d have 1.3 million people.

    The smallest category in the chart I listed was office clerks, at 2.5 million.

    goes looking for North Korea

    https://en.wikipedia.org/wiki/Korean_People's_Army

    North Korea also has 1.3 million active-duty people in its military, but it has a far smaller population than the US does: 26 million instead of 340 million. So in North Korea, if you counted the uniformed services as one profession, it’d easily be a larger percentage of the population than any occupation listed above for the US. The largest category for the US — the home and health care aides — has 4 million people, so a bit over 1% of the population. The active-duty military would be 5% of the population for North Korea.
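    A quick back-of-the-envelope check of those percentages, using the figures cited above:

```python
# Population and headcount figures as cited in the comment above.
us_military, us_pop = 1_300_000, 340_000_000
nk_military, nk_pop = 1_300_000, 26_000_000
us_aides = 4_000_000  # largest US occupation category cited

print(f"US home/health aides: {us_aides / us_pop:.1%}")
print(f"US active-duty:       {us_military / us_pop:.1%}")
print(f"NK active-duty:       {nk_military / nk_pop:.1%}")
```

    That works out to roughly 1.2% for the largest US occupation, under 0.4% for the US active-duty military, and 5.0% for North Korea’s.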