• solrize@lemmy.ml · +4/−8 · 2 days ago

    If there’s a vulnerability in the codec, then someone can slip a malicious file onto some web site and use it as an exploit. It’s not only about some 30 year old game. It might be appropriate for ffmpeg to get rid of such obscure codecs, or sandbox them somehow so RCE’s can’t escape from them, even at an efficiency cost. Yes though, Google funding or even a Summer of Code sponsorship would be great.

    • moonpiedumplings@programming.dev · +9 · 1 day ago

      It might be appropriate for ffmpeg to get rid of such obscure codecs

      This is why compilation flags exist. You can compile software without certain features, and the corresponding code is removed, decreasing the attack surface. But it’s not really ffmpeg’s job to tell you which compilation flags to pick; that is the responsibility of the people integrating and deploying it into their systems (Google).
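
      For instance, ffmpeg’s own ./configure script exposes per-component flags, so an integrator can build a stripped binary containing only what their product actually needs. The flag names below are real configure options; the particular codec selection is just an illustration:

```sh
# Disable every component, then opt back in to exactly what is needed.
./configure \
    --disable-everything \
    --disable-network \
    --enable-decoder=h264 \
    --enable-decoder=aac \
    --enable-demuxer=mov \
    --enable-protocol=file
```

      Everything not explicitly enabled is simply absent from the resulting binary, so a vulnerability in an obscure codec can’t be reached at all.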

      Sandbox them somehow so RCE’s can’t escape from them, even at an efficiency cost

      This is similar to the above. It’s not really ffmpeg’s job to pick a sandboxing software (docker, seccomp, selinux, k8s, borg, gvisor, kata), but instead the responsibility of the people integrating and deploying the software.

      That’s why it’s irritating when these companies whine about stuff that should be handled by the above two practices, demanding immediate fixes via their security programs. Half of our frustration is them asking volunteers to promptly fix CVE’s with a score below 6 (while simultaneously being willing to contribute fixes, or pay, for higher-scoring CVE’s under their bug bounty programs). This is a very important thing to note. In further comments, you seem to be misunderstanding the relationship Google and ffmpeg have here: Google’s (and other companies’) security program is to apply pressure to fix the vulnerabilities promptly. That is not the same thing as “here’s a bug, fix it at your leisure”. Dealing with this pressure is tiring and burns maintainers out.

      The other half is when they reveal that their security practices aren’t up to par when they whine about stuff like this and demand immediate fixes. I mean, it says it in the article:

      Thus, as Mark Atwood, an open source policy expert, pointed out on Twitter, he had to keep telling Amazon to not do things that would mess up FFmpeg because, he had to keep explaining to his bosses that “They are not a vendor, there is no NDA, we have no leverage, your VP has refused to help fund them, and they could kill three major product lines tomorrow with an email. So, stop, and listen to me … ”

      Anyway, the CVE being mentioned has been fixed, if you dig into it: https://xcancel.com/FFmpeg/status/1984178359354483058#m

      But it really should have been fixed by Google, since they brought it up, because there is no real guarantee that volunteers will fix such things in the future, and burnt-out volunteers will just quit instead. libxml2 decided to straight up stop doing responsible disclosure because its maintainers got tired of people asking them to fix vulnerabilities with free labor; all security issues are now filed as ordinary bug reports that get fixed when the maintainers have the time.

      The other problem is that the report was AI-generated, and part of the issue here is that ffmpeg (and curl, and a few other projects) have been swamped with false positives. These AIs generate security reports that look plausible, maybe even with a non-working PoC. This wastes a ton of volunteer time, because maintainers have to sift through these reports and figure out what’s real and what is not.

      So of course ffmpeg is not going to prioritize the 6.0 CVE when they are swamped with all of these potentially real “9.0 UlTrA BaD CrItIcAl cVe” reports and have to figure out whether any of those are real before doing any work on them.

      • solrize@lemmy.ml · +1 · 16 hours ago (edited)

        This is why compilation flags exist. You can compile software to not include features, and the code is removed, decreasing the attack surface. But it’s not really ffmpegs job to tell you which compilation flags you should pick, that is the responsibility of the people integrating and deploying it into the systems (Google).

        ffmpeg in fact comes with a default build configuration that excludes a bunch of modules for various reasons, some for license-incompatibility issues and some because they’re not considered production grade (they’re shipped for testing, etc.). It’s nuts to suggest continuing to ship something with known vulnerabilities without, at minimum, removing it from the default build and labelling it as having known issues. If you don’t have the resources to fix the bug, that’s understandable, but own up to it and tell people to be careful with that module.

        AI generated

        AI tools were apparently used for locating the bugs but the reports were real and legit.

        But it really should have been fixed by Google, since they brought it up.

        It would be great if Google could fix it, but ffmpeg is very hard to work in, not just because of the code organization but because of the very specialized knowledge needed to mess around inside a codec. It would be simpler and probably better for Google to contribute development funding since they depend on the software so heavily.

        You might remember libav, a now long-dead ffmpeg fork from some years back. It was well intentioned and had good programmers involved, but it just couldn’t handle the technical demands of developing something like ffmpeg. ffmpeg is a messed-up project in some ways, but it’s extremely impressive. Being able to find bugs (say, by fuzzing) is very different from being able to fix them sanely. If you’re Google, with effectively infinite hardware, you can do more fuzzing than anyone else, and that’s valuable to any project with this type of exposure.

        It’s not really ffmpeg’s job to pick a sandboxing software (docker, seccomp, selinux, k8s, borg, gvisor, kata),

        Those approaches would be ridiculous bloat; the idea is just to supply some kind of wrapper that runs the codec in a chrooted separate process communicating through pipes, under ptrace control or however that’s done these days.
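
        A minimal sketch of that wrapper shape in Python (hypothetical, not anything ffmpeg actually ships): the untrusted decoder runs as a separate process behind pipes, with hard rlimits applied before exec. A production wrapper would add seccomp/chroot and privilege dropping on top of this.

```python
import resource
import subprocess

def run_decoder_sandboxed(argv, data, mem_bytes=512 * 1024 * 1024, cpu_secs=10):
    """Run an untrusted decoder in a separate process, feeding it input and
    reading its output through pipes, with hard resource limits applied in
    the child before exec. This only shows the process-isolation shape."""
    def apply_limits():
        # Cap address space and CPU time so a runaway decoder dies
        # instead of taking the host process with it.
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_secs, cpu_secs))

    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        preexec_fn=apply_limits,  # POSIX only
    )
    out, _ = proc.communicate(data, timeout=cpu_secs + 5)
    if proc.returncode != 0:
        raise RuntimeError(f"decoder exited with {proc.returncode}")
    return out
```

        The parent only ever touches bytes that come back through the pipe, so a compromised decoder is contained to the child process (modulo kernel bugs).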

        but instead the responsibility of the people integrating and deploying the software.

        The ffmpeg CLI tool is in fact an integration of the software and should probably use the wrapper, at least for suspect modules. Note that all major web browsers already do something like this and have done it for years, for exactly this reason.

        • moonpiedumplings@programming.dev · +1 · 15 hours ago

          AI tools were apparently used for locating the bugs but the reports were real and legit.

          Yes, but the ffmpeg developers do not know this until after they triage all the bug reports they are getting swamped with. If Google really wants a fix for their 6.0 CVE immediately (because, again, part of the problem here was Google’s security team breathing down the necks of the maintainers), then Google can submit a fix. Until then, ffmpeg devs have to work on figuring out whether the more critical-looking issues they receive are actually critical.

          It’s nuts to suggest continuing to ship something with known vulnerabilities without, at minimum,

          Again, the problem is false positive vulnerabilities. “9.0 CVE’s” (that are potentially real) must be triaged before Google’s 6.0 CVE.

          It would be great if Google could fix it, but ffmpeg is very hard to work in, not just because of the code organization but because of the very specialized knowledge needed to mess around inside a codec. It would be simpler and probably better for Google to contribute development funding since they depend on the software so heavily.

          Except Google does fix these issues and contribute funding. Summer of Code, bug bounties, and other programs run by Google contribute both funding and fixes to these projects. We are mad because Google has paid for more critical issues in the past, but all of a sudden they are demanding free labor from swamped volunteers for medium-severity security issues.

          Being able to find bugs (say by fuzzing

          Fuzzing is great! But Google’s Big Sleep project is GenAI-based. Fuzzing is part of the process, but its inputs and outputs are not significantly distinct from the other GenAI reports that ffmpeg receives.

          Those approaches would be ridiculous bloat, the idea is just supply some kind of wrapper that runs the codec in a chrooted separate process communicating through pipes under ptrace control or however that’s done these days.

          Chroot only works on Linux/Unix and requires root, so it doesn’t work in rootless environments. Every single sandboxing tool comes with some form of tradeoff, and it’s not ffmpeg’s responsibility to make those decisions for you or your organization.

          Anyway, sandboxing on Linux is basically broken when it comes to high-value targets like Google. I don’t want to go into detail, but I would recommend reading madaidan’s insecurities (I mentioned gVisor earlier because gVisor is Google’s own answer to the flaws in existing Linux sandboxing solutions). Another problem is that ffmpeg people probably care about performance a lot more than security. They made that tradeoff, and if you want to undo it, it’s not their job to make that decision for you. It’s not a binary but a sliding scale, and “secure enough for Google” is not the same as “secure enough for the average desktop user”.

          I saw earlier that you mentioned Google keeping vulnerabilities secret and using them against people, or something like that, but it just doesn’t work that way lmao. Google is such a large, high-value organization that they essentially have to treat every employee as a potential threat, so “keeping vulns internal” doesn’t really work. Trying to keep a vulnerability internal will 100% result in it getting leaked and then used against them.

          It’s nuts to suggest continuing to ship something with known vulnerabilities without, at minimum, removing it from the default build and labelling it as having known issues. If you don’t have the resources to fix the bug that’s understandable, but own up to it and tell people to be careful with that module.

          You have no fucking clue how modern software development and deployment works. Getting rid of all CVE’s is actually insanely hard, something only orgs like Google can reasonably attempt, and even Google regularly falls short. The vast majority of organizations and institutions have given up on eliminating CVE’s from the products they use. “Don’t ship software with vulnerabilities” sounds good in a vacuum, but the reality is that most people settle for something secure enough for their risk level. I bet that if you go through any piece of software on your system right now, you can find CVE’s in it.

          You don’t need to outrun a hungry bear, you just need to outrun the person next to you. Cybersecurity is about risk management, not risk elimination. You can’t afford risk elimination.

          • solrize@lemmy.ml · +1 · 11 hours ago (edited)

            Yes, but the FFMPEG developers do not know this until after they triage all the bug reports they are getting swamped with.

            With a concrete bug report like “using codec xyz and input file f3 10 4d 26 f5 0a a1 7e cd 3a 41 6c 36 66 21 d8… ffmpeg crashes with an OOB memory error”, it’s pretty simple to confirm that such a crash happens. The hard part is finding the cause and fixing it. I had understood the bug search to be fuzzing controlled by AI, so I referred to it as fuzzing. Apparently the AI is also writing the bug report now, so yeah, ok, maybe there is potential slop there.

            “Don’t ship software with vulnerabilities” sounds good in a vacuum,

            I said KNOWN vulnerabilities. Make it known vulnerabilities without known mitigations if you prefer.

            I wrote a few of those GNU coreutils that the Rusties are now rewriting. I don’t remember hearing of any CVE’s connected with any of them, though that is mostly because they are uncomplicated.

            Here’s all the Debian security advisories for the past year or so. There aren’t THAT many, and they are mostly in complicated network programs, the Linux kernel, etc. Also a lot aren’t actual vulns: https://www.debian.org/security/

            This was the first search hit about ffmpeg CVE’s, from June 2024, so not about the current incident. It lists four CVE’s: three memory errors (buffer overflow, use-after-free) and one off-by-one error. The class of errors in the first three is supposedly completely eliminated by Rust; no idea about the fourth. I’m not claiming a Rust reimplementation of ffmpeg is anywhere near feasible. Dunno if the current set of CVE’s is comparable, but it’s a likely guess. Anyway, as SPJ likes to say about Haskell’s type system, the idea is to stop fixing bugs one by one and instead eliminate entire classes of bugs. We can’t fix everything, but we can certainly do better than we are doing now.

            I saw earlier you mentioned google keeping vulnerabilities secret, and using them against people or something like that,

            That was listed as an example of what not to do, not a proposal of an approach to take.

            • moonpiedumplings@programming.dev · +1 · 10 hours ago

              With a concrete bug report like “using codec xyz and input file f3 10 4d 26 f5 0a a1 7e cd 3a 41 6c 36 66 21 d8… ffmpeg crashes with an oob memory error”, it’s pretty simple to confirm that such a crash happens

              Google’s Big Sleep was pretty good: it gave a Python program that generated an invalid file. It looked plausible, and it was a real issue. The problem is that literally every other generative-AI bug report also looks equally plausible. As I mentioned before, curl is having a similar issue.

              And here’s what the lead maintainer of curl has to say:

              Stenberg said the amount of time it takes project maintainers to triage each AI-assisted vulnerability report made via HackerOne, only for them to be deemed invalid, is tantamount to a DDoS attack on the project.

              So you can claim testing may be simple, but it looks like that isn’t the case. I would say one of the problems is that all these people are volunteers, so they probably have a very, very limited amount of time to spend on these projects.

              This was the first search hit about ffmpeg cve’s, from June 2024 so not about the current incident. It lists four CVE’s, three of them memory errors (buffer overflow, use-after-free), and one off-by-one error. The class of errors in the first three is supposedly completely eliminated by Rust.

              FFmpeg is not just C code, but also large portions of hand-written, ultra-optimized assembly (per architecture, too…). You are free to rewrite it in Rust if you so desire, but I stated it above and will state it again: ffmpeg made the tradeoff of performance over security. Rust currently isn’t as performant as optimized C, and I highly doubt that even unsafe Rust can beat hand-optimized assembly (C can’t, anyway).

              (Google and many big tech companies like ultra-performant projects because performance equals power savings equals cost savings at scale. But this means weaker security when it comes to projects like ffmpeg…)

              • solrize@lemmy.ml · +1 · 10 hours ago (edited)

                Have any of the google-submitted vulnerability reports turned out to be invalid? Project Zero was pretty well regarded.

                Yes, I know about the asm code in ffmpeg, though IDK if it’s doing anything that could lead to a use-after-free error. I can understand if an OOB reference happens in the asm code, since codecs are full of lookup tables and have to jump around inside video frames for motion estimation, but I’d hope no dynamic memory allocation is happening there. Instead it would be more like a GPU kernel. But I haven’t examined any of it.

                Anyway there’s a big difference between submitting concrete input data that causes an observable crash, and sending a pile of useless spew from a static analyzer and saying “here, have fun”. The Curl guy was getting a lot of absolute crap submitted as reports.

                From the GCC manual “bug criteria” section:

                If the compiler gets a fatal signal, for any input whatever, that is a compiler bug. Reliable compilers never crash.

                That sounds like timelessly good advice to me.

                • moonpiedumplings@programming.dev · +1 · 9 hours ago

                  Project Zero

                  Project Zero was entirely humans though, no GenAI. Big Sleep has been reliable so far, but there is no real reason for ffmpeg developers to value Big Sleep’s 6.0 CVE’s over potentially real, more critical CVE’s. The problem is that Google’s security team would still be breathing down the necks of these developers, demanding fixes for the vulns they submitted, which is kinda BS when they aren’t chipping in at all.

                  Anyway there’s a big difference between submitting concrete input data that causes an observable crash, and sending a pile of useless spew from a static analyzer and saying “here, have fun”

                  Nah, the actually fake bug reports also often have fake “test cases”. That’s what makes the LLM generated bug reports so difficult to deal with.

                  • solrize@lemmy.ml · +1 · 9 hours ago

                    6.0 is pretty serious according to the rubric. Are there some worse ones? Yes, Google is acting obnoxious per your description. It makes no sense to me that they are balking at supplying some funds; they used to be fairly forthcoming with such support.

                    I can imagine a CI system for bug reports where you put in the test case and it gets run under the real software to confirm whether an error results, if one has been claimed. No error => invalid test case.
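
                    A sketch of that harness (hypothetical, assuming a POSIX system where a signal-killed child shows up as a negative return code):

```python
import subprocess

def triage_poc(argv, timeout=30):
    """Run a reported proof-of-concept command under the real software.

    On POSIX, a process killed by a signal gets a negative returncode
    (e.g. -11 for SIGSEGV, -6 for SIGABRT), which is what a genuine
    memory-error crash looks like. A clean or ordinary-error exit means
    the claimed crash did not reproduce, so the report can be
    auto-flagged as invalid without a human ever looking at it.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "hang"  # not the claimed crash, but worth a second look
    return "crash" if proc.returncode < 0 else "no-crash"
```

                    A report claiming “ffmpeg crashes on this input” would then be auto-checked with something like `triage_poc(["ffmpeg", "-i", "poc.bin", "-f", "null", "-"])`, and only confirmed crashes reach a maintainer.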

    • TehPers@beehaw.org · +22 · 2 days ago (edited)

      The issue is not whether security issues exist in ffmpeg. It’s clear that vulnerabilities need to be fixed.

      The issue is with who actually fixes them. Your last sentence is the core of it. Google can submit as many bug reports as they want, but they better be willing to ensure the bugs get fixed too.

      • Midnitte@beehaw.org · +22 · 2 days ago

        If it’s a mission critical library, then the corporations should be willing to shell out money to ensure critical bugs are fixed.

        Google can’t have their cake and eat it too.

      • solrize@lemmy.ml · +6/−1 · 2 days ago

        Google, having found the bugs, can either submit bug reports, quietly sit on them, or even exploit them as spyware, among other options. Whether they fund ffmpeg is a completely separate question. I can see how the 90-day disclosure window can be a problem if the number of reports is high.

        • TehPers@beehaw.org · +10 · 2 days ago

          Bug reports that apply only to Google’s services or which surface only because of them are bugs Google needs to fix. They can and do submit bug reports all they want. Nobody is obligated to fix them.

          The other part of this is, of course, disclosure. Google’s disclosure of these bugs discredits the ffmpeg developers and puts the blame on them if they fail to fix the vulnerabilities. Google can acknowledge the project as a volunteer hobby project created by others and treat it as such, but if they do, they should not be putting responsibilities on its maintainers.

          If Google wants to use ffmpeg, they can. But a bug in ffmpeg that affects Google’s services is a bug in Google’s service. It is not the responsibility of unpaid volunteers to maintain their services for them.

          • solrize@lemmy.ml · +3/−2 · 1 day ago (edited)

            I don’t understand how a bug is supposed to know whether it’s triggered inside or outside of a Google service. If the bug can only be triggered in some weird, Google-specific deployment, that’s one thing, but I don’t think that’s what we’re talking about here. If the bug is manifestly present in ffmpeg and it’s discovered at Google, what are you saying is supposed to happen? Google should a) report it under the normal 90-day disclosure rule; b) report it but let it stay undisclosed for longer than normal, given the resource constraints ffmpeg’s devs are under; or c) not report it and let some attacker exploit it? (b) might have some merit, but (c) is insane. Once some bad actor finds the bug (through independent discovery or any other way), it’s going to be exploited. That might already be happening before Google even finds the bug.

            FFmpeg’s codebase and dev community are both exceptionally difficult and that is not helping matters, I’m sure.

            There are a bunch of Rust zealots busily rewriting GNU Coreutils which in practice have been quite reliable and not that badly in need of rewriting. Maybe the zealots should turn their attention to ffmpeg (a bug minefield of long renown) instead.

            Alternatively (or in addition), some effort should go into sandboxing ffmpeg so its bugs can be contained.

            • TehPers@beehaw.org · +6 · 1 day ago (edited)

              I don’t understand how a bug is supposed to know whether it’s triggered inside or outside of a google service.

              Who found the bug, and what triggered it? Does it affect all users, or does it only affect one specific service that uses it in one specific way due to a weird, obscure set of preconditions or extraordinarily uncommon environment configuration?

              Most security vulnerabilities in projects this heavily used are hyper obscure.

              If the bug is manifestly present in ffmpeg and it’s discovered at google, what are you saying is supposed to happen?

              e) Report it with the usual 90 day disclosure rule, then fix the bug, or at least reduce the burden as much as possible on those who do need to fix it.

              Google is the one with the vulnerable service. ffmpeg itself is a tool, but the vast majority of end users don’t use it directly, therefore the ffmpeg devs are not the ones directly (or possibly at all) affected by the bug.

              There are a bunch of Rust zealots busily rewriting GNU Coreutils which in practice have been quite reliable and not that badly in need of rewriting. Maybe the zealots should turn their attention to ffmpeg (a bug minefield of long renown) instead.

              This is weirdly offtopic, a gross misrepresentation of what they are doing, and horribly dismissive of the fact that every single person being discussed who is doing the real work is not being paid support fees by Google. Do not dictate what they should do with their time until you enter a contract with them. Until that point, what they do is none of your business.

              Alternatively (or in addition), some effort should go into sandboxing ffmpeg so its bugs can be contained.

              And who will do this effort?