It’s by a Chinese company, and it collects telemetry on its users via Umeng+, a Beijing-based analytics company. Even though it’s open source, the code is large enough that it’s hard to tell whether there is anything compromising in there from the Chinese government, and whether (or what) data collected by Umeng+ is making it to the Chinese government.
It’s unfortunate, because I really like the DE. It’s a real standout. If it were more trustworthy, it’d be my first choice.
Apparently you can download it from the AUR if you really want to.
So I guess the backdoor is buried DeepIn the code
I mean, a simple
grep -r "string" *
does wonders to find anything, but you need to know what you’re looking for. I’d probably look for DNS names that end in government or China-specific TLDs to start with.
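To make that concrete, here is a rough sketch of what such a search might look like. The function name `find_cn_hosts` is made up for illustration, and limiting the search to `.cn` (which also covers `.gov.cn`) is an assumption. Expect false positives, for example `example.cn.com` or an attribute access like `self.cn` in code.

```shell
# Hypothetical helper: list unique hostname-like strings ending in .cn
# (which also covers .gov.cn) anywhere under a directory.
# The TLD choice is an assumption; this is a noisy first pass, not an audit.
find_cn_hosts() {
    # -r recurse, -h omit file names, -o print only the matched text,
    # -I skip binary files, -E use extended regular expressions
    grep -rhoIE '([A-Za-z0-9-]+\.)+cn\b' "${1:-.}" | sort -u
}
```

Calling `find_cn_hosts path/to/source` prints one candidate hostname per line. It only sees names that appear as plain text in the tree.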
grep -r "evil spyware" *
nothing? awesome, I guess this software is safe to use. Let’s gooo
I know you’re being facetious here, and I’m ignorant of actual application security methodology. I do have to ask though: when you are looking for something in code that could be a security risk, isn’t it possible to look for methods or functions used to look up DNS, make outbound network calls, or even libraries used to obfuscate code? It seems to me that most programmers wouldn’t go to great lengths to obfuscate their code and would want it to be readable and maintainable, so doing so would be a red flag.
Obviously no one is going to search for “evil spyware” when auditing code. Your point stands: it is not as simple as that.
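As a rough illustration of the "look for network calls" idea, a first triage pass could be a `grep` over a handful of well-known API names. The function name `find_network_calls` and the pattern list are assumptions picked for the example (libc resolver calls, `connect(`, the libcurl `curl_easy_` prefix, Qt’s `QNetworkAccessManager`, and literal URLs). It is nowhere near exhaustive.

```shell
# Hypothetical triage helper: print file:line:text for every line that
# mentions a common networking API or a literal http(s) URL.
# The pattern list is an assumption and will miss plenty.
find_network_calls() {
    # -r recurse, -n show line numbers, -I skip binaries, -E extended regex
    grep -rnIE 'getaddrinfo|gethostbyname|connect\(|curl_easy_|QNetworkAccessManager|https?://' "${1:-.}"
}
```

Every hit still has to be read and judged by a person, and anything that happens inside a dependency outside the searched tree will not show up at all.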
You’re totally right, I just think you underestimate how long it takes to rigorously audit a whole codebase. Let’s say you look for outbound network calls. Now you need to figure out, for each of them, whether it is malicious, which requires specific domain knowledge. And that’s assuming you find them all: a network call could be hidden away in a dependency. None of this is impossible, but it requires serious effort.
It’s trivial to break that approach by obfuscating strings. You can do things like using base64-encoded strings in the source code, building strings from smaller component parts, or using rot13 on, say, the host component of a URI. That last one could be pretty interesting if you, as a threat actor, owned both permutations. The hostname (minus TLD) in the source code could be the nice, human-readable version (www.happysite.org) that appears to be something legit. Then, when you rot13 it to www.uncclfvgr.org, traffic is sent to the evil site doing scary things. People can be far more tricksy than that. There’s also the whole issue of whether the binaries you’re running actually match the code in the repo. The xz kerfuffle showed how much can be hidden that way.
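A small sketch of those three tricks, just to show how little it takes. Everything here is illustrative: `rot13` is a helper defined on the spot, the hostnames are the made-up ones from above, and `base64 -d` assumes the GNU coreutils spelling of the decode flag.

```shell
# rot13 with tr: rotate each ASCII letter 13 places
# (applying it twice gives back the original text).
rot13() {
    printf '%s' "$1" | tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

# 1. rot13 on the host label: the source only ever mentions "happysite".
label='happysite'
printf 'www.%s.org\n' "$(rot13 "$label")"   # prints www.uncclfvgr.org

# 2. Building the string from parts: no single literal holds the full name.
a='unccl'; b='fvgr'
printf 'www.%s%s.org\n' "$a" "$b"

# 3. base64: real source would carry only the opaque blob; the plain name
#    appears here solely to generate that blob for the demo.
blob=$(printf '%s' 'www.uncclfvgr.org' | base64)
printf '%s' "$blob" | base64 -d; echo
```

In all three cases a `grep` for the real hostname over the shipped source would come up empty, even though the program ends up talking to it.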
EDIT: I should make it clear that I don’t use Deepin or the DE it provides because I only use WMs with no desktop, so the distro and DE are of no interest to me. I don’t know if it’s a security hazard or not, I have no horse in this fight.
There are so many ways to obfuscate things that your approach won’t work.