I wanna know if MATRIX recipients know my IP, and more globally what the recipients know about me (how the matrix protocol works). THX
So why are some nerds calling Matrix a metadata disaster?
Because there is a lot more metadata than just IP addresses.
Because encryption doesn’t work for rooms over 50 people, so any room over that size is public by default. And most of the usage is the Matrix.org home server.
Even if I selfhost?
If you self-host, it’s better, but it’s still not great. The people would then know the IP address of your server that you were hosting it on, so you’d have to make sure it was a VPS and not done from home.
You could also put it behind a cloudflare proxy subdomain, right? That way it looks like the origin ip comes from cloudflare
Ugh, Yes, you could. But, Cloudflare.
What about using a normal, non-Cloudflare VPS for this?
By public you mean non-encrypted? How does that work? When you create a room, you default to encryption, and there is only one participant (the room creator). And you cannot turn off encryption, so what then happens when you get 51 participants?
Also existing non-encrypted rooms are never automatically switched to encryption, so the switch must be explicit. Does it refuse to do it if there are more than 50 participants?
I’ve never heard of this limit, nor was I able to find info about it (so a link would be great), but there could be some factor that increases problems as the number of people increases… Perhaps 50 is some practical suggestion for the maximum number of people to have in encrypted sessions?
I got the 50 from this video.
https://iv.datura.network/watch?v=W8KEuAEYjQ4
Thanks!
The mention was at about 12:06, in the form that OLM breaks down at about 50 users “give or take”, so it’s not really a limitation imposed by the system itself, and it would be difficult to impose one. I doubt this is the experience of all Matrix e2ee users, at least at that exact point, but e2ee has always had some growing pains, so there could be people with those issues; on the other hand, few large rooms are e2ee to begin with, so experience with those is limited. E2ee also requires users to be more mindful about their data, as in not losing their private keys, and these problems probably increase linearly as the room size increases.
I didn’t notice any claim of rooms larger than 50 becoming public.
I’ve only heard second-hand info about it, but apparently one local political party uses e2ee in Matrix with hundreds of people in the room, so that should be proof that the encryption is not limited to 50 users, and this info sounds just as well founded as the information provided by the video ;).
The guy carries on stating that pretty much all of the huge matrix rooms are not end-to-end-encrypted, and I have no reason to doubt that. Personally I see little point in having such large rooms encrypted anyway, because if you have a large room you will also likely have very relaxed checks on who gets to enter it (e.g. it could be completely public), and if that’s the case, then so can any party who wishes to monitor the room join the room as well. E2ee won’t be protecting those cases. (While at the same time you lose server-side search feature and efficient notifications, though at least the latter one is being fixed with out-of-envelope notification data—which again leaks a bit more metadata…)
The video also makes it sound like if you have a Matrix Home Server in the network, it’s going to end up hosting CSAM. This is only the case if one of the users of that HS is in a room that has the content, so it’s not like it will just automatically get migrated there. I imagine the vast majority of Matrix Home Servers have limited account creation abilities (e.g. companies, personal home servers, organizations, etc.), eliminating or at least highly discouraging this kind of issue.
Btw, the video makes an excellent point about the Matrix CDN issue, which is currently being fixed as well (that change is already merged into the Matrix spec) by requiring authentication. The next step is going to be associating media with messages, making this kind of thing even stricter. All this means IRC bridges will need to start hosting Matrix-side content by themselves, though…
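For anyone wondering what “requiring authentication” changes in practice, here is a rough sketch of how a client might turn an `mxc://` URI into a download URL before and after that change. The endpoint paths are written from my reading of the spec change, and the homeserver URL, function name, and media ID are made up for illustration, so double-check against the spec before relying on it.

```python
def media_urls(mxc_uri, homeserver="https://matrix.example.org"):
    """Build the legacy and the authenticated download URL for an mxc:// URI.

    `homeserver` is a placeholder base URL, not a real server.
    """
    prefix = "mxc://"
    if not mxc_uri.startswith(prefix):
        raise ValueError("not an mxc:// URI")
    server_name, _, media_id = mxc_uri[len(prefix):].partition("/")
    if not server_name or not media_id:
        raise ValueError("mxc URI needs a server name and a media ID")
    # Legacy endpoint: anyone who knows the URL could fetch the file.
    legacy = f"{homeserver}/_matrix/media/v3/download/{server_name}/{media_id}"
    # Newer endpoint: the request must carry the user's access token,
    # e.g. in an "Authorization: Bearer <token>" header.
    authed = f"{homeserver}/_matrix/client/v1/media/download/{server_name}/{media_id}"
    return legacy, authed


legacy, authed = media_urls("mxc://example.org/abc123")
print(legacy)
print(authed)
```

The URL shape barely changes; the difference is that the second one is supposed to be refused without a valid token, which is why bridges can no longer just hand out plain links.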
Because whatever server you’re registered on or communicating with has ALL the metadata…
Human behavior is funny, isn’t it? No matter what the topic, there are always people around who like to repeat criticism they heard from someone else, even if it’s so vague as to be useless (“metadata disaster”) or they don’t understand the details at all.
It’s not a disaster. A few minor bits of metadata (avatars and reactions, IIRC) haven’t been moved into the encrypted part of the protocol yet. If that’s a problem for your use case, then you might want to choose a platform with different flaws, or simply avoid those features. It’s already good enough for the needs of many privacy-minded folks, though, and it continues to get better.
There is a lot more metadata than just avatars and reactions. Accounts and their room membership over time, timing of messages (and thus online times), individual interactions between specific users (based on the timing of their messages) and so on. That is all in the unencrypted metadata of a Matrix room and can’t be moved to the encrypted message part like avatars and reactions.
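To make that concrete, here is a rough sketch of which parts of an encrypted room event stay readable to a homeserver. The event is invented for illustration (IDs, timestamp, and ciphertext are placeholders), and the field names only approximate the general shape of an `m.room.encrypted` event, so treat it as an illustration rather than a spec reference.

```python
# Hypothetical encrypted event as a homeserver might store it.
# All values are invented placeholders.
event = {
    "type": "m.room.encrypted",
    "room_id": "!exampleroom:example.org",
    "sender": "@alice:example.org",
    "origin_server_ts": 1700000000000,
    "event_id": "$example_event_id",
    "content": {
        "algorithm": "m.megolm.v1.aes-sha2",
        "session_id": "placeholder-session",
        "ciphertext": "placeholder-ciphertext",
    },
}


def split_visibility(ev):
    """Separate what the server can read from what is encrypted."""
    content = dict(ev["content"])
    # Only the ciphertext is opaque to the server.
    hidden = {"ciphertext": content.pop("ciphertext")}
    # Everything else (who, where, when) is stored in the clear.
    visible = {k: v for k, v in ev.items() if k != "content"}
    visible["content"] = content
    return visible, hidden


visible, hidden = split_visibility(event)
for key in ("sender", "room_id", "origin_server_ts"):
    print(key, "=", visible[key])
```

Sender, room, and timestamp are exactly the fields needed to reconstruct membership and activity patterns, and they have to stay outside the ciphertext for the server to route the event.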
The network layer of all internet servers reveals almost everything you listed. Signal has the same problem, and there’s nothing they can do about that. The only way to avoid it is to use a completely peer-to-peer model (Matrix has started work on this, btw) and avoid communicating across network routes that can be monitored.
There might be one exception, depending on what you mean by “Accounts”: The user IDs participating in a room can be seen by server operators and room members. But then again, server operators can already see their users’ IP addresses (which is arguably more sensitive than a user ID), and I believe room members have to be allowed into the room in order to see them. For most of us, that’s fine. Far from a disaster.
No, because Matrix stores all this info and gives it freely to other servers retroactively(!). Also, with network layer sniffing (which is much harder to do anyway) you can only see which homeserver talked to which other homeserver and which clients talked to their homeserver. If you have the full room metadata, you can easily build a social graph of which account talked to whom, when, and where.
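A minimal sketch of that kind of analysis, assuming all you have is a sender and a timestamp per event (no message content at all). The events, the 60-second window, and the function name `reply_graph` are made up for illustration; real traffic analysis would be more involved.

```python
from collections import Counter

# Hypothetical (sender, timestamp in seconds) pairs from one room's metadata.
events = [
    ("@alice:a.example", 0),
    ("@bob:b.example", 20),
    ("@alice:a.example", 45),
    ("@carol:c.example", 4000),
    ("@bob:b.example", 4030),
]


def reply_graph(evts, window=60):
    """Count likely interactions: consecutive messages from different
    senders that arrive within `window` seconds of each other."""
    edges = Counter()
    ordered = sorted(evts, key=lambda e: e[1])
    for (prev_sender, prev_ts), (sender, ts) in zip(ordered, ordered[1:]):
        if sender != prev_sender and ts - prev_ts <= window:
            edges[frozenset((prev_sender, sender))] += 1
    return edges


graph = reply_graph(events)
for pair, count in graph.items():
    print(sorted(pair), count)
```

Even this crude heuristic links alice with bob and bob with carol from timing alone, which is the point about room metadata being enough for a social graph.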
Can you show me the part of the spec that allows a server with no room members to get private room info from another server? I’m skeptical, but if true, I believe that would be worth reporting as a bug.
You’re funny.
Obviously you need someone joining the room for the room metadata to be shared between homeservers. But that is really only a minor barrier, and once that has happened the worst-case scenario takes place immediately. On other messengers (federated or not) a newly joining member has very limited access to past room metadata. Not so with Matrix, where a joining homeserver gets full retroactive access to all the room metadata since the room’s creation. If you can’t see the problem with that, you really need to stop privacy LARPing 🙄
Well then, your assertion that Matrix gives it freely is false.
This is false, too. Historical event visibility is controlled by a room setting. (And if you don’t trust admins of a sensitive room to configure for privacy, then you’re going to have bigger problems, no matter what platform it’s on.)
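For reference, the room setting in question is the `m.room.history_visibility` state event, which takes one of `world_readable`, `shared`, `invited`, or `joined`. Below is a simplified sketch of the per-user rule for someone who is currently a joined member. The function name and the plain integer timestamps are assumptions for illustration, and it only models what a member gets to read, not what federating servers store.

```python
def can_see(visibility, joined_at, invited_at, event_ts):
    """Simplified check: may a currently joined member read an event?

    visibility: value of the room's history_visibility setting.
    joined_at / invited_at: when the member joined / was invited
    (invited_at may be None if they joined without an invite).
    event_ts: when the event was sent.
    """
    if visibility in ("world_readable", "shared"):
        # Members can read the history from before they arrived.
        return True
    if visibility == "invited":
        # History is readable from the point of invitation onward.
        start = invited_at if invited_at is not None else joined_at
        return event_ts >= start
    if visibility == "joined":
        # History is readable only from the point of joining onward.
        return event_ts >= joined_at
    raise ValueError(f"unknown history_visibility: {visibility}")


# A member invited at t=50 who joined at t=100, looking at an event from t=60:
print(can_see("shared", 100, 50, 60))
print(can_see("invited", 100, 50, 60))
print(can_see("joined", 100, 50, 60))
```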
Edit: I suppose you might argue that you can bypass this by running your own homeserver and attempting to join the room from it, thereby granting visibility not through joining (as you wrote), but instead through federation with the server you control. The thing is, you can’t do it without permission. Room admins can simply deny your join request when they see what server you’re on. This might make sense in a particularly sensitive room, for example, just as it would to restrict history visibility.
LARPing? I’m not the one stirring up drama with falsehoods and patronizing snark, am I? Farewell.
My point is that it should never give out that data, or even store it permanently in the first place. This is just a fundamentally bad design from a privacy perspective, and other messengers don’t do that.
This is not false. The setting you mean only hides it from normal users, but it still ends up in the database of all participating homeservers, and all the admins of those have full access to it. I happen to run a Matrix homeserver myself…
It’s not a disaster. That’s overstating it. It just leaks some metadata to the server. There’s nothing inherently wrong with it, and nothing that won’t be solved over time.
Some may not like that everything is stored on the server, compared to Signal, where it only transits the server. But for companies or governments, that should be (or is) mandatory. And it makes handling multiple clients and switching devices a lot easier for normal consumers.
You seem to be unaware of how Matrix works. It is inherent to the protocol that room metadata is shared with other servers. It is not fixable as it is working as intended. This feature is nice for censorship resistance, but it is pretty much a nightmare for metadata privacy.
I’m not aware of a critical metadata leak; a link or example would be really helpful. Thanks!
Like all of it. It is not a “leak” if it is working as intended.
Anyone can spin up a Matrix server, join a room with it and the Matrix network will happily push a complete copy of the room metadata (all the way back to the point the room was first created) to that new homeserver.
There’s no problem for a public room. You can’t just join a private room.
Yes, it is a problem for both public and private rooms, as this info is stored and shared retroactively. Let’s say one of the participants of a private room gets compromised, or you invite someone who has their account on a compromised homeserver. This then results in the entire room metadata history (since the room was created) being shared with that compromised homeserver, which can then easily analyse it in detail.
That doesn’t sound realistically threatening to me. Besides, if I want the highest security and privacy I use onion routing.
lol, why are you even posting on a privacy community then? And using Tor doesn’t help at all in that case.
The problem is that the vast majority of Matrix users are on the same server. So if you have a public or private room, most likely one of them will join your room from said server, thereby collecting and uploading all your room data to said server. Because of this, their servers probably contain the metadata of 99% of the Matrix-verse. Which is then very interesting to, and highly susceptible to, dragnet operations from various 3-letter agencies.
Realistically it’s not a huge threat, and a huge step up from some other platforms, but it is undesirable for sure.
A public room is public. Anyone could and should be able to enter it at any moment and start recording and uploading everything to $terrorist@/or$three-letter-agency or such. The idea that someone else could also get the same already public data later is not threatening, as that data is already considered public, as in “everyone in the world could have it a second after the data came into existence”. And since removing data from the public is not considered possible, uploading that already intentionally published data again does not pose a greater threat than its first publication; it just uses a bit of bandwidth, nothing more. If you are very sensitive about the visibility of who you talk with, maybe don’t enter “public” rooms in the first place.
If you join a private room, you already want to share with the other participants that you are f***ing talking to them, including when and for whom exactly you encrypted the data, and to which servers it has to be forwarded. I expect the servers of all participants to forward messages to the recipients. For this, the server needs to know this type of information. Of course, awareness of which data is used to make, e.g., routing decisions is a good thing, but a “nightmare” would be Teams, Zoom, ICQ, WhatsApp and similar. I am sure that messengers exist that could be less traceable for participants, but full anonymity about who you are communicating with, so that even the servers know nothing about what happens in a room, is imho not even a goal of Matrix for the future.
Not a “nightmare”, but what a nightmare it must be to find out that a system that looked so promising did not fulfill “every” dream expectation one had, with options that are even the opposite of one’s dream expectation, like “public rooms” that are meant to be public! How horrible!!! (lol)
By the way, as it seems possibly noteworthy here: if you exchange emails with someone’s @gmail address, then Google has the metadata of your whole mail history with them, as does the server of your provider. Just to mention it: do not send emails to @gmail.com if you dislike Google knowing about it. And if you share a document with edit history, then the edit history is likely also shared ;-) As “rooms” in Matrix are meant to have a state that changes from the beginning, sometimes possibly with every message, and one can reply to a message, which would reveal the existence of that message later when it is replied to, including at least a hint of what it was about, such information is imho meant to be complete rather than hidden. Maybe 1:1 chat solves this issue for you, as every chat with a new person would start empty.
I might be wrong, but Matrix already is one of the most robust systems when it comes to “compromised servers”, so it is very far away from a nightmare. That is, unless you are either a true criminal bastard or a true world-saving hero; then every leaked byte might be the deadly one, that is true.
So in case you are a true world-saving hero: maybe use a self-built Raspberry Pi mesh proxy chain, mounted on rooftops and delivered by drones at night, to proxy the signal of an in-memory-only-tasks Raspi to a free wifi, where the Raspi that has its orders is running on battery (like the rooftop proxy chain) but is hidden in public transport, so it reaches the proxy mesh according to the transport timetable. Just to give a paranoid one some ideas and some work to do ;-) Once you’ve built everything, upload the code to GitHub and the designs to Thingiverse, so that “anyone” could have placed the proxy mesh to a free wifi on the rooftops and you are safer from being suspected ;-) lol. Btw, a mesh system to accomplish this already exists; I think they named it the b.a.t.m.a.n. protocol (no joke), so the main struggle should be handling solar power vs. wifi signal strength, distances, humidity, and a windproof mount design that can be deployed by manually controlled quadcopters. Good luck!