Why is metadata detrimental to my privacy?
TL;DR: we recommend SimpleX, because it has solved the metadata problem.
In this tutorial we're going to explain the importance of privacy, as well as how metadata kills your privacy. By the end of this tutorial, you will also know the most private way to communicate online.
Introduction
In an age where government surveillance keeps expanding, people are turning to more private and secure ways of navigating the digital world. So the question remains: what is an actually private and secure way to communicate online?
Before the actual tutorial starts, we will first need to understand the core concepts we will be discussing.
What is Privacy?
In simple words, privacy is the ability to choose what information or personal data is hidden and who it's hidden from. In the digital age, privacy extends beyond physical space to include everything from online communication to personal data stored in the cloud. This means that no third party, whether a government agency or a corporation, should have unrestricted access to your personal data without your consent.
Let's say that Bob wants to tell something to Alice without anyone knowing.

Here, the fact that Jack can't hear the contents of Alice and Bob's conversation is akin to end-to-end encryption (E2EE): only Alice and Bob know what is being said. However, depending on Jack's level of access to the conversation's metadata (who is talking, when, how often, etc.), Jack may still be able to guess what Bob and Alice are talking about.
Now, let's say that Jack does have access to the metadata of the conversation.

Instead of a brick wall separating Jack from Bob and Alice, there is now a glass wall. Jack can see who is talking to whom (Bob and Alice), how long the conversation lasts, and when and where it takes place. This gives Jack a significantly higher chance of correctly guessing what Bob and Alice are talking about.
Why is Privacy Important?
But why is privacy even important?
Privacy is important because it protects individual autonomy, allowing people to make decisions and express themselves freely without fear of judgment or surveillance. This is especially relevant in today's world.
The E2EE misconception
Many people are under the impression that if a chat is end-to-end encrypted, privacy and security are automatically guaranteed. This is a common misconception. Think of it like having a conversation in a glass box: while the content of your conversation may be encrypted and unreadable to outsiders, the metadata surrounding the conversation is still visible.
This means that while your messages are protected, the details about who you're talking to, when and other information can still be observed. Encryption only safeguards the content, leaving the metadata exposed.

Our Main Problem: Metadata in Communications
What is Metadata?
Metadata is essentially data about data. In the digital world, it refers to information that gives insight into your activities. This includes:
- Who: the sender and the recipient → phone number, username, email, IP address
- When: timestamp of when the communication took place → time of day, duration of the interaction
- Where: information about where the sender and recipient were → GPS location, IP address
- How: the device or platform used for communication → operating system, app version, device type
- Size: the size of the communication or the file sent → message length, file attachments
... just to name a few.
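To make these categories concrete, here is a minimal sketch of an end-to-end encrypted message as an observer sees it. Everything in it is an assumption made for illustration: the `Envelope` fields, the example addresses, the client name, and the toy one-time pad standing in for a real E2EE scheme.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Envelope:
    sender: str        # who
    recipient: str     # who
    sent_at: str       # when
    sender_ip: str     # where
    client: str        # how
    ciphertext: bytes  # the only field that encryption protects


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy one-time pad: XOR each byte with a random key of the same length."""
    if len(key) != len(data):
        raise ValueError("key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


def observer_view(envelope: Envelope) -> dict:
    """What a relay or eavesdropper can record without ever holding the key."""
    return {
        "who": (envelope.sender, envelope.recipient),
        "when": envelope.sent_at,
        "where": envelope.sender_ip,
        "how": envelope.client,
        "size": len(envelope.ciphertext),
    }


message = b"meet me at the usual place"
key = secrets.token_bytes(len(message))  # shared only by Alice and Bob
envelope = Envelope(
    sender="bob@example.org",            # hypothetical identifiers
    recipient="alice@example.org",
    sent_at="2025-01-01T09:30:00Z",
    sender_ip="203.0.113.7",             # documentation-only address range
    client="ExampleChat 1.0 / Android",  # made-up client name
    ciphertext=xor_bytes(message, key),
)
print(observer_view(envelope))
```

The observer never learns the message itself, yet every category from the list above is available in the clear.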

What even is its purpose?
Metadata wasn't inherently made for bad actors to gain information on you. It was actually made to help with organization, searchability, retrieval and management. For example, a photo you take with your camera can contain information such as the camera type, when and where it was taken, and the file size, making it easier to find and use later.

Now, apart from that, it can also be used to identify and deanonymize you.
Now that we understand metadata, let's talk about why it's a major threat to privacy.
Why is Metadata Detrimental to Your Privacy?
Metadata has compromised privacy in the past, and it will continue to do so in the future.
Let's say Bob takes photos at a coffee shop, in a park and at his home. Bob excitedly shares these photos on his public blog and is eager to take some more. What Bob hasn't revealed on his blog is that he is an admin for a popular drug site on the dark net.
The police, having identified his blog, are trying to get information out of it. Bob, having uploaded pictures whose metadata records the locations around where he lives, has given the police a silent entrance to his home.
After a thorough metadata analysis, the police pinpoint Bob's location, storm his home and arrest him. Don't be like Bob.
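Bob could have stripped the metadata before uploading. The following is a minimal sketch, using only Python's standard library, that removes the APP1 segments of a JPEG file, which is where EXIF data (including GPS coordinates) is normally stored. The function name `strip_exif` is ours, and the sketch assumes a well-formed file. Other segments, such as comments, can carry metadata too and are not touched here, so treat it as an illustration rather than a complete scrubber.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string with its APP1 (EXIF/XMP) segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("malformed JPEG: expected a marker")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: the compressed image data follows, copy it verbatim.
            out += jpeg[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2 : pos + 4])
        if marker != 0xE1:  # keep every segment except APP1
            out += jpeg[pos : pos + 2 + length]
        pos += 2 + length
    raise ValueError("malformed JPEG: no start-of-scan marker found")
```

To use it, read the photo in binary mode, pass the bytes to `strip_exif`, and write the result to a new file before sharing it.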

While this may be a made up story, here's a real one:
In 2018, the popular fitness app Strava accidentally revealed the locations of secret military bases around the world. Security personnel stationed there, needing to maintain their fitness, went on runs in and around the bases and recorded their activity on Strava. Those runs then showed up on Strava's public heatmap, leaking the locations of the bases.
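The mechanism behind such a leak is simple aggregation. As a rough sketch (not Strava's actual pipeline), a heatmap can be built by dropping GPS points into grid cells and counting them. The coordinates below are made up, and the cell size of 0.01 degrees is an arbitrary assumption.

```python
import math
from collections import Counter


def heatmap(points, cell_deg=0.01):
    """Count GPS points per grid cell; cell_deg is the cell size in degrees."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Hypothetical data: three runs around the same spot, one somewhere else.
runs = [(10.001, 20.001), (10.002, 20.003), (10.004, 20.002), (50.505, 60.505)]
hottest_cell, visits = heatmap(runs).most_common(1)[0]
print(hottest_cell, visits)
```

No single run says "there is a base here", but a cell that keeps lighting up in an otherwise empty area does.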

The point is that metadata kills: it can give away more about your activities than you want it to.
So, with that being said, how can we design communication systems to protect against metadata exposure?
Necessary Design Choices to Limit Metadata in Chat Applications
The first and most important design choice is: no PII (personally identifiable information) needed to sign up.
If a chat application requires a phone number to register, then you can forget about using it anonymously, and using it for sensitive activities is of course out of the question altogether.
Additionally, as you may have guessed, there must be no metadata collection at all! If the application collects metadata, then you can also forget about ever using it anonymously.
To illustrate this, let's look at a case in which Santa Clara County issued a subpoena to Signal. The data Signal could hand over was tied to the user's phone number: when the account was created and when it last connected to the service.

As you can see, chat apps like Signal can and will share metadata about you with law enforcement when required to, making metadata a vulnerability to anonymity.
Last but not least, there must be minimal data retention. If any data is collected, it must be held on the servers for as short a time as possible. A good example of this is SimpleX: the minimal data it keeps is stored for only 48 hours on its decentralized network, and once that time passes, it is gone forever.
So, which apps fulfill these criteria? Just to name a few: SimpleX, XMPP + OMEMO, OnionShare Chat, Jitsi Meet.
None of these applications require any PII or collect metadata, and all of them have a minimal data retention policy.
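A retention limit like this can be pictured as a periodic purge on the server. The sketch below is a generic illustration under our own assumptions (the `StoredMessage` fields and the `purge_expired` helper are hypothetical), not the code of SimpleX or any other project.

```python
from dataclasses import dataclass

RETENTION_SECONDS = 48 * 60 * 60  # the 48-hour window mentioned above


@dataclass
class StoredMessage:
    queue_id: str       # hypothetical delivery address
    ciphertext: bytes   # the server never sees the plaintext
    received_at: float  # Unix timestamp set by the server


def purge_expired(messages, now, ttl=RETENTION_SECONDS):
    """Return only the messages that are still inside the retention window."""
    return [m for m in messages if now - m.received_at < ttl]


now = 1_000_000.0  # fixed timestamp so the example is reproducible
old = StoredMessage("queue-a", b"...", received_at=now - 49 * 3600)
fresh = StoredMessage("queue-b", b"...", received_at=now - 1 * 3600)
kept = purge_expired([old, fresh], now)
print([m.queue_id for m in kept])
```

Whatever has been purged cannot be handed over later, which is the whole point of a short retention window.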
Now, let's take a deeper look into the app Signal: despite its strong encryption, it still collects significant data.
Signal: Preying on Metadata Despite Implementing E2EE
Well, you might ask, how does Signal even collect metadata despite offering end-to-end encryption?
Consider this scenario: a group of anti-ICE protesters is protesting "anonymously" in the streets. The protesters use Signal to communicate with each other. While the content of their messages remains a mystery, the metadata behind them, like phone numbers and frequent contacts, remains a vulnerability.
Let's say Bob is apprehended by the police. His phone is seized, and they discover that Signal is installed on it. With a subpoena, the police can easily retrieve the phone number linked to Bob's Signal account. They then discover who Bob talks to most frequently. One specific phone number stands out, so the police investigate further. It turns out that this number belongs to Mike, an accomplice of Bob during the protest. Since the phone number indicates that Mike was in the same area as the protest at the time, it's safe to assume that Mike was part of the crowd as well.

As you can see, the seizure of Bob's phone not only led to his own arrest but also to Mike's. By obtaining all the relevant phone numbers, the police were able to easily identify who was behind the crowd of protesters, especially since those numbers were associated with the area where the protest took place at the time.
This leads back to the initial example of the glass box: while the content of your conversation may be encrypted and unreadable to outsiders, the metadata surrounding the conversation is still visible.
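Once contact records are available, from whatever source, finding the number that "stands out" takes only a few lines. The sketch below uses invented records and fictional phone numbers, and `top_contacts` is a hypothetical helper of ours.

```python
from collections import Counter

BOB = "+1-555-0100"   # fictional numbers
MIKE = "+1-555-0142"

# Hypothetical metadata records: (sender, recipient). No message content needed.
records = [
    (BOB, MIKE),
    (MIKE, BOB),
    (BOB, MIKE),
    (BOB, "+1-555-0177"),
    ("+1-555-0188", BOB),
    ("+1-555-0177", "+1-555-0188"),
]


def top_contacts(records, target, n=1):
    """Rank the people a target exchanges messages with, most frequent first."""
    counts = Counter()
    for sender, recipient in records:
        if sender == target:
            counts[recipient] += 1
        elif recipient == target:
            counts[sender] += 1
    return counts.most_common(n)


print(top_contacts(records, BOB))
```

Not a single message had to be decrypted to single out Mike.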

So, is there a better option out there?
SimpleX: No Metadata AND Correct E2EE Implementation
SimpleX has solved the metadata problem: it correctly addresses both metadata and E2EE.
Unlike the glassbox metaphor mentioned earlier, SimpleX operates more like a black box - external observers cannot determine who is talking with whom. Metadata is also minimized on top of the already implemented E2EE.
In contrast to Signal, the already minimized metadata is stored on the servers for a maximum of 48 hours before being permanently deleted.
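One way to keep observers from linking conversations together is to avoid any account-wide identifier and use a separate random address for every contact. The sketch below is a toy illustration of that idea only. It is not SimpleX's actual protocol, and the `PairwiseAddressBook` class and its method are hypothetical names.

```python
import secrets


class PairwiseAddressBook:
    """Each contact gets its own random queue address; no global user ID exists."""

    def __init__(self):
        self._queues = {}

    def queue_for(self, contact: str) -> str:
        # Create a fresh random address the first time a contact is used.
        if contact not in self._queues:
            self._queues[contact] = secrets.token_urlsafe(16)
        return self._queues[contact]


bob = PairwiseAddressBook()
seen_by_server = {bob.queue_for("alice"), bob.queue_for("mike")}
print(seen_by_server)  # two unrelated random strings
```

A server that sees these two addresses has nothing that ties them to the same person, unlike a phone number that appears in every conversation.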
Why you should choose SimpleX
You should use SimpleX if you care about privacy because it solves the metadata problem and provides correct encryption to ensure your privacy.
Privacy is not a privilege. It's a right. Take action for your privacy.
SimpleX is currently the CLOSEST we can get to privacy and anonymity in digital communication.
petrified 2026-01-31
Donate XMR to the author:
834EJCPE8ZBCairREyP3Ft6XpKDvtN8ki9i7eqLud5midPBztuRiHwV5JpjViT55mVSFoYTogfCjc2n4fwMCr3wyRVAunXU