The internet is a dream for snoopers. Almost every action of our daily lives flows through it at some point or another: emailed appointments, streamed TV shows, web purchases, photos shared with friends. Much of this is completely unencrypted and can be read by any of the various companies who own the tubes the data flows through. Even the bits that are encrypted are usually only encrypted between the end user and the company running the website. Once the data is uploaded, it's often mined, and the data is then sold off to the highest bidder.
Our privacy is so valuable that many companies build their business models on invading it. Facebook doesn't provide free access to a social network because it's a charity; it does it to learn about your life so that it can sell advertising space more effectively. Google doesn't index the web just to make life easier for you; it does it so it can learn about your life and sell advertising space more effectively. Twitter doesn't... well, you get the picture. The modern internet was eloquently summed up by Andrew Lewis when users complained about changes to the Digg network: "If you're not paying for it, you're not the customer: you're the product being sold."
Governments are also keen to investigate the finest details of our lives. They claim this is for national security and to prevent crime, but there's very little evidence that internet surveillance has ever prevented terrorism or made an impact on crime. Instead, surveillance is used to harass critics and entrench government control.
We can fight back!
Many of the underlying technologies of the internet come from a time when only a few people connected to the network, and no sensitive information got shared. If encryption and security were considered at all, they were considered a waste of resources. This can make it seem sometimes as if privacy on the internet is an impossible achievement.
All is not lost. You can't get back the information that has already leaked out, but you can stop the invasions of privacy from continuing. All the evidence we have says that when it's used properly, modern encryption can't be broken by anyone. We'll show you how to use it properly, and help you understand what sort of security each form of encryption provides and which protocols can be trusted to keep your data private.
Encryption
A digital toolkit for keeping data safe
Our best tool against spying is encryption. This is a complex mathematical process of changing data so that someone spying on us can't understand the data. There are three types of encryption:
Shared key (aka symmetric key and private key) encryption
This is where the same key is used to encrypt and decrypt information. This means that if you're communicating with someone, both parties need to know the key. This can cause a chicken-and-egg problem because you can't communicate securely until you both know the key, but you can't share the key until you have secure communications.
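To make the shared-key idea concrete, here is a minimal Python sketch. It uses a one-time pad (XOR with a random key as long as the message), which we've picked only because it fits in the standard library; real systems use a vetted cipher such as AES. The function name xor_bytes is our own invention.

```python
import secrets

def xor_bytes(data, key):
    """XOR each byte of data with the matching byte of key."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"Meet at noon"
# Both parties must somehow hold this key before they can talk privately.
shared_key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, shared_key)    # sender encrypts
recovered = xor_bytes(ciphertext, shared_key)  # recipient decrypts with the same key
```

The same function and the same key are used in both directions, which is exactly why getting the key to the other person safely is the hard part.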
Public Key Encryption
Here, different keys are used to encrypt and decrypt data. The two keys are usually referred to as the Public Key and the Private Key. The public key is known to everyone, while the private key is known only to one person. If you want to send someone a message, you can encrypt it with their public key. Alternatively, if someone wants to digitally sign a message, they can encrypt it with their private key. Anyone can then decrypt it with their public key, and be sure that it came from the real sender.
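The two-key idea can be sketched with textbook RSA and deliberately tiny numbers. This is a toy for illustration only: the primes, the helper name apply_key and the message value are our assumptions, and real implementations use far larger keys plus padding schemes that this sketch leaves out.

```python
# Toy key pair built from tiny primes. Real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, needs Python 3.8+)

public_key = (e, n)        # known to everyone
private_key = (d, n)       # known only to the owner

def apply_key(value, key):
    """Raise value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65  # a message encoded as a number smaller than n

# Secrecy: anyone encrypts with the public key, only the owner can decrypt.
ciphertext = apply_key(message, public_key)
decrypted = apply_key(ciphertext, private_key)

# Signing: the owner uses the private key, anyone can check with the public key.
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key)
```

Both decrypted and verified come back as the original message, which shows the two uses described above: encrypting to someone, and proving that something came from them.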
Hashing (aka One-way encryption)
This is unlike the other forms of encryption because once data's been hashed, there's no way to un-hash it. The one redeeming property of hashing is that it's consistent. That means that when you hash the same value, it will always return the same result. For example, passwords should be stored hashed. When a user enters their password, the computer hashes what they enter and checks that hash against the stored value. If an attacker steals the stored password hashes (provided the passwords can't be guessed), they can't actually use them.
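Here's a minimal sketch of that password check using Python's standard library. The salt (a random value stored alongside the hash) and the choice of PBKDF2 are our additions rather than anything prescribed above, and the iteration count is an assumed figure you'd tune for your own hardware.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, not a recommendation

def hash_password(password, salt):
    """Derive a one-way hash from a password and a per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def check_password(attempt, salt, stored_hash):
    """Hash the attempt the same way and compare it with the stored value."""
    return hmac.compare_digest(hash_password(attempt, salt), stored_hash)

# At sign-up, the server stores only the salt and the hash, never the password.
salt = secrets.token_bytes(16)
stored = hash_password("correct horse battery staple", salt)
```

Because hashing is consistent, check_password succeeds for the right password and fails for anything else, yet nothing stored on the server can be turned back into the password itself.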
These three types of encryption are combined in various ways to form encrypted protocols that we can use to secure our data and communications.
When we talk of privacy, there are a number of different things that we could mean. It's important to understand the different guarantees that each protocol attempts to establish so we know exactly how private our communication is.
Secrecy, where no-one can see the contents of our communication. However, it is possible that someone eavesdropping on a secret communication could find some information out, like who is communicating with whom. They should not, however, be able to see the data that's being transmitted between two parties.
Metadata secrecy, where no-one can see who we're communicating with. They may see that a stream of information comes out of our machine, but can't track where it's going, or even what form of communication it is.
Non-repudiation/Tamper-proof, a way of guaranteeing that the person who said something really said it. This is useful because it stops people impersonating other people, and
Anonymity. In a truly anonymous system, no-one can tell who another person is unless they deliberately reveal themselves. In some cases this is a good thing, because it allows whistle blowers to report on issues and even the person they're blowing the whistle to can't tell who they are (and therefore can't betray them). An anonymous system could include some form of online identity system, but not a way to link that identity to a real person.
Spying programs
Governments are vacuuming up huge troves of data on civilians...
Prism
Almost all the communication gathered by Prism is sent encrypted, but can still be gathered by the NSA because it's not encrypted for its entire journey. Take, for example, an online chat in Facebook. The messages are sent via HTTPS communication from your browser to Facebook. They're then sent from Facebook to the recipient via encrypted HTTPS, which again can't be sniffed. This means that at no point is the message transmitted unencrypted. However, Facebook has access to the unencrypted message, so it can relay it to a third party. In the case of Prism, the third party is the NSA, but Facebook also uses this information to tailor adverts. The fact that your messages are stored in Facebook's data centre also means that your messages could be read by any hacker who manages to get access to this.
The only method of defeating this form of spying is end-to-end encryption. This is where a message is encrypted by the sender and not decrypted again until it reaches the recipient.
XKeyscore
XKeyscore isn't a standalone surveillance program in itself, but a front-end for all the data amassed by the NSA. It's the system that brings everything together and enables an analyst to instantly access all the information stored about another person. Everything from mundane Facebook chats to the details of your browsing history to phone calls can be accessed from a single place.
So, if you think that text messaging a server password won't be linked to your online accounts where the server details are stored, it's time to think again. All your communications are linked to all your others (unless you're using carrier pigeons or smoke signals).
The best defences against XKeyscore are end-to-end encryption, which stops a communication appearing on one of the back-end databases linked to the program, and true anonymity, which means that a particular communication can't be linked back to you.
Tempora
Much of the data travelling to and from the west of Europe goes via the Cornish seaside town of Bude. Here, and on nearby beaches, cables that travel to Canada, the east coast of the USA, the west coast of Africa and beyond slide below the waves and into the murky depth below. If you want to be able to sniff global internet traffic, you need a presence at Bude. It should come as no surprise that GCHQ runs one of its regional sites here, and has taps on every major cable coming into and out of Bude. These cables contain telephone (voice and SMS) data as well as internet communications.
Project Tempora is run by GCHQ (with assistance from the NSA), and collects data directly from internet cables such as those tapped at GCHQ Bude. GCHQ has tapped at least 200 10-Gigabit cables and can process information from up to 46 of them at a time. So much data is collected through Tempora that GCHQ can't store it for long. It holds on to the full data for three days, and the metadata for 30 days. At least, that was the capability of Tempora in 2012, according to information provided by NSA whistleblower Edward Snowden.
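A rough back-of-the-envelope calculation shows why storage is the bottleneck. The cable speed, cable count and three-day window come from the figures above; the assumption that every processed cable runs at full capacity is ours, so treat the result as an upper bound rather than a measured volume.

```python
CABLE_GBIT_PER_S = 10     # from the figures above: 10-Gigabit cables
CABLES_PROCESSED = 46     # from the figures above: up to 46 at a time
SECONDS_PER_DAY = 24 * 60 * 60
FULL_RETENTION_DAYS = 3

# Assumes every cable is saturated all day, which real traffic won't be.
bits_per_day = CABLE_GBIT_PER_S * 10**9 * CABLES_PROCESSED * SECONDS_PER_DAY
petabytes_per_day = bits_per_day / 8 / 10**15
buffer_petabytes = petabytes_per_day * FULL_RETENTION_DAYS

print(f"{petabytes_per_day:.2f} PB/day, {buffer_petabytes:.1f} PB for three days")
```

Under those assumptions the ceiling is a little under 5 petabytes a day, or roughly 15 petabytes for the three-day window.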
If you're using the internet in the UK, it's almost certain that your connection will go through a GCHQ-monitored cable. If you're in mainland Europe, it's still quite likely that it will be picked up by British spooks. Many major internet companies have their European headquarters in Ireland, so most communications in or out of these data centres go through GCHQ-monitored cables as well. Anything that isn't encrypted will be extracted. Anything that is encrypted will have any available metadata extracted. There's little oversight of GCHQ, so it's impossible to know exactly what they're doing with all this data.
Dishfire
The NSA is attempting to collect every SMS message in the world using a system known as Dishfire. According to one GCHQ document, "[Dishfire] collects pretty much everything it can, so you can see SMS from a selector which is not targeted." In this context, a 'selector' is a person, so the document is showing the system collecting text messages from people who the agency have no reason to be suspicious of.
Usually, GCHQ isn't allowed to perform this sort of indiscriminate collection and analysis of British citizens' data (although oversight is minimal). However, in this case, GCHQ bypasses the Regulation of Investigatory Powers Act (RIPA), since it's technically the NSA that collects the data (it then shares it with the UK), and RIPA doesn't cover data that's shared by a foreign intelligence agency. The same loophole works the other way, since the NSA isn't allowed unfettered access to US citizens' data. Each agency collects data on the other country's citizens, and they exchange it. Thus each government follows the letter, but not the spirit, of the law.
Perhaps the most disturbing aspect of Dishfire is that it doesn't just include the content of text messages; it also attempts to locate the position from which they're sent. This also makes it a database that can be used to track people (again, this is everyone, not just those suspected of wrongdoing).
We don't know how long they store the data for, but leaked slides have shown just how much information the system is gobbling up. Every day it collects:
- 200,000,000 text messages
- 76,000 geolocated text messages
- 800,000 financial transactions (from text-to-text payments or credit cards linked to phones)
- 1,600,000 pieces of information on border crossings (from roaming information texts)
- 5,000,000 missed call alerts
Marina and Mainway
In theory, US agencies aren't allowed to spy on US citizens unless they're suspected of some crime. However, there are many loopholes that the NSA exploits. Marina bypasses this restriction by not storing the content of the communication, but keeping the metadata instead.
Lawyers may argue about the difference between data and metadata, but in reality the NSA can build up a huge amount of information on someone using metadata alone.
Marina is a database of internet metadata, while Mainway stores phone metadata. Between the two, the NSA can build up a picture of your life, from your friends, to the places you frequent and the websites you visit. All this bypasses spying laws because, technically, it's not data. The difference, though, only matters to lawyers.
EU Data Retention Directive
On 15 March 2006, the European Parliament and Council issued a directive stating that all member states must require telecommunication providers (such as phone companies and ISPs) to store users' data for at least six months and at most two years. This data should include things like IP addresses, email addresses, phone numbers called, text messages sent, etc.
On 8 April 2014, the European Court of Justice declared the Data Retention Directive to be invalid, though many EU member states still require telecommunications companies to collect information about all their customers. Indeed, the UK plans to bring in even more laws regarding surveillance.
The invalidating of the Data Retention Directive by the EU does open up the possibility that these national laws could also be invalidated at the European level (as yet, no nation's blanket international surveillance has been tested in court).
However, a legal study financed by The Greens and the European Free Alliance concluded that, "The Court clearly rejects the blanket data retention of unsuspicious persons as well as an indefinite or even lengthy retention period of data retained." This study isn't legally binding - it's the opinion of legal experts. It states that citizens of a nation could challenge the national laws through the European Court of Human Rights. Much of the legal position on this is based on Article 8 of the European Convention on Human Rights (ECHR).
Glossary of spying terms
Five-eyes
An information sharing network made up of USA, UK, Canada, Australia and New Zealand.
Man-in-the-middle
A form of attack where the attacker positions themselves between the two parties. Here they can both sniff and alter data travelling in either direction.
NSA
The National Security Agency. The USA's spy agency tasked with foreign espionage and securing communications infrastructure.
GCHQ
Government Communications Headquarters. Britain's communications spying agency headquartered in Cheltenham.
Metadata
Data about data. In an email, the contents would be considered data, while the sender, recipient, subject, date and associated IP addresses would all be considered metadata.
FISA
Foreign Intelligence Surveillance Act. A US federal law that is used to legitimise much of the NSA's spying through a very flexible interpretation of the word Foreign.
Cookie
A piece of data stored by your browser that can be set by a web server. This can be used for tracking a user’s session (such as keeping them logged in to a site), or tracking their movements through the web.
Europe vs Facebook
A legal case that's being brought against Facebook for allegedly breaching European data protection law.
Fingerprinting
A method of identifying a user based on the settings in their web browser - see Panopticlick by the EFF (https://panopticlick.eff.org/index.php).
Snooper's Charter
A proposed law in the UK that would bring in sweeping new powers to allow the government almost unfettered access to internet data in the UK.
Human Rights Act 1998
A piece of UK legislation that the current government wishes to repeal. It includes Article 8 (Everyone has the right to respect for his private and family life, his home and his correspondence).
Private web browsing
Don't let everyone know what you do on the web
Normally, when browsing the web, nothing is encrypted. All traffic is sent in the open and can be intercepted by a huge number of people. This includes the packets sent from your computer requesting data from the server, and the data the server sends back. The protocol used for this open communication is known as the Hypertext Transfer Protocol (HTTP).
Even very early in the development of the web, it was apparent that not all traffic should be sent in the open. In 1995, Netscape released SSL (a layer of encryption that can be used to protect protocols that are normally unencrypted), and for the first time, browsers and web servers could communicate privately. HTTPS (the S stands for secure) is the protocol for this data exchange.
When it's working properly, HTTPS guarantees two things: no one can read the traffic, and no one can alter the traffic. There are caveats to both of these, but HTTPS is a huge improvement in security over HTTP. Anyone intercepting traffic can see what web servers you're getting data from, but not the data they send.
Rerouting connections
Web proxies are servers that you route your connection through. This means that instead of your browser sending a message to a server saying what page you want to view, it sends a message to the proxy saying what page you want to view. The proxy then requests this page from the server, and sends the resulting page back to you. If the connection between your computer and the proxy is encrypted, no-one can see what server you're requesting pages from (except the proxy itself). If the page also uses HTTPS then no-one (except the proxy) can see or alter the data from the server. However, the proxy is in an extremely privileged position. They can see just about everything you're doing on the web. Many companies that provide proxies promise that they delete logs (or don't keep them at all), but there’s no way of confirming this. In many cases, proxy providers will be bound by national laws to turn over information to the authorities, or data could be stolen by hackers. In other words, proxies only provide security if the organisation running the proxy behaves well. If they don't, then proxies can provide less security than plain HTTPS.
The onion router
If you need anonymity online, the most robust option is to use Tor. This is a network where you communicate through a chain of three proxies. You first establish a connection from your machine to one proxy. Then, through this proxy, establish a link to a second, then through this establish a link to a third, then through the third, connect to the web. In this chain, the first proxy can see your IP address, and it can see the IP address of the second proxy you're using. The second proxy can see the IP addresses of the first proxy and the third proxy, and the final proxy can see the IP address of the second proxy and it can see the web traffic. This means that even if one of the proxies in the chain is spying on you, it can't work out who you are and what you're viewing. Of course, if an adversary controls a large portion of the nodes in the network, then they may be able to de-anonymise the traffic.
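The "who can see what" reasoning above can be modelled in a few lines of Python. This is only a bookkeeping sketch of the three-hop chain, not Tor's actual protocol or encryption, and the relay names, function name and addresses are placeholders of our own.

```python
def relay_views(client_ip, relays, destination):
    """Return, for each relay, the addresses it can see on either side of it."""
    hops = [client_ip, *relays, destination]
    views = {}
    for i, relay in enumerate(relays, start=1):
        views[relay] = {"previous": hops[i - 1], "next": hops[i + 1]}
    return views

views = relay_views(
    client_ip="203.0.113.7",              # placeholder address standing in for you
    relays=["entry", "middle", "exit"],   # hypothetical relay names
    destination="example.org",
)

# A single relay could link you to the site only if it saw both ends of the chain.
knows_both = [
    relay for relay, seen in views.items()
    if seen["previous"] == "203.0.113.7" and seen["next"] == "example.org"
]
```

Here knows_both ends up empty: the entry relay sees your address but not the site, and the exit relay sees the site but not your address. An adversary running both the entry and exit relays would hold both halves, which is the large-portion-of-the-network risk mentioned above.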
The Tor network provides anonymity, but not security. That means if you're browsing the web over unencrypted HTTP, people will still be able to see what you're reading (or sending), but they won't know who is reading or sending it. Therefore, it's important to use additional encryption appropriate to the type of communication you want to do (eg HTTPS, GPG or OTR - see below) in order to get both anonymity and privacy.
There are two ways of using Tor to browse the web: install the software on your system, or use a live distro that comes with it preinstalled. To install the software, go to https://www.torproject.org and select the Download link. It should automatically detect the operating system that you're running, but you'll need to make sure you use the correct link to get the 32- or 64-bit version.
If you don't want to install the software on your machine, or don't trust the operating system not to spy on you, then running a live distro is the best option. There are a few options, but by far the most trusted is Tails (https://tails.boum.org). You can run this from a CD, USB stick, or as a virtual machine. It has everything set up and ready to run, but you do need to make sure that you download any updated versions as they come out to ensure that you always have protection.
Whichever option you choose, once you've started the Tor Browser, you'll see that it's a modified version of Firefox ESR (Extended Support Release). If everything has gone correctly, you should see a green page that states, "Congratulations! This browser is configured to use Tor." If you see this, you can start browsing the web anonymously. However, it is worth reading the page linked as Tips On Staying Anonymous (https://www.torproject.org/download/download.html.en#warning) to make sure you fully understand what Tor does and doesn't do.
There are a number of privacy/convenience trade-offs when it comes to web browsing, such as which cookies to accept. It can be hard for non-technical people to understand what the issues are, and decide where to draw the line. The Tor Browser has a slider to enable you to increase or decrease privacy levels (and consequently decrease or increase the functionality of the browser). If you go to the onion drop-down menu in the top-left corner, and select Privacy and Security Settings, you'll get a pop-up box that lets you adjust the features you want.
Cookies, trackers, web beacons - Following your browser
Advertising companies don't need to resort to monitoring data flowing through wires in order to track users: your web browser will tell them everything they need to know. Cookies are bits of data that can be set by a remote website and are stored on your browser. They're most commonly used to set an ID so that a website can tell which requests come from a single browser. Every time your browser requests a page from a server, it will send details of any cookies set by that domain along with the request.
When used responsibly, they're good for web users. For example, they enable web shops to follow the user as they browse the store and add items to their shopping cart. The real problem with cookies comes when a website loads content from more than one source. For example, if you go to a website with a Google advert or a Facebook Like button, your browser sends a request to Google or Facebook, and the tracking cookies will be sent along with that request. Since a huge number of pages include content from advertising companies, these companies get a very complete picture of your browsing habits.
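The following Python sketch simulates how an embedded advert turns a single cookie into a browsing history. The Tracker class, the uid cookie name and the page addresses are all hypothetical; real ad networks are far more elaborate, but the mechanism is the same.

```python
from http.cookies import SimpleCookie

class Tracker:
    """Hypothetical ad server that logs which pages embed its content."""

    def __init__(self):
        self.next_id = 1
        self.history = {}  # cookie ID -> pages where this browser was seen

    def handle_request(self, cookie_header, embedding_page):
        cookie = SimpleCookie(cookie_header)
        if "uid" in cookie:
            uid = cookie["uid"].value      # returning browser: recognise it
        else:
            uid = str(self.next_id)        # new browser: hand out a fresh ID
            self.next_id += 1
        self.history.setdefault(uid, []).append(embedding_page)
        return f"uid={uid}"                # the browser stores this and sends it back

tracker = Tracker()
browser_cookie = ""  # what the browser currently holds for the tracker's domain
for page in ["news.example/story", "shop.example/shoes", "blog.example/post"]:
    browser_cookie = tracker.handle_request(browser_cookie, page)
```

After three visits to three unrelated sites, tracker.history holds all three pages under one ID, even though none of those sites shared anything with each other.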
Most web browsers enable you to set how your browser sends cookie information at three levels: all cookies; no third-party cookies; and no cookies. The 'All cookies' option allows any advertisers to track you. 'No third-party cookies' only allows cookies associated with the domain that the main web page you're viewing is from. This is a good option if you're concerned about being tracked by advertisers, but willing to accept less than 100% privacy for the convenience of websites being able to remember some information about you. Picking the 'No cookies' option may cause issues with some websites, but will give you more confidence that you're not being tracked.
Certificates
All encrypted communication requires some form of shared information to start. This could be a passcode that both parties know or an encryption key. In the case of HTTPS, it's certificates. These certificates include a public key for the organisation, and some information about how to use the certificate (what organisation it's valid for, what dates it's valid for etc).
When you install a web browser, it comes with some certificates installed by default. These are root certificates, and the browser trusts the organisations that issued them completely, not just to encrypt traffic, but to verify other certificates. When you visit a HTTPS website, the web server sends a certificate that has been cryptographically signed by a certificate authority. If the signature on this certificate matches one of the root certificates in your browser, then the page is accepted as valid.
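Python's standard ssl module follows the same trust model, so it makes a handy way to poke at certificates yourself. In this sketch the helper name fetch_certificate is ours, and it relies on whatever root certificates your operating system provides rather than a browser's own list.

```python
import socket
import ssl

# Loads the system's trusted root certificates, much as a browser ships its own.
context = ssl.create_default_context()

# By default the context rejects any certificate that doesn't chain back to a
# trusted root, or that was issued for a different hostname.
requires_valid_chain = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname

def fetch_certificate(hostname, port=443):
    """Connect over TLS and return the server's certificate as a dict (needs network)."""
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

Calling fetch_certificate on a site of your choice should return a dictionary with fields such as subject, issuer, notBefore and notAfter, which correspond to the organisation and validity dates described above. If the chain doesn't lead to a trusted root, the connection fails instead.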
This means that the entire basis for the security of HTTPS lies in these root certificates. If some malicious party manages to get the private key to one, they can break every bit of security in HTTPS. This also means that if someone can install a new root certificate on your computer, they have complete control over your web traffic. Many companies install root certificates on employees' browsers to allow the organisation to monitor and control internet activity.
Communications
How to keep your online chats private
The internet is about far more than just browsing the web, and the most important area for privacy on the net is online communications. There's a good reason that you put letters in envelopes in real life - you don't want everyone reading your mail. In the digital world you should ensure the same level of privacy by using strong encryption.
Email is still one of the most common forms of digital communication; however, it has no security built in. None. By default, there’s no attempt to encrypt the communication, and no attempt to even verify that the person sending the message is really who they say they are. Over time, some solutions to these problems have emerged, but they're not universally applied. When sending or receiving an email, you should assume that there's no security at all.
When using webmail, bear in mind that many webmail providers make their money through advertising and may be mining your mail for information about you that can be used to better sell advertisements to you. Therefore, the first thing you need to do if you want private email is to use a mail provider that's not spying on your mail. This means not using an advertising-driven mail provider. Riseup.net is a good option. Another is to host your own email server, though this can be a little involved. You should be wary of any email provider that makes exaggerated claims about the total privacy of their system, since this isn't possible using the current email setup unless you use end-to-end encryption.
Many email platforms offer encryption to the server. On web-based email this is a HTTPS web page; on a regular server, this will be something like STARTTLS. This is an essential bit of encryption, because without it, the email is readable by anyone. However, alone, it doesn't offer any guarantees of privacy because the mail server could be reading the email, and it could send it unencrypted to the recipient's mail server. End-to-end encryption is needed to ensure privacy. This means that you need to encrypt it yourself before you send it, and this needs to be done in such a way that only the person receiving it can decrypt it. The standard method for this is GNU Privacy Guard (GPG). This can be used in two ways: encryption and signing. Encryption means that only the intended recipient can read the email, while signing means that anyone can read it but it guarantees that the mail came from the person that signed it.
GPG uses public-key encryption for verification of identification and key-exchange, symmetric encryption for privacy and hashing for signing. In order to use GPG you have to create your own public key, and get the public key of anyone you wish to communicate privately with. You can either do this by exchanging key files in person, or by using a key server.
Thank GNU for privacy
The method of setting up GPG varies significantly depending on what mail client and mail server you're using. Unfortunately, there isn't yet a simple solution that works across the board. You should look up the advice for your setup on the mail client's website. When properly set up, GPG protects the contents of the message, but doesn't hide who is communicating with whom. This, and other metadata stored in the email header, may still be sent in plain text.
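To see what stays visible, here's a short Python sketch that splits a made-up email into its headers and its body using the standard email module. The addresses, subject and date are invented for illustration.

```python
from email import message_from_string

# An invented message: headers first, then a blank line, then the body.
raw_email = """\
From: alice@example.org
To: bob@example.org
Subject: Lunch on Friday?
Date: Mon, 02 Mar 2015 09:30:00 +0000

Shall we try the new place by the station?
"""

msg = message_from_string(raw_email)

# The headers are the metadata: who, to whom, about what, and when.
metadata = {name: value for name, value in msg.items()}

# The body is the only part that message encryption would protect.
content = msg.get_payload()
```

Even if content were replaced with encrypted gibberish, everything in metadata would still travel in the clear.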
While there's no easy way of hiding the metadata in an email (or a good alternative that can be guaranteed to be secure), there are some options to mitigate the problem. You can completely hide your location by accessing webmail through Tor. This means that it's impossible to link the email to the physical location sending it. If you do this, and use different email addresses for different things, you can achieve a reasonable level of anonymity even though the metadata is still public.
While email is still hugely popular, instant messaging (IM) can be more convenient. Like email, there's often little security built into IM solutions by default, and many IM platforms are run by advertising companies that mine the chat sessions for data. Many proprietary IM platforms make claims about privacy and security, but are very hazy on the details.
Getting chatty
For privacy, you need end-to-end encryption, not just encrypted communications to the control server. Off The Record (OTR) is a layer of end-to-end encryption that runs on top of an IM session to provide privacy. It can run on top of any instant messaging platform, but the developers of the Tails distro recommend that it's only used with IRC and Jabber (or other XMPP platform). OTR is a plugin for Pidgin, and you can download the source code or Windows binaries from https://otr.cypherpunks.ca. It's in most distros' repositories, but make sure that you have the latest version (check the OTR website for up-to-date details).
Another option is to use OTR and Pidgin through the Tails live distro. This is a good option if you plan to use OTR through Tor, since using Tails will ensure that everything is set up correctly. There are details of the Tails OTR setup at https://tails.boum.org/doc/anonymous_internet/pidgin/index.en.html.
Both parties in the communication need to have OTR installed for it to work (if you try to initiate an OTR session with someone who doesn’t have it installed, they'll get a message telling them how to install it).
The first time you chat with someone, you need to make sure they are who they say they are. OTR offers three different methods of authentication:
Shared secret
Using this method, both users see a text box and have to type in some text. If they both enter the same text, they're authenticated with each other.
Question and answer
One user poses a question to the other, and enters what they think the answer should be. If the other person enters the same answer, then they are authenticated.
Fingerprint
Each user has a hexadecimal string linked to their username that’s known as a fingerprint. They can share this string with other people either when they meet them in real life, or by some other means of secure communication. OTR displays both users' fingerprints, and if they match what the users are expecting, they can authenticate each other.
The first two can be used to authenticate someone you know, and don't rely on you being able to exchange cryptographic keys in any way. They just need you to be able to come up with something that you'll both know. The final method can be used if you've already exchanged digital fingerprints.
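The fingerprint comparison can be sketched in Python. We're assuming here that a fingerprint is a SHA-1 hash of the key material shown as groups of eight hexadecimal digits; the real OTR software hashes a specific encoding of the key, so treat the function names and placeholder key bytes below as illustrative only.

```python
import hashlib
import hmac

def fingerprint(public_key_bytes):
    """Hash key material and format it as groups of eight hex digits."""
    digest = hashlib.sha1(public_key_bytes).hexdigest().upper()
    return " ".join(digest[i:i + 8] for i in range(0, len(digest), 8))

def fingerprints_match(shown, expected):
    """Compare two fingerprints, ignoring spacing and letter case."""
    def normalise(text):
        return text.replace(" ", "").upper()
    return hmac.compare_digest(normalise(shown), normalise(expected))

# Placeholder bytes standing in for a real public key.
key_material = b"placeholder bytes standing in for a real public key"
shown = fingerprint(key_material)
```

The point of the short string is that it's practical to read out over the phone or print on a business card, while any change to the underlying key produces a different fingerprint.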
OTR isn't anonymous, and people can still see who you've communicated with. However, the messages are designed in such a way that even though a spy can see that a message has been sent, they can't verify that it was signed by a particular public key, and there's a tool to generate fake messages (ie messages that appear real to a spy, but are garbage to anyone involved in the chat). This means that, while it's not truly anonymous, there is some deniability, since no-one except the intended recipient can prove a particular message was a real message sent by you and not a fake.
On the move
It's not just messages sent via the internet that are routinely intercepted: phone communications are too. Both voice and SMS messages are sent unsecured and are intercepted by phone companies. Our recommendation for private communication on the go is the set of tools by Open Whisper Systems (https://whispersystems.org). These include TextSecure (an encrypted mobile instant messaging platform) and RedPhone (an encrypted voice caller). Both of these are available through the Google Play store and iTunes.
An added advantage is that both of these apps are free to install and use and neither comes with advertising. Instead, the software is funded by grants from privacy advocates such as the Freedom of the Press Foundation and the Shuttleworth Foundation (as in Mark Shuttleworth, the Self-Appointed Benevolent Dictator For Life of the Ubuntu Foundation). It's not just us recommending these tools. They come with an endorsement from Edward Snowden himself who said, "Use anything by Open Whisper Systems."
Next generation private communications
All the methods of communication we've looked at in the main text are client-server. That means that your messages are first sent to some central server, and then on to the intended recipient. They can be secured through end-to-end encryption, but it's hard (or even impossible) to protect metadata, and potentially, an encrypted service could be forced offline by an overzealous government that wants to limit the options for secure communications (as happened with the Lavabit email service).
The alternative is a peer-to-peer setup, similar to how BitTorrent works. A service like this would be impossible to shut down. At present, there isn't a widely-used peer-to-peer chat system, but there are a couple in development.
Ricochet (https://github.com/ricochet-im/ricochet) uses Tor, and each peer has its own hidden service as its interface with the network. This provides a strong degree of anonymity (though not perfect as law enforcement agencies have been able to de-anonymise hidden services in the past).
Tox (tox.im) focuses less on anonymity, and more on having a robust network that's hard to shut down, and on secure encryption.
While both these projects are potentially very valuable assets in the fight for privacy, at present we can’t recommend either of them for secure communications because they are simply too immature. They're under rapid development, and that could lead to bugs. However, in the future, when they settle down, they may provide good alternatives to the traditional client-server tools.