MDN Web Docs mirror

Privacy on the web

People use websites for several important tasks such as banking, shopping, entertainment, and paying their taxes. In doing so, they are required to share personal information with those sites. Users place a certain level of trust in the sites they share their data with. If that information fell into the wrong hands, it could be used to exploit users, for example by profiling them, targeting them with unwanted ads, or even stealing their identity or money.

Modern browsers already have a wealth of features to protect users’ privacy on the web, but that’s not enough. To create a trustworthy and privacy-respecting experience, developers need to educate their site users in good practices (and enforce them). Developers should also create sites that collect as little data from users as possible, use the data responsibly, and transport and store it securely.

In this article, we:

Defining privacy terms and concepts

Before we look at the various privacy and security features available to use on the web, let’s define some important terms.

Privacy and its relationship with security

It is hard to talk about privacy without also talking about security — they are closely related, and you can’t really create privacy-respecting websites without good security. Therefore, we shall define both.

Personal and private information

Personal information is any information that describes a user. Examples include:

Private information is any information that users do not want shared publicly and must be kept private (i.e., information that is accessible only by a certain group of authorized users). Some private data is private by law (for example medical data), and some is private more by personal preference.

Personally identifiable information

Following on from the above section, personally identifiable information (PII) is information that can be used, in whole or in part, to track down and/or identify a specific person. For example, if a site leaks a list of users’ names and zip codes online, a bad actor could almost certainly use this information to find their full addresses. Even if a full-scale leak does not happen, it is still possible to identify users through less obvious means, such as the browsers they are using, the devices they are using, specific fonts they have installed, and so on.

Tracking

Tracking refers to the process of recording a user’s activity across many different websites. This can be done in various ways, for example:

Tracking data can be used to build a profile of a user and their interests and preferences, which can harm or annoy the user to varying degrees. For example:

Fingerprinting

A process very closely related to tracking is fingerprinting: this specifically refers to identifying users by building up a store of data points about them that differentiate them from other users. This could be anything from cookie contents to what browser they are using and what fonts they have installed locally.

Modern browsers take steps to help prevent fingerprinting-based attacks by either not allowing information to be accessed or, where the information must be made available, by introducing variations or “noise” that prevent it from being used for identification purposes.

For example, if a website queries a user’s browser for the elapsed time, a comparison of that time to the time reported by the server might be useful as a factor in fingerprinting. Because of this, browsers typically introduce a small amount of variability to timers to make them less useful for identifying the user’s system.
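As a rough illustration of the idea, the sketch below coarsens a timestamp to a fixed resolution and then adds a random step of jitter. The helper name `coarsenTimestamp`, the 1 ms default resolution, and the injectable `random` parameter are assumptions made for this example; real browsers use their own engine-specific mechanisms, which may differ considerably.

```typescript
// Hypothetical helper for illustration only; this is not a browser API.
function coarsenTimestamp(
  preciseMs: number,
  resolutionMs: number = 1,
  random: () => number = Math.random,
): number {
  if (resolutionMs <= 0) {
    throw new RangeError("resolutionMs must be positive");
  }
  // Snap the value down to the nearest multiple of the resolution,
  // discarding the fine-grained part that can differ between systems.
  const snapped = Math.floor(preciseMs / resolutionMs) * resolutionMs;
  // Randomly add zero or one resolution step so that repeated readings
  // are harder to average back to the precise value.
  const jitter = random() < 0.5 ? 0 : resolutionMs;
  return snapped + jitter;
}
```

The random source is passed in as a parameter only so that the behavior can be checked deterministically; by default it falls back to `Math.random`.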

[!NOTE] See Fingerprinting on web.dev for additional useful information.

Privacy features provided by browsers

Browser vendors are aware of the need to protect user privacy and the negative effects of tracking, fingerprinting, etc., on user experience. To this end, they have implemented various features that enhance privacy protection and/or mitigate threats. In this section, we look at different categories of privacy protection that browsers apply automatically.

HTTPS by default

Transport Layer Security (TLS) provides security and privacy by encrypting data during transport over the network and is the technology behind the HTTPS protocol. TLS is good for privacy because it stops third parties from being able to intercept transmitted data and use it maliciously, for example for tracking.

All browsers are moving towards requiring HTTPS by default; this is practically the case already because you can’t do much on the web without this protocol.

Related topics are as follows:

Opt-in for “powerful features”

So-called “powerful” web API features that provide access to potentially sensitive data and operations are available only in secure contexts, which basically means HTTPS-only. Not only that, but these web features are gated behind a system of user permissions. Users have to explicitly opt in to features like allowing notifications, accessing geolocation data, making the browser go into fullscreen mode, accessing media streams from webcams, using web payments, etc.
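The two gates described above (secure context first, then user permission) can be sketched as a small decision function. The function `decideFeatureAccess` and its result labels are hypothetical names chosen for this example; only the three permission states mirror the values the Permissions API reports.

```typescript
// The three states reported by the Permissions API (PermissionStatus.state).
type PermissionStateLike = "granted" | "denied" | "prompt";

// Hypothetical result labels used only in this sketch.
type FeatureAccess = "unavailable" | "allowed" | "blocked" | "ask-user";

// Hypothetical helper: decides what a page may do with a powerful feature.
function decideFeatureAccess(
  isSecureContext: boolean,
  permission: PermissionStateLike,
): FeatureAccess {
  // Gate 1: outside a secure context the feature is not available at all.
  if (!isSecureContext) {
    return "unavailable";
  }
  // Gate 2: the user's permission decision.
  if (permission === "granted") {
    return "allowed";
  }
  if (permission === "denied") {
    return "blocked";
  }
  // "prompt": the user has not decided yet, so the browser will ask them.
  return "ask-user";
}
```

In a browser, the inputs could come from `window.isSecureContext` and from the `state` of the object returned by `navigator.permissions.query({ name: "geolocation" })`. Note that not every powerful feature can be queried this way in every browser.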

Anti-tracking technology

Browsers have implemented several anti-tracking features that automatically enhance their users’ privacy protection. Many of these block or limit the ability of third-party sites embedded in <iframe>s to access cookies set on the top-level domain, run tracking scripts, etc.

Privacy considerations for client-side developers

There are several actions web developers can and should take to improve privacy for their users. The below sections discuss the most important ones. Some of the categories are not purely technical tasks as such and will involve collaboration with other team members.

Collect data ethically

Companies collect lots of different data from their users for a variety of different reasons:

When collecting data from your customers, you have an opportunity to behave with integrity, show them that you are trustworthy, and build a great relationship with them, in turn, improving your brand and your chance of success.

The ethics of data collection can be broken down into three simple principles:

[!NOTE] The tips provided below make for a better, more privacy-aware user experience, and many of them are also required by law under regulations such as the GDPR in the EU. You should make sure to find out what regulations apply to you in your locale, and what you need to do to comply with them.

Don’t collect more data than you need

It is tempting to ask for a lot of data from your users because you think it might be useful in the future. However, every bit of extra data you collect adds risk to your users’ privacy and increases the chance that they will abandon the step they are performing (whether it is filling out a survey or signing up for a service).

It is good to anonymize data. You should also consider whether you can get what you need by making your data request less granular. As an example, instead of asking a user their favorite products, you could ask them to select between more general categories.
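To make "less granular" concrete, here is a minimal sketch that stores a coarse age bracket instead of an exact age. The age example, the function name `toAgeBracket`, and the bracket boundaries are all assumptions for illustration; pick whatever granularity your service actually needs.

```typescript
// Hypothetical helper: replaces an exact age with a coarse bracket so that
// the precise value never needs to be stored.
function toAgeBracket(age: number): string {
  if (!Number.isFinite(age) || age < 0) {
    throw new RangeError("age must be a non-negative number");
  }
  // Bracket boundaries are illustrative assumptions.
  if (age < 18) return "under 18";
  if (age < 25) return "18-24";
  if (age < 45) return "25-44";
  if (age < 65) return "45-64";
  return "65+";
}
```

The same pattern applies to other fields, for example keeping only a region instead of a full address.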

The best way to protect user privacy, though, is to minimize the data you collect. Referring to the previous example, you could infer the same data by looking at user purchase history. As another example, users appreciate being able to buy products anonymously. You shouldn’t force them to sign up for an account; if it’s not necessary for the service to operate, it should be their choice.

Communicate clearly how you are going to use the data you collect

Once you have decided what data you are going to collect, you should publish a privacy policy on your site that clearly states:

When providing you with data, your users should be given an opportunity to read your privacy policy and consent to it. They should be able to decide whether they are happy with it and agree to your terms. As indicated above, they should also be able to see what data of theirs you have collected, and delete it if they want to.

When you’ve published your privacy policy, you need to make sure that you comply with it — doing what you say you are going to do is very important in building user trust. You should only collect the data you say you’ll collect, and only use it for the purpose you say you’ll use it for. If someone from your company comes up with a clever new way to use existing data, that still isn’t OK under the terms of your policy if it doesn’t specify that you’ll use it for that purpose. If users consented to the use of their data for a specific purpose and that purpose expands, you may have to consider obtaining new consent.
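One possible way to enforce purpose limitation in code is to record the purposes each user consented to and check them before every use of their data. The `ConsentRecord` shape, the `isUseAllowed` function, and the purpose strings below are hypothetical and only sketch the idea; they are not a statement of what any regulation requires.

```typescript
// Hypothetical record of what a user agreed to when accepting the policy.
interface ConsentRecord {
  userId: string;
  // Purposes the user consented to, for example "order-fulfilment".
  purposes: ReadonlySet<string>;
}

// Returns true only if the user consented to this exact purpose.
// A new purpose is not covered until fresh consent has been recorded.
function isUseAllowed(consent: ConsentRecord, purpose: string): boolean {
  return consent.purposes.has(purpose);
}
```

A check like this makes an expanded purpose fail by default, which prompts the team to consider whether new consent is needed.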

Delete the data once you have finished with it

Earlier on, we mentioned giving users a way to see what data of theirs you have collected, and delete it if they want to. You could possibly do this as part of the same experience they can use to delete their account (their data goes with it), or make them two separate options. Either way, the options should be easy to find.

Allowing the user to choose when significant portions of data get deleted is very empowering, and builds trust, but there may be some bits of data that you will want to handle deletion of yourself. For example, some data might only be used for a few hours or minutes and then deleted, like data that is used during the administration of a user’s session while they are logged in.
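For data you delete yourself, a simple approach is to store an expiry time with each item and purge anything that has expired. The sketch below assumes an in-memory `Map` as the store; the `ExpiringEntry` type and `purgeExpired` function are hypothetical names, and a real service would apply the same idea to its actual session store or database.

```typescript
// Hypothetical wrapper that pairs a stored value with its expiry time.
interface ExpiringEntry<T> {
  value: T;
  // Expiry time in milliseconds since the Unix epoch.
  expiresAt: number;
}

// Removes every entry whose expiry time has passed and returns how many
// entries were removed.
function purgeExpired<T>(
  store: Map<string, ExpiringEntry<T>>,
  now: number = Date.now(),
): number {
  let removed = 0;
  // Deleting the entry currently being visited is safe in Map.forEach.
  store.forEach((entry, key) => {
    if (entry.expiresAt <= now) {
      store.delete(key);
      removed += 1;
    }
  });
  return removed;
}
```

Such a function would typically run on a timer or be called whenever the store is accessed.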

[!NOTE] The Clear-Site-Data HTTP response header is very useful for clearing short-lived user data: it instructs the browser to clear out its cache and/or cookies and/or storage (e.g. Web Storage or IndexedDB data). For example, you might get your server to send it along with a “logged out confirmation” page so that once the user is logged out, their data is safely removed.
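A minimal sketch of building the header value follows. The helper name `clearSiteDataValue` is an assumption for this example, and the type only covers the three directives mentioned in the note; the header defines other directives as well.

```typescript
// Directives mentioned in the note above; the header defines others too.
type ClearSiteDataDirective = "cache" | "cookies" | "storage";

// Hypothetical helper that builds a Clear-Site-Data header value.
function clearSiteDataValue(directives: ClearSiteDataDirective[]): string {
  if (directives.length === 0) {
    throw new Error("at least one directive is required");
  }
  // Each directive is a double-quoted string; several are comma-separated.
  return directives.map((directive) => `"${directive}"`).join(", ");
}
```

On a Node.js server, for instance, the logout handler could call `response.setHeader("Clear-Site-Data", clearSiteDataValue(["cookies", "storage"]))` before sending the confirmation page. Note that the quotes around each directive are part of the header syntax.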

Cut down on tracking

Earlier on we discussed tracking, and some of the unethical purposes it is used for. We shouldn’t have to spell out how such uses can erode user trust; wherever possible, you should only use potential tracking mechanisms like third-party cookies for ethical uses, such as transferring sign-in or other personalization status across sites.

Also recall from earlier that browsers are all starting to block third-party cookies by default, while implementing alternative technologies to achieve common use cases. It is a good idea to prepare for this by limiting the number of tracking activities you rely on and/or implementing the desired information persistence in other ways. See Transitioning from third-party cookies for more information.

Carefully manage third-party resources

Of course, it would be easy to manage privacy if you were only worried about resources you have created (code, cookies, sites, etc.). The real challenge comes from the fact that your site will likely use third-party resources. This can include third-party content embedded in <iframe>s, libraries, frameworks, APIs, externally-hosted resources such as images and videos, etc.

Third-party resources are an essential part of modern web development because they provide a lot of power. However, any third-party resource you allow onto your site potentially has the same permissions as your own resources; it all depends on how it is included on your site:

It is important to audit all of the third-party resources you use on your site. Make sure you know what data they collect, what requests they make and to whom, and what their privacy policies are. Your carefully designed privacy policy is useless if you use a third-party script that violates it.
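One way to start such an audit is to take a list of the request URLs a page makes (for example, exported from your browser's network panel) and count how many go to hosts other than your own. The function `countThirdPartyHosts` below is a hypothetical sketch, and it makes a simplifying assumption: any hostname that differs from the page's hostname is treated as third-party, so your own subdomains would be counted too.

```typescript
// Hypothetical audit helper: counts requests per host that differs from
// the page's own host. Exact hostname comparison is a simplification.
function countThirdPartyHosts(
  pageUrl: string,
  requestUrls: string[],
): Map<string, number> {
  const firstPartyHost = new URL(pageUrl).hostname;
  const counts = new Map<string, number>();
  for (const requestUrl of requestUrls) {
    // Relative URLs are resolved against the page URL.
    const host = new URL(requestUrl, pageUrl).hostname;
    if (host === firstPartyHost) {
      continue;
    }
    counts.set(host, (counts.get(host) ?? 0) + 1);
  }
  return counts;
}
```

The resulting host list is only a starting point: for each host you still need to find out what data it collects and what its privacy policy says.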

[!NOTE] There are various tools out there that can help you build up a picture of what requests a site is making, for example the Request Map Generator.

Once you have audited your third-party resources and understand what they are doing, you should then weigh their drawbacks against the value they bring. If a third-party script is free and really useful but collects quite a lot of user data, you could:

  1. Accept that trade-off, update your privacy policy to include details of it, and hope that it doesn’t impact your users’ trust too much.
  2. Look for an alternative, less data-hungry third-party tool.
  3. Build your own tool.

The following list provides some tips on how to mitigate privacy risks inherent with using third-party resources:

[!NOTE] See Third parties over on web.dev for additional useful information on auditing and more.

Protect user data

You need to make sure that user data is transmitted and stored securely once you’ve collected it. This is more of a security topic, but it is worth mentioning here — a good privacy policy is useless if your security is lax and attackers can steal the data from you.

The below tips offer some guidance on protecting your users’ data:

See also
