Most websites you run across while browsing the Web include content that tracks your online activity. This practice serves several goals. Advertising is the most common: tracking companies build digital profiles of users’ interests in order to serve targeted ads. Analytics and security services also rely on tracking for their own purposes. Finally, malicious actors such as phishers and scammers exploit web tracking to increase the success rate of their attacks.

Conventional tracking tools (HTTP cookies, local storage, IndexedDB, etc.) allow trackers to store unique identifiers in browsers and use them to re-identify users as they browse. As privacy concerns have grown, the most tech-savvy users have started adopting tools that prevent trackers from installing such identifiers. Trackers have evolved in turn to bypass these blocks, developing a technique that precisely identifies a user among a multitude of others without relying on cookies or any other kind of client-side storage: browser fingerprinting.

In this article we describe this pervasive tracking technique, explain how it invalidates most existing anti-tracking systems and present our solution to the problem.

What is browser fingerprinting?

Like conventional tracking techniques, browser fingerprinting allows trackers to assign identifiers to users’ browsers and follow them as they surf the web. The difference from older tracking techniques lies in how these identifiers are generated and handled.

While conventional tracking techniques randomly generate identifiers on the server side and store them in cookies or browser storage, fingerprinting makes the browser execute pieces of JavaScript code that run silently in the background and derive identifiers directly on the tracked device. These identifiers are generated from properties such as the machine’s operating system, language, time zone, screen size and resolution, installed fonts and plugins, graphical and audio stack characteristics, battery status and capacity, browser type, browser configuration, browser version, number of CPU cores, touch-based input support and much more.

By collecting this impressive amount of information, trackers can produce a signature of the browser and use it to recognize that browser among millions of others.
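
To make the idea concrete, here is a minimal sketch of how a handful of collected properties could be folded into a single identifier. It is not taken from any real tracker: the attribute names, the `fingerprintOf` helper and the choice of a djb2-style hash are all assumptions made for illustration, and a real script would read the values from browser APIs rather than from hard-coded objects.

```typescript
// Hypothetical attribute set. A real script would read these from browser APIs.
type Attributes = Record<string, string | number | boolean>;

// djb2-style 32-bit string hash, chosen only because it is short and
// dependency-free. Real trackers may use any hash function.
function djb2(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    // hash * 33 + character code, truncated to an unsigned 32-bit value
    hash = ((hash << 5) + hash + text.charCodeAt(i)) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Serialize the attributes in a stable key order, so that the same set of
// values always yields the same string, then hash that string.
function fingerprintOf(attrs: Attributes): string {
  const canonical = Object.keys(attrs)
    .sort()
    .map((key) => `${key}=${String(attrs[key])}`)
    .join("|");
  return djb2(canonical);
}

// Same device on two visits: the attributes are collected in a different
// order, but the values are identical.
const firstVisit: Attributes = {
  timezone: "Europe/Rome",
  language: "en-US",
  cpuCores: 8,
  screenWidth: 1920,
};
const laterVisit: Attributes = {
  screenWidth: 1920,
  cpuCores: 8,
  language: "en-US",
  timezone: "Europe/Rome",
};

// A different device that differs in a single property.
const otherDevice: Attributes = {
  timezone: "Europe/Rome",
  language: "en-US",
  cpuCores: 4,
  screenWidth: 1920,
};

console.log(fingerprintOf(firstVisit) === fingerprintOf(laterVisit)); // same identifier
console.log(fingerprintOf(firstVisit) === fingerprintOf(otherDevice)); // different identifier
```

Nothing is stored on the device in this sketch: the identifier is recomputed from the same inputs on every visit, which is what distinguishes fingerprinting from cookie-based tracking.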

Visual examples and demos

EFF’s Panopticlick and AmIUnique provide online tools that show how easily a visitor’s browser can be fingerprinted and how unique its fingerprint is. Testing both tools with vanilla versions of Mozilla Firefox 82.0.3 and Google Chrome 86.0.4240.183, we found that each browser readily turns out to be unique among the respective user bases.

Panopticlick

Result of Panopticlick demo.

We provide a glimpse of how browser fingerprinting works by presenting one of the most advanced techniques: audio fingerprinting. Audio fingerprinting is a recent technique that exploits the Web Audio API to add entropy to the final device fingerprint, which is typically created by combining multiple techniques. It builds on the properties of the device’s audio stack. In most cases, the JavaScript code implementing audio fingerprinting creates a periodic waveform whose properties vary from device to device, depending on the audio signal processing performed by the device’s specific hardware and software. The Princeton CITP’s Web Transparency and Accountability Project created a website that tests how your audio fingerprint differs from a pre-generated sample. The following image depicts the result we obtained from the demo.

Audio fingerprinting

Example of audio fingerprinting waveforms (courtesy of The Princeton CITP’s Web Transparency and Accountability Project)

As shown, local differences in the waveform’s channel data are quite significant and easy to notice in this case. However, even tiny differences between waveforms are enough to build unique identifiers.
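
The sketch below illustrates why tiny differences suffice. It is a simplified stand-in, not the code used by the demo above: in a browser the samples would come from rendering a waveform through the Web Audio API, whereas here a hypothetical `syntheticWaveform` helper generates them, and the second "device" is simulated by nudging a single sample. The `audioSignature` function, its window bounds and the number of decimals kept are assumptions chosen for illustration.

```typescript
// Reduce rendered audio samples to a compact signature by summing the
// absolute values inside a fixed window. The window bounds and the number
// of decimals kept are illustrative choices.
function audioSignature(samples: Float32Array, start: number, end: number): string {
  let sum = 0;
  const stop = Math.min(end, samples.length);
  for (let i = start; i < stop; i++) {
    sum += Math.abs(samples[i]);
  }
  return sum.toFixed(10);
}

// Synthetic stand-in for the waveform a device would render.
function syntheticWaveform(length: number): Float32Array {
  const out = new Float32Array(length);
  for (let i = 0; i < length; i++) {
    out[i] = 0.5 * Math.sin(i * 0.05);
  }
  return out;
}

const deviceA = syntheticWaveform(1000);
const deviceB = syntheticWaveform(1000);
deviceB[10] += 1e-6; // simulate a tiny device-specific deviation in one sample

const signatureA = audioSignature(deviceA, 0, 500);
const signatureB = audioSignature(deviceB, 0, 500);

console.log(signatureA === audioSignature(deviceA, 0, 500)); // stable across runs
console.log(signatureA === signatureB); // the two devices are distinguishable
```

A deviation of one part in a million in a single sample is invisible in a plot, yet it changes the resulting signature, which a tracker could then combine with other attributes.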

Why is fingerprinting more pervasive than legacy tracking?

First, since a device produces the same fingerprint over time, trackers no longer need to store identifiers in users’ browsers: they can simply re-execute the JavaScript code and regenerate them. Second, this prevents users from inspecting and controlling the identifiers that trackers generate: deleting cookies or emptying local storage has no effect on a fingerprint.

Fingerprinting is silent by design and prevents users from controlling their exposure to tracking. This is why it represents a real threat to privacy and security.

What can users do?

A draconian way to block browser fingerprinting is to disable JavaScript altogether. However, the modern web relies heavily on JavaScript to provide a huge number of “good” functionalities and services, which improve the quality of navigation. For most users, therefore, this is not a viable solution.

A common alternative is to “pollute” fingerprints by randomizing, tweaking or faking the responses of the JavaScript APIs that expose device and browser information. This approach has serious shortcomings too. First, most websites use the affected APIs for functional, benign purposes, so returning values that do not correspond to the actual characteristics of the device or browser could break the page. Second, it may be counterproductive: polluted information can end up being even more identifying. Indeed, trackers are well aware of the browser extensions and other utilities that adopt this approach, and actively implement code that looks for detectable modifications and inconsistencies among the returned values in order to check whether the browser is lying.
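
As an illustration of the second point, the following sketch shows the kind of cross-check a tracker might run on reported values. The field names mirror the standard `navigator` properties `userAgent`, `platform`, `language` and `languages`, but the `ReportedValues` shape, the `findInconsistencies` helper and the two rules are assumptions made for this example, not a description of any specific tracker.

```typescript
// Values a script could read from the browser. The shape is hypothetical,
// the field names mirror standard navigator properties.
interface ReportedValues {
  userAgent: string;
  platform: string;
  language: string;
  languages: string[];
}

// Return a list of contradictions among the reported values.
// The two rules below are illustrative examples only.
function findInconsistencies(values: ReportedValues): string[] {
  const issues: string[] = [];

  // Rule 1: userAgent and platform should agree on the operating system.
  const uaSaysWindows = values.userAgent.indexOf("Windows") !== -1;
  const platformSaysWindows = values.platform.indexOf("Win") === 0;
  if (uaSaysWindows !== platformSaysWindows) {
    issues.push("userAgent and platform disagree about the operating system");
  }

  // Rule 2: language is expected to equal the first entry of languages.
  if (values.languages.length > 0 && values.languages[0] !== values.language) {
    issues.push("language does not match the first entry of languages");
  }

  return issues;
}

const honestBrowser: ReportedValues = {
  userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:82.0) Gecko/20100101 Firefox/82.0",
  platform: "Win32",
  language: "en-US",
  languages: ["en-US", "en"],
};

// Same browser with an extension that fakes only the platform value.
const spoofedBrowser: ReportedValues = {
  userAgent: honestBrowser.userAgent,
  platform: "MacIntel",
  language: "en-US",
  languages: ["en-US", "en"],
};

console.log(findInconsistencies(honestBrowser));
console.log(findInconsistencies(spoofedBrowser));
```

A browser that fails such checks stands out from the crowd, so the presence of the spoofing tool itself becomes one more identifying attribute.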

At Ermes we solved this problem. We developed a methodology that combines static and dynamic code analysis with machine learning to automatically detect JavaScript code performing browser fingerprinting, with over 94% accuracy. We also measured how widespread browser fingerprinting is in the wild and how trackers use it to produce browser identifiers. Our work has been accepted for presentation at one of the most prominent symposiums on privacy and security, the 21st Privacy Enhancing Technologies Symposium. You can find it here. The paper describes part of the technology we use in our everyday battle against fingerprinting and web threats in general.

This technology has been integrated into Ermes For Enterprise and Ermes for SMEs. Get our product to protect your business against the fingerprinters of the web!