Down the Firefox Rabbit Hole

May 6, 2018 - Reading time: 3 minutes

Firefox will be adding some kind of ads to the new tab screen. They call it "suggested content" or something like that, but you can be sure it will really be primarily about ads. The classic open source thing is that if someone includes an antifeature then you can just remove that part of the code, recompile and continue. I had never actually read the source code for Firefox, so this was also an interesting exercise.

While reading the code I soon ran into things related to telemetry. There's a Python script in the build directory which sends telemetry data unencrypted to a static IP address. I don't want anything reporting back to the mothership, so the question became whether I could remove the telemetry code, as an exercise to see what's involved in doing that.

What do I mean by telemetry, you may ask. The word is normally associated with NASA and space probes sending back images from other worlds, but in this context it means monitoring what users are doing with some software and then sending that data back to some central location for analysis.

The telemetry code which I initially found turned out to be just one small corner of a much bigger thing. The main telemetry code is complex and it's difficult to determine where it pushes out exfiltrated data onto the wire. There appears to be a substantial ecosystem around spying on Firefox users, in which arbitrary queries can be performed and bespoke requests for data can be submitted for approval. The level of monitoring by what is known as "telemetry probes" seems really detailed. It's as if the software is a sophisticated instrumentation system for monitoring users with some web browsing features also appended.

If you then start reading the associated mailing lists and bug tracker entries it gets into a new level of creepiness, in which users appear to be talked about as if they were "inventory". One quote:

"It is my understanding that our advertisement clients will rely heavily on our inventory projections, to the extend of budgeting campaigns based on the number we provide. If our projections are wildly wrong, clients will under or over budget, which will corrode industry trust in Mozilla brand"

And then some stuff about experimenting on users:

"Users can only run with a single experiment at a time. We have other experiments pending, and so it really is not practical to ship this as an experiment and operate on large samples of the beta population."

On the mailing list they talk about Firefox as a "data platform", with strange surveys. It's a side of Firefox that I'd never encountered before, because I was only coming at it from a user angle. I'm not entirely sure how to feel about it, but it seems squarely within the creepy corporate surveillance arena. Previously I had known at a high level from their financial statements that Mozilla Corporation sells user behavior data to search engines (that probably means Google primarily), but seeing how the sausage is made is another matter. It's a bit like the difference between knowing at a high level that the NSA is spying on everyone and then later reading the Snowden documents containing the gory details of how they're doing it and being a lot more disturbed.

So if you're a Firefox user, what should you do about this? First of all, don't panic. The derivatives of Firefox, such as the Tor browser, have all this telemetry stuff deactivated. If you are using a vanilla Firefox browser then block telemetry.mozilla.org on your firewall and maybe also add it to /etc/hosts to make sure it doesn't resolve. If you're doing software development on the Firefox code then also block the IP address 52.88.27.118, which is where the build sends its information.
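
As a rough illustration of the /etc/hosts approach, here is a minimal Python sketch which generates the blocking lines and checks whether a hosts file already contains them. The function names are my own, and using 0.0.0.0 as the sink address is an assumption (127.0.0.1 is the other common choice). It only works on text, so it doesn't touch the real /etc/hosts or the network.

```python
def hosts_block_lines(hostnames, sink="0.0.0.0"):
    """Return hosts-file lines pointing each hostname at a sink address.

    The sink default of 0.0.0.0 is an assumption; 127.0.0.1 is also common.
    """
    return [f"{sink} {name}" for name in hostnames]


def is_blocked(hosts_text, hostname, sinks=("0.0.0.0", "127.0.0.1")):
    """Return True if hosts_text maps hostname to one of the sink addresses."""
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if address in sinks and hostname in names:
            return True
    return False


if __name__ == "__main__":
    # Print the line to append to /etc/hosts (appending needs root).
    for entry in hosts_block_lines(["telemetry.mozilla.org"]):
        print(entry)
```

Note that a hosts file only matches exact names, so this does nothing for other hostnames or subdomains, and it can't block a raw IP address such as the build one. That still needs a firewall rule.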


The Stallman Directive

April 4, 2018 - Reading time: 3 minutes

In an episode of Linux Unplugged they talk about Richard Stallman's proposed solutions to the problem of companies spying on people and then using the data in dubious ways. After a lot of meandering the actual discussion is about an hour into the show.

So what's the solution to this? Cambridge Analytica isn't the first company to use data in sketchy ways and it won't be the last. I also don't really agree with Stallman that legislation is the answer, since here in the UK the Data Protection Act has existed for decades and, despite many violations, it's largely ignored.

For example, the Data Protection Act says that data collected about people is supposed to be used by the "data controller" for a specified purpose, not for purposes different from the one for which the data was originally supplied, and also that people should be able to obtain copies of their data without unreasonable delay. When you think of the world of advertising companies and data brokers and so on it's easy to see that these basic rules are being broken routinely. Data supplied for one reason ends up being used for entirely different purposes. Maybe somewhere in the terms of service there are buried descriptions of what happens to personal data, but realistically nobody except lawyers reads those documents and the problem boils down to what constitutes meaningful education and consent.

Things that have been tried and which we know don't work are:

  • Legislation similar to the Data Protection Act. It very rarely or never gets enforced.
  • Simplified terms of service documents with fancy coloured icons. Still nobody reads them. In an era of technology monopolies often users don't have a realistic choice about whether to sign up for a service or not.
  • Naming and shaming companies when they abuse personal data. They just carry on doing the same anyway.
  • Browser plugins which do client side encryption. These have existed for a long time, but since they're not installed by default practically nobody uses them.

In the Linux Unplugged episode FreedomBox is mentioned as a possible solution to the data ownership and privacy problem. I like this idea, but I think there's also another possibility which is non-corporate community management of systems - especially social networks. That is, the kind of federated model which exists already on the Open Web. To some extent the work involved with storing and managing communications data can be collectivised within an affinity group so that each user of the system doesn't have to take on the whole responsibility by themselves.

A couple of years ago it would have been easy to dismiss the federated model as something old-fashioned, perhaps resembling the bulletin board era before the internet, but now there are thousands of Mastodon installs and what appear to be very active communities around them who are not just the previous demographic of hardcore Stallmanites. What exists today is a pretty substantial proof of concept for an exit strategy from the current data dilemmas. It's not that today's fediverse is ultra private - far from it - but it's conceivable that better privacy features could be added.

What I think organisations such as FSF, EFF and ORG need to be doing is getting behind projects like FreedomBox and promoting them and showing people how to install and maintain them. If data is increasingly managed in a non-corporate way and perhaps also at a more municipal level then at least when it comes to devising legislation the pro-privacy side of things will be in a much stronger bargaining position.