Data tracking and clickjacking

Posted by: stak
Posted on: 2010-12-07 13:36:15

Since I'm writing a Firefox plugin for a course on privacy-enhancing technologies that I'm taking right now, I found myself thinking of other Firefox extensions that might be useful from a privacy/security point of view. One that I would really like to see is a plugin that separates browser storage for webpages by the origin of the top-level window. For those who aren't familiar with the guts of browsers and the web, here's how that breaks down:

- "Browser storage" includes things like cookies, history information, etc.

- "Origin" refers to the combination of URI scheme, host, and port information. So if you went to http://google.com/ that would have an origin of {"http", "google.com", "80"} (80 being the default HTTP port). Any page under that domain would have the same origin, but https://google.com:80/ would be under a separate origin (since the scheme changed to "https"). The origin of a page is already used in browsers to prevent communication across domains - so for example if you pull in another page into yours using an iframe, you can't programmatically read or modify that page if it comes from a different origin. The same thing applies to XMLHttpRequest - you can't make XHR calls across origins (unless the other server explicitly opts in via CORS).

- "Top-level window" refers to the "main" page being loaded in the browser or the tab. Subwindows could be things like frames or object tags that point to other documents that are embedded into the main page.
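To make the origin definition above concrete, here is a minimal Python sketch of how a URL could be reduced to its (scheme, host, port) triple. The function names `origin` and `same_origin` and the `DEFAULT_PORTS` table are made up for illustration; real browsers handle more schemes and edge cases than this.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes covered by this sketch (an assumption:
# other schemes simply get a port of None here).
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)

def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)

print(origin("http://google.com/"))
print(same_origin("http://google.com/", "http://google.com/search?q=x"))
print(same_origin("http://google.com/", "https://google.com:80/"))
```

The last comparison is the case from the list above: the host and port match, but the scheme differs, so the two URLs are in different origins.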

So what my suggestion boils down to is stricter separation of pages in different origins. Right now there are two huge issues that I can think of off the top of my head that would be fixed with this.

The first is the ability for websites to track you across the web trivially. Right now, sites like Google and Facebook can easily generate a pretty comprehensive list of websites you visit, simply because they have a presence on all of those websites. Any site that uses Google Analytics (and that's a lot of sites) includes a script from Google's servers. When your browser fetches that script, the request includes the Google cookies stored in your browser. And just like that, Google knows you (as in, the user logged in to Google in your browser) visited that page. The same thing is true of the Facebook "like" button you see plastered all over the web. If you're logged in to Facebook in that browser, Facebook knows you were there. In both cases, even if you aren't logged in, they can record your IP address and link up the information when you do log in later. For anybody concerned about privacy, this is a huge gaping hole that has been known for a while and that nobody seems to want to do anything about.
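The tracking mechanism described above can be modelled in a few lines. This is a toy simulation, not how any real analytics service is implemented: the `Tracker` and `Browser` classes, the `tracker.example` host, and the page URLs are all hypothetical.

```python
from collections import defaultdict

class Tracker:
    """Hypothetical third-party server whose script is embedded on many sites."""

    def __init__(self):
        # cookie value -> list of pages that triggered a request
        self.visits = defaultdict(list)

    def handle_request(self, cookie, referer):
        # Every fetch of the embedded script carries the user's cookie
        # plus the page that embedded it.
        if cookie is not None:
            self.visits[cookie].append(referer)

class Browser:
    """Toy browser with a single shared cookie jar keyed only by site."""

    def __init__(self):
        self.cookies = {}  # site -> cookie value

    def visit(self, page, tracker, tracker_site):
        # Loading the page fetches the tracker's script, and the browser
        # attaches whatever cookie it holds for the tracker's site.
        tracker.handle_request(self.cookies.get(tracker_site), referer=page)

tracker = Tracker()
browser = Browser()
browser.cookies["tracker.example"] = "user-123"  # set when the user logged in
browser.visit("http://news.example/story", tracker, "tracker.example")
browser.visit("http://shop.example/cart", tracker, "tracker.example")
print(tracker.visits["user-123"])
```

After two unrelated page loads, the tracker holds a browsing history keyed by the user's cookie, which is the whole problem.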

The second problem is that of clickjacking. This is a vulnerability that hit the public eye in 2008 but that has been around for much longer. The gist of it is that a malicious webpage can load another site in an iframe, hide it, and move it around following your mouse cursor. That way, when you click on something on the malicious page, you're really clicking on the other site via the iframe. This way, even if the malicious site can't manipulate the other site's contents programmatically, they can still get you to do things that you don't want to do (e.g. click on a PayPal confirmation link to send them money). Again, this relies on the fact that you're logged in to the other site, so that when you do the click the other site accepts that as a valid action coming from you, the authenticated user. If you weren't logged in to the other site then the possible repercussions are severely reduced.

And that's where the origin-based separation of cookies comes in. In both of these scenarios, there's one site that's the top-level window in your browser, and another site that you're logged in to that's being accessed from that top-level window. The cookies being sent to the logged-in site make both of these vulnerabilities possible. If the cookies don't get sent, then neither of these attacks works.
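As a rough sketch of the idea, a cookie store could be keyed by the pair (top-level origin, cookie origin) instead of by the cookie origin alone. The class `PartitionedCookieJar`, its method names, and the example origins below are hypothetical, and origins are written as plain strings to keep the example short.

```python
class PartitionedCookieJar:
    """Toy cookie store partitioned by the origin of the top-level window."""

    def __init__(self):
        # (top_level_origin, cookie_origin) -> {cookie name: value}
        self._jars = {}

    def set_cookie(self, top_level, cookie_origin, name, value):
        self._jars.setdefault((top_level, cookie_origin), {})[name] = value

    def cookies_for_request(self, top_level, request_origin):
        # Only cookies stored under the same top-level origin are sent.
        return dict(self._jars.get((top_level, request_origin), {}))

jar = PartitionedCookieJar()

# The user logs in to the bank directly, so the bank is the top-level window.
jar.set_cookie("https://bank.example", "https://bank.example", "session", "abc")

# Visiting the bank directly still sends the session cookie.
print(jar.cookies_for_request("https://bank.example", "https://bank.example"))

# A malicious page framing the bank gets a separate, empty partition.
print(jar.cookies_for_request("http://evil.example", "https://bank.example"))
```

With this layout the framed or embedded site sees an unauthenticated request, so there is no logged-in session for the click to act on and no cookie to tie the page view to a user.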

Now, the solution I'm proposing here is somewhat similar to #6 proposed by Collin Jackson here, except that his solution required a new attribute to be added to the Set-Cookie header, and implemented by websites. The one I'm proposing can be done by a purely browser-based change. Unfortunately, I don't think it's doable with just a browser plugin for most browsers; it requires a deeper change than that.

Of course, there are a couple of problems with this solution. One is that this might affect valid workflows currently in place on the web. If there is a website at origin A that redirects you to a website at origin B to log in, and then redirects you back to origin A, which now includes content from origin B, that won't work anymore. I can't really think of any websites I use on a regular basis that do this, though. The best way to find out is to just implement it and see what breaks. (Although I think site architectures like this are bad in general, and shouldn't really be that much work to fix).

The second problem is that the solution doesn't really cover all the ways websites can track you. There are a lot of other things, such as your IP address, User-Agent header, Flash cookies, etc., that could be used to identify you on the web and track which websites you visit. But all of these are solvable as well - you could use Tor for anonymous IP routing, make minor modifications to your UA header on each origin, enforce the same origin-based separation for Flash cookies, and so on.

Anyway, that's my two cents for the day. If I have more time later I'll try to implement this, but if anybody else (particularly current browser developers) wants to jump on it, go right ahead.

Posted by GregT at 2010-12-07 20:11:20

IE9 (which actually works with stakface.com! hooray!) has implemented something like this:
http://blogs.msdn.com/b/ie/archive/2010/12/07/ie9-and-privacy-introducing-tracking-protection-v8.aspx

Posted by stak at 2010-12-07 21:30:14

Oooh neat! (Both for the tracking protection and for working with this site :)).

Two things though: one is that it's an opt-in approach with a blacklist of sites. This means that every time there's a new site out there that starts gaining mass and can be potentially dangerous I have to add it to the list, which could get annoying. That's not too bad, but it's compounded by the fact that I can't just change it to blacklist every site by default, because a lot of websites do need to request content across domains for legitimate reasons.

For example, if I want to create a TPL for Google, I then have to whitelist all sorts of things like the Google-hosted JS libraries that a lot of people are (and should be) using. From the description you linked to, it sounds like Google is the one who is supposed to be creating the TPL and so they would automatically whitelist those libraries. In theory that sounds nice but I think that model (i.e. relying on the website itself to create a TPL) is inherently flawed, since it's not in their best interest to create the TPL.

Posted by GregT at 2010-12-08 14:41:41

It would be nice if I could provide a link to a TPL that was maintained by a person or organization with the same privacy values as me. Then the TPL is magically curated by those folks while I can continue my lazy ways.

Posted by stak at 2010-12-08 17:01:52

If only :)

Posted by kats at 2011-01-26 13:15:51

Via Slashdot: Abusing HTTP Status Codes to Expose Private Information. This is the inverse of the "giant company tracking your every move" - here it's the small random website that's detecting if you're logged in to the giant company website. Regardless, it's another attack that would be fixed with my proposed solution.

Posted by stak at 2012-12-17 13:07:04

Apparently some people at Mozilla did explore this but it didn't get very far. See bug 565965.
