User data in the cloud: Lessons from the Sony debacle

Two weeks ago our friends at Sony managed to get the personal information of 70 million users stolen from them. I got one of the notification emails a couple of days ago myself (I must have signed up for the PlayStation Network when I installed the PS3 we bought to do our Cell VM work back at UCI). In this instance, Sony is a shining example of how NOT to handle user data.

What's user data?

User data is any piece of identifiable data about a user. It can be all sorts of obvious stuff like your name, address, birth date, and passwords (all of which Sony managed to lose), but also less obvious data such as usage history, communications, and whatnot. Whether Sony lost the latter is not clear. Some of the information Sony lost clearly should never have been stored in the first place. I understand why Sony asked me for my birth date when I signed up for PSN. Some jurisdictions want you to be a certain age before you can engage in the virtual mayhem and violence that is modern video gaming. But why the hell did Sony store this information in a database? Why not just flag my account as “remembers the first techno song being played on the radio, definitely old enough”?
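
To make this concrete, here is a minimal sketch of what I have in mind, with hypothetical names throughout (createAccount, db.insert): the birth date is checked once at signup, and only the derived flag is ever stored.

    // Hypothetical signup handler: the birth date is used once and
    // immediately discarded; only the derived flag goes into the database.
    function createAccount(db, handle, passwordHash, birthDate) {
      const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;
      const age = (Date.now() - birthDate.getTime()) / MS_PER_YEAR;
      db.insert({
        handle: handle,
        passwordHash: passwordHash,
        ageVerified: age >= 18,  // the only age-related fact we keep
      });
      // birthDate goes out of scope here and never touches the database
    }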

Risks

Storing data is always risky. If you store any data in the cloud, eventually someone will break into it. Investing a lot of money (expensive equipment, expensive practices, expensive staff) helps delay that day of reckoning, but only within limits. There are a lot of financial incentives to steal this kind of data. Personal information of 70 million people is a fantastic starting point for all sorts of phishing attacks. And even if only one out of 10,000 people getting that Nigerian email falls for it, that’s still plenty of people to take advantage of. Sony took a risk storing user data. Unfortunately, it was not a well-calculated risk. They could have easily reduced the fallout from a breach by storing less information, e.g. by not storing the birth date, or maybe even by not storing personal information at all! Shocking proposition, I know.

Not knowing is bliss

Why does Sony need the names of its PSN subscribers in the PSN user database to begin with? Let people choose a handle and a password. If you want to personalize the experience, let users choose a display name. Dear PSN, you may call me Tracemonkey now. You really don’t need to know my real name and address. As for payment information, I think it’s OK if PSN asks me once a year to re-enter my credit card info, which is then briefly processed but never stored. Had they followed this simple principle of touching (and storing) as little user data as possible, they would have saved themselves a lot of legal trouble and liability.

Browsers

I am not ranting about this topic out of thin air. Web browsers handle a lot of personal information, and it’s tempting for browser vendors to get in on the whole social networks / online transactions / online identity game. A number of people at Mozilla don’t seem too comfortable with hosting any user data on our infrastructure, ever. It’s really scary and risky, after all (just ask Sony). I think that’s wrong. We absolutely should get into the key areas of social and identity. Why? Because the state of the art is crappy, and we can do better.

Microsoft Passport anyone?

Web identity is a total mess. I have at least 30 accounts in various places with different account names and passwords (well, at least I try). Various organizations and services have tried to establish a single login. Microsoft Passport was one of the earlier ones. I am really glad that didn’t work out. Can you imagine the evil empire owning all your personal data and online identity? Microsoft has a lot of incentives to use and abuse such a powerful position, and it certainly does. I still get emails from Microsoft about my Passport account on a regular basis, almost a decade later. It’s usually an invitation to try this new Microsoft feature Y, or maybe to chat with my Passport account, or … well, whatever. Microsoft sits on a lot of data, and it’s tempting to monetize it. And of course it’s not just them. Everyone else is just as bad. Ever noticed how Google and Facebook customize ads for you based on the data they have about you? Creepy. This is why I think Mozilla can do a lot better. We don’t have any hidden agenda. We don’t have any extra services we want to sell you. We don’t have to monetize the data we store to turn a profit and make shareholders happy. We simply don’t have any shareholders. We are a company owned by a non-profit foundation that wants to make the web a better place. This puts us in a much better position to do what’s best for our users, instead of what’s best for our quarterly statement (we don’t publish any of those, in case you didn’t notice).

Playing it safe

We are currently in the process of figuring out how exactly Mozilla should handle user data. I have exactly zero authority when it comes to these kinds of decisions, but Mozilla is a pretty open and democratic place and we tend to discuss this stuff pretty openly, giving anyone a voice who wants to speak up. I think it’s imperative that we follow a couple of guiding principles as we explore ways to better serve our users using services such as identity or social:

  1. Always keep the users’ best interest in mind (and only the users’ best interests). I don’t care if we can ship a feature faster or cheaper if we store more user data (or maybe store it unencrypted instead of encrypted). Our new Sync service is a great example of this. It’s a total pain in the butt to encrypt the browser history on the client before it is uploaded to our services, but it’s the right thing to do. It means that in the case of Sync we can never see any of your browser history, even if we tried, and your data is safe by default (see the sketch after this list).
  2. Always store as little data as possible in the cloud. If there is a way to implement a feature completely in the client without us ever having to see user data, that’s always the right approach, even if it’s harder. This is exactly the issue we are facing with our new F1 social browsing feature. It allows you to share websites on Facebook/Twitter/etc. as you visit them. It’s a really cool feature; I use it all the time. Unfortunately, the protocol Facebook/Twitter/etc. offer to authenticate and access their APIs (OAuth) is totally broken, and conceptually doesn’t really work for client applications. OAuth requires the client (Firefox/F1) and Facebook/Twitter/etc. to negotiate a shared secret (called the consumer key). With a pure client solution this secret can never be kept secret (someone could peek into Firefox/F1 and extract the key). It seems Facebook has blacklisted consumer keys before because people checked them into open-source repositories. The only alternative to this is to put the key behind a service Mozilla runs and then let Firefox/F1 post via that service, but that means we would be able to see (in theory, not intentionally of course) all the Facebook/Twitter/etc. status updates of millions of people every day. That’s wrong. As tempting and quick as it would be to set up a Mozilla service that keeps the key hidden and posts for users, we should never put ourselves in a position where we handle user data without an overwhelming need for it. In this particular instance we should simply negotiate with Facebook/Twitter/etc. to not enforce the shared secret rule (Twitter already doesn’t, it seems, since there are so many Twitter client apps out there), and maybe in parallel we should work on better protocols than OAuth.
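
To illustrate the first principle, here is a minimal sketch of encrypting data on the client before upload, using the standard Web Crypto API. This is not Sync’s actual protocol; the key derivation is omitted and the upload endpoint is made up.

    // Sketch only: encrypt history entries locally with a key that never
    // leaves the client, then upload the opaque blob. The endpoint URL
    // is hypothetical.
    async function uploadHistory(entries, key) {  // key: a locally derived AES CryptoKey
      const iv = crypto.getRandomValues(new Uint8Array(12));
      const plaintext = new TextEncoder().encode(JSON.stringify(entries));
      const ciphertext = await crypto.subtle.encrypt(
        { name: "AES-GCM", iv: iv }, key, plaintext);
      // The server only ever sees iv + ciphertext: meaningless bits.
      await fetch("https://sync.example.com/upload", {
        method: "POST",
        body: new Blob([iv, new Uint8Array(ciphertext)]),
      });
    }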

Going fast

As we were discussing these various architectural aspects of how to handle user data (or how not to handle it) over the last few weeks, some people were tempted to go the easy route and store a lot more user data (in particular, in the clear) than necessary because it might get us to market faster. I think this is wrong per the above two principles, but it’s also wrong because it will NOT get us to market faster. Dealing with user data from an infrastructure perspective is a total pain. To handle or even store things like Facebook/Twitter/etc. account authentication tokens or user contacts, we have to build out a serious security infrastructure. We need to hire expensive, highly trained personnel and we have to seriously tighten our security practices. That doesn’t mean we are unsafe right now. It just means our current practices match our current threat scenarios. For example, our source code repository access controls are administered by external IT administrators who don’t even work for Mozilla; they are simply Mozilla project volunteers. Considering the limited risks, this is acceptable. When it comes to storing user data, entirely different standards will be needed. And getting all that sorted out and implemented will require a lot of audits, careful planning … and a lot of time. So if you want to go fast, go zero user data. Or encrypt the user data on the client so all we get to see are blobs of meaningless zeros and ones. That’s how Sync works, and we got it up and running within a couple of months. That’s how you go fast.

What’s next?

We are currently discussing what our process for storing user data will be. Expect the people with actual authority to make these decisions to start talking about them publicly in a few weeks. I already know that the result of our internal deliberations will be a policy that focuses on what’s best for our users, and that minimizes risks for them (and in the end, for us). And expect a safe and secure implementation of F1 to show up in your browser really soon. You can already try out the prototype now. It really rocks.

This blog post represents my personal opinion, not the official position of Mozilla.

Compartments

Heap

We have implemented a major change to the way Firefox manages JavaScript objects. JavaScript objects include script-instantiated objects such as Arrays or Date objects, but also include JavaScript representations of Document Object Model (DOM) elements, such as input fields or DIV elements. In the past, Firefox held all JavaScript objects in a single JavaScript heap. This heap is occasionally garbage collected, which means the browser walks the entire object graph in the heap and determines which objects are still reachable and which are not. Unreachable objects are de-allocated and space is reclaimed.

Firefox 3.6 single heap model.
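
For illustration, here is a toy version of such a mark-and-sweep cycle written in JavaScript. The real collector is C++ code inside the engine, but the model is the same: with a single heap, the roots of every window force a walk of the entire object graph.

    // Toy mark-and-sweep over a single shared heap. Every object is
    // assumed to carry a `references` array pointing at other objects.
    function collect(heap, roots) {
      const reachable = new Set();
      const stack = roots.slice();
      while (stack.length > 0) {            // mark phase: walk the graph
        const obj = stack.pop();
        if (reachable.has(obj)) continue;
        reachable.add(obj);
        stack.push(...obj.references);
      }
      // sweep phase: anything unmarked is garbage and gets dropped
      return heap.filter(obj => reachable.has(obj));
    }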

Having all JavaScript objects in the browser congregate in a single heap is suboptimal for a number of reasons. If a user has multiple windows (or tabs) open, and one of these windows (or tabs) created a lot of objects, it is likely that many of these objects are no longer reachable (garbage). When the browser detects such a state, it will initiate a garbage collection. Unfortunately, since objects from different windows (or tabs) are intermixed in the heap, the browser has to walk the entire heap. If a number of idle windows are open, this can be quite wasteful: those windows haven’t really created any garbage, yet whenever a window with heavy activity triggers a garbage collection, much of the collection time is spent walking unrelated parts of the global object graph.

In Firefox this problem is even more pronounced than in other browsers, because our UI code (also called chrome code, not to be confused with Google Chrome) is implemented in JavaScript, and there are a lot of chrome (UI) objects alive at any given moment. These UI objects tend to stick around and every time a web content window causes a garbage collection, Firefox spends a lot of time figuring out whether chrome objects are still alive instead of being able to focus on the active web content window.

Firefox 4 compartmentalized JavaScript heaps.

Compartments

For Firefox 4 we changed the way JavaScript objects are managed. Our JavaScript engine SpiderMonkey (sometimes also called TraceMonkey and JägerMonkey, which are SpiderMonkey’s trace-compilation and baseline just-in-time compilers) now supports multiple JavaScript heaps, which we also call compartments. All objects that belong to a certain origin (such as “http://mail.google.com/” or “http://www.bank.com/”) are placed into a separate compartment. This has a couple of very important implications.

  • All objects created by a website reside within the same compartment and hence are located in the same memory region. This improves cache utilization by reducing false sharing of cache lines. False sharing occurs when we operate on an object and have to read an entire cache line of data into the CPU cache: if the neighboring bytes on that line belong to unrelated objects, most of the line is wasted. In the old model JavaScript objects could be co-located with arbitrary other JavaScript objects from other origins. Such cross origin objects are used together very infrequently, which reduces the number of cache hits we get. In the new model most objects touched by a website are tightly packed next to each other in memory, with no cross origin objects in between.
  • JavaScript objects (including JavaScript functions, which are objects as well) are only allowed to touch objects in the same compartment. This invariant is very useful for security purposes. The JavaScript engine enforces this requirement at a very low level. It means that a “google.com” object can never accidentally leak into an untrusted website such as “evil.com”. Only a special object type can cross compartment boundaries. We call these objects cross compartment wrappers. We track the creation of these cross compartment wrappers, and thus the JavaScript engine knows at all times which objects in a compartment are kept alive by outside references (through cross compartment wrappers). This allows us to garbage collect individual compartments, in addition to doing a global collection. We simply assume that all objects referenced from outside the compartment are live, and then walk the object graph inside the compartment. Objects that are found to be disconnected from the graph are discarded. With this new per-compartment garbage collection we no longer have to walk heap areas unrelated to the window (or tab) that triggered the collection (see the sketch after this list).
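
To sketch how per-compartment collection can work (this is a toy model, not SpiderMonkey’s actual code): targets of incoming cross compartment wrappers are conservatively treated as extra roots, and the walk never leaves the compartment’s own heap.

    // Toy per-compartment collection, extending the collector sketched
    // earlier. Incoming wrapper targets count as roots; outgoing edges
    // that leave the compartment are simply not followed.
    function collectCompartment(compartment) {
      const roots = compartment.localRoots.concat(
        compartment.incomingWrappers.map(w => w.target));
      const reachable = new Set();
      const stack = roots.slice();
      while (stack.length > 0) {
        const obj = stack.pop();
        if (reachable.has(obj) || obj.compartment !== compartment) continue;
        reachable.add(obj);                 // mark only within this heap
        stack.push(...obj.references);
      }
      compartment.heap = compartment.heap.filter(o => reachable.has(o));
    }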

Wrappers

Wrappers are not a new concept in Firefox, or in browsers in general. In the past we have used them to regulate how windows (or tabs) pass objects to each other. When another window (or tab or iframe) tried to touch an object that belongs to a different window, we handed it a wrapper object instead. That wrapper object dynamically checks at access time whether the accessing window (also called the subject) is permitted to access the target object. If one Google Mail window is trying to access another Google Mail window, the access is permitted, because these two windows (or tabs or iframes) are same origin and hence it’s safe to permit this access. If an untrusted website obtains a reference to a Google Mail DOM element, we hand it the same wrapper, and if it ever tries to access the Google Mail DOM element, the wrapper will deny the property access at access time because the untrusted website “evil.com” is cross origin with “google.com”.

Firefox 3.6 shared wrappers.
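
In modern JavaScript terms, such a dynamic wrapper behaves roughly like the following Proxy sketch, where currentSubjectOrigin() is a hypothetical stand-in for the browser’s lookup of who is performing the access:

    // Sketch of the old dynamic wrapper: the origin check is repeated on
    // every single property access, no matter who is asking.
    function makeSecurityWrapper(target, targetOrigin) {
      return new Proxy(target, {
        get(obj, prop) {
          if (currentSubjectOrigin() !== targetOrigin) {  // hypothetical helper
            throw new Error("Permission denied: cross-origin access");
          }
          return obj[prop];
        },
      });
    }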

A disadvantage of the Firefox 3.6 wrapper approach (which is similar to the way other browsers utilize wrappers) was that these wrappers had to be injected manually at the right places in the C++ browser implementation, and each wrapper had to do a dynamic security check at access time. With compartments we can do a lot better:

  • Since all objects belonging to the same origin are within the same compartment, and no object from a different origin is in that compartment, we can let all objects within a compartment touch each other without a wrapper in between. Keep in mind that this doesn’t just apply to windows but also to iframes. A single Google Mail session often uses dozens of iframes that all heavily exchange objects with each other. In the past we had to inject wrappers in between that kept performing dynamic security checks. This is no longer necessary, and there is an observable speedup when using iframe-heavy web applications such as Google Mail.
  • Since all cross origin objects are in a different compartment, any cross origin access that needs to perform a security check can only happen through a cross compartment wrapper. Such a cross compartment wrapper always lives in a source compartment, and accesses a single destination object. When we create a cross compartment wrapper, we consult the wrapper factory to see what kind of security policy should be applied. When “evil.com” obtains a reference to a “google.com” object, for example, we have to create a wrapper to that object in the “evil.com” compartment. When that wrapper is created, the wrapper factory will tell us to apply a stringent cross origin security policy, which makes it impossible for “evil.com” to glean information from the “google.com” window. In contrast to our old wrappers, this security policy is static. Since only “evil.com” objects ever see this wrapper, and it only points to one single DOM element in the destination compartment, the policy doesn’t have to be re-checked at access time. Instead, every time “evil.com” attempts to read information from the DOM element, the access is denied without even comparing the two origins (see the sketch below the figure).

Firefox 4 cross compartment wrapper.
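
The contrast with the old wrappers can be sketched the same way: the wrapper factory’s decision is baked in when the wrapper is created, so the per-access check needs no origin comparison. Again a Proxy-based illustration, not the actual C++ implementation:

    // Sketch of a cross compartment wrapper: the policy is decided once,
    // at creation time, and every later access just applies the verdict.
    function makeCrossCompartmentWrapper(target, sourceOrigin, targetOrigin) {
      const allowed = sourceOrigin === targetOrigin;  // consulted once
      return new Proxy(target, {
        get(obj, prop) {
          if (!allowed) throw new Error("Permission denied");  // no origin comparison here
          return obj[prop];
        },
      });
    }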

Brain Transplants

A particularly interesting oddity of the JavaScript DOM representation is the existence of two objects for each DOM window (or tab or iframe): the inner window and the outer window. This split was implemented by web browsers a few years ago to securely deal with windows being navigated to a new URL. When such a navigation occurs, the inner window object inside the outer window is replaced with a new object, whereas the actual reference to the window (which is the outer window) remains unchanged. If such a navigation takes the window to a new origin, we allocate the inner window in the appropriate new compartment. This, of course, now creates the problem that the outer window can no longer directly point to the new inner window, because it is in a different compartment.

We solve this problem through brain transplants. Whenever an outer window navigates to a new origin, we copy it into the new destination compartment. The object in the old compartment is transformed into a cross compartment wrapper that points to the newly created object in the destination compartment. So the term brain transplant is very appropriate here: we are essentially transplanting the guts of the outer window object into a new object hull in the same compartment we allocated the inner window in.
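
In toy form, with plain objects standing in for engine internals, a transplant looks roughly like this:

    // Toy transplant: move the outer window's guts into the destination
    // compartment, then gut the old object in place so existing references
    // to it now behave like a cross compartment wrapper to the copy.
    function transplant(outerWindow, destCompartment) {
      const copy = Object.assign({}, outerWindow, { compartment: destCompartment });
      destCompartment.heap.push(copy);
      for (const key of Object.keys(outerWindow)) delete outerWindow[key];
      outerWindow.isCrossCompartmentWrapper = true;
      outerWindow.target = copy;  // identity of the old object is preserved
      return copy;
    }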

Processes

Some readers might wonder how compartments compare to the per-tab processes used by Google Chrome and Internet Explorer. Compartments are similar in many ways, but also very different. Both processes and compartments shield JavaScript objects from each other. The most important distinction is that processes offer a stronger separation enforced by the processor hardware, while compartments offer a pure software guarantee. On the upside, however, compartments allow much more efficient cross compartment communication than processes do. With compartments, cross origin websites can still communicate with each other with a small overhead (governed by the cross origin access policy), while with processes cross-process JavaScript object access is either impossible or extremely expensive. In a modern browser you will likely see both forms of separation being applied: two websites that never have to talk to each other can live in separate processes, while cross origin websites that do want to communicate can use compartments to enhance security and performance.
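
Either way, the sanctioned channel for cross origin pages that want to talk to each other is explicit message passing rather than direct object access. Here is the standard postMessage API (the frame id and origins are made up):

    // On a page at https://a.example, sending to an embedded frame:
    const frame = document.getElementById("partner").contentWindow;
    frame.postMessage({ kind: "ping" }, "https://b.example");

    // In the frame at https://b.example, receiving:
    window.addEventListener("message", event => {
      if (event.origin !== "https://a.example") return;  // verify the sender
      console.log("got", event.data);
    });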

Future

We have landed the main compartments patch, and current nightly builds (and Beta 7) are running with per-origin compartmentalized JavaScript heaps. Some of the functionality described above will not ship until Beta 8, most importantly per-compartment garbage collections; those currently still happen for all compartments at once. The foundation we laid with the compartments work will also enable a number of future extensions. Since we now cleanly separate objects belonging to different tabs, future changes to our JavaScript engine will permit us not only to perform garbage collection for individual compartments, but also to do so in the background on a different thread for tabs with inactive content (i.e. tabs where no event handler is currently firing).

Narcissus/Zaphod JavaScript Research VM for Firefox 4

Zaphod Beeblebrox

Zaphod

Our research intern Tom Austin released the first version of the Narcissus JavaScript virtual machine for Firefox 4. Narcissus is a JavaScript virtual machine written in JavaScript. Tom’s Firefox extension Zaphod allows using Narcissus as the default JavaScript engine in Firefox 4. This opens up the world of JavaScript language and virtual machine research to JavaScript programmers. It is no longer necessary to modify complex C++ code to prototype new language features for JavaScript (e.g. modules or type annotations). Similarly, Narcissus/Zaphod can also be used to try out new JavaScript optimizations and static analyses. Since Zaphod runs directly in Firefox, such experiments are no longer limited to simple command-line JavaScript shell test cases. Zaphod can run complex websites and all the JS code on them.

Stay tuned for future updates to Zaphod and Narcissus. We have bold plans for both. Our static analysis pass for Narcissus will soon be integrated with Zaphod, along with our new Static Single Assignment-form AST/intermediate representation for Narcissus.

Want to work on cool research projects in the JavaScript/Web space? Join Mozilla Research as a research intern. We are looking for highly talented PhD students for Winter 2010/2011 and Summer 2011. Contact me at gal@mozilla.com.