HTML5 reaches the Recommendation stage

Today HTML5 reached the Recommendation stage at the W3C, the final stage of the W3C standards process. Mozilla was one of the first organizations to become deeply involved in the evolution and standardization of HTML5, so today’s announcement by the W3C has a special connection to Mozilla’s mission and our work over the last 10 years.

Mozilla has pioneered many widely adopted technologies, such as WebGL, which further enhance HTML5 and make it a competitive and compelling alternative to proprietary and native ecosystems. With the entrance of Firefox OS into the smartphone market we have also made great progress in advancing the state of the mobile Web. Many of the new APIs and capabilities we have proposed in the context of Firefox OS are currently going through the standards process, bringing capabilities to the Web that were previously only available to native applications.

W3C standards go through a series of steps, ranging from proposals to Editors’ Drafts to Candidate Recommendations and ultimately Recommendations. While reaching the Recommendation stage is an important milestone, we encourage developers to engage with new Web standards long before they actually hit that point. To stay current, Web developers should keep an eye on new and evolving standards and read Editors’ Drafts instead of Recommendations. Developer-targeted resources such as developer.mozilla.org and caniuse.com are also a great way to learn about upcoming standards.

A second important area of focus for Mozilla around HTML5 has been test suites. Test suites can be used by Web developers and Web engine developers alike to verify that Web browsers consistently implement the HTML5 specification. You can check out the latest results at:

http://w3c.github.io/test-results/dom/all.html
http://w3c.github.io/test-results/html/details.html

These automated testing suites for HTML5 play a critical role in ensuring a uniform and consistent Web experience for users.

At Mozilla, we envision a Web which can do anything you can do in a native application. The advancement of HTML5 marks an important step on the road to this vision. We have many exciting things planned for our upcoming 10th anniversary of Firefox (#Fx10), which will continue to move the Web forward as an open ecosystem and platform for innovation.

Stay tuned.

OpenH264 Now in Firefox

The Web is an open ecosystem, generally free of proprietary control and technologies—except for video.

Today, in collaboration with Cisco, we are shipping support for H.264 in our WebRTC implementation. Mozilla has always been an advocate for an open Web without proprietary controls and technologies. Unfortunately, no royalty-free codec has managed to get enough adoption to become a serious competitor to H.264. Mozilla continues to support the VP8 video format, but we feel that VP8 has failed to gain sufficient adoption to replace H.264. Firefox users are best served if we offer a video codec in WebRTC that maximizes interoperability, and since much existing telecommunication infrastructure uses H.264 we think this step makes sense.

The way we have structured support for H.264 with Cisco is quite interesting and noteworthy. Because H.264 implementations are subject to a royalty-bearing patent license and Mozilla is an open source project, we are unable to ship H.264 in Firefox directly. We want anyone to be able to distribute Firefox without paying the MPEG LA.

Instead, Cisco has agreed to distribute OpenH264, a free H.264 codec plugin that Firefox downloads directly from Cisco. Cisco has published the source code of OpenH264 on GitHub, and Mozilla and Cisco have established a process by which the binary is verified as having been built from the publicly available source, thereby enhancing the transparency and trustworthiness of the system.

OpenH264

OpenH264 is not limited to Firefox. Other Internet-connected applications can rely on it as well.

Here is how Jonathan Rosenberg, Cisco’s Chief Technology Officer for Collaboration, described today’s milestone: “Cisco is excited to see OpenH264 become available to Firefox users, who will then benefit from interoperability with the millions of video communications devices in production that support H.264”.

We will continue to work on fully open codecs and alternatives to H.264 (such as Daala), but for the time being we think that OpenH264 is a significant victory for the open Web because it allows any Internet-connected application to use the most popular video format. And while OpenH264 is not a truly open codec (H.264 itself remains patent-encumbered), it is at least the most open of the widely used video codecs.

Note: Firefox currently uses OpenH264 only for WebRTC and not for the <video> tag, because OpenH264 does not yet support the H.264 High Profile frequently used for streaming video. We will reconsider this once support has been added.

Improving JPEG image encoding

Images make up a large proportion of the data that browsers load when displaying a website, so better image compression goes a long way toward displaying content faster. Over the last few years there has been debate about whether a new image format is needed to supersede the ubiquitous JPEG and provide better compression.

We published a study last year which compares JPEG with a number of more recent image formats, including WebP. Since then, we have expanded and updated that study. We did not find that WebP or any other royalty-free format we tested offers sufficient improvements over JPEG to justify the high maintenance cost of adding a new image format to the Web.

As an alternative, we recently started an effort to improve the state of the art of JPEG encoders. Today our research team released version 2.0 of this enhanced JPEG encoder, mozjpeg. mozjpeg reduces the size of both baseline and progressive JPEGs by 5% on average, with many images showing significantly larger reductions.
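
If you want to try the encoder on your own images, a minimal sketch along the following lines works. It assumes mozjpeg’s cjpeg and djpeg command-line tools are built and on your PATH, and the file names are placeholders; note that recompressing an already-lossy JPEG costs a little extra quality, so mozjpeg’s gains are best measured when encoding from the original source material.

    // Recompress an existing JPEG: decode it to PPM with djpeg, then re-encode
    // it with mozjpeg's cjpeg at the given quality setting.
    const { spawn } = require('child_process');
    const fs = require('fs');

    function recompress(input, output, quality) {
      const decode = spawn('djpeg', [input]);                       // JPEG -> PPM on stdout
      const encode = spawn('cjpeg', ['-quality', String(quality)]); // PPM -> smaller JPEG
      const out = fs.createWriteStream(output);

      decode.stdout.pipe(encode.stdin);
      encode.stdout.pipe(out);

      out.on('finish', () => {
        console.log(input + ': ' + fs.statSync(input).size + ' -> ' +
                    fs.statSync(output).size + ' bytes');
      });
    }

    recompress('photo.jpg', 'photo-moz.jpg', 75);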

Facebook announced today that it is testing mozjpeg 2.0 to improve the compression of images on facebook.com. It has also donated $60,000 to support the ongoing development of the technology, including the next iteration, mozjpeg 3.0.

“Facebook supports the work Mozilla has done in building a JPEG encoder that can create smaller JPEGs without compromising the visual quality of photos,” said Stacy Kerkela, software engineering manager at Facebook. “We look forward to seeing the potential benefits mozjpeg 2.0 might bring in optimizing images and creating an improved experience for people to share and connect on Facebook.”

mozjpeg improves image encoding while maintaining full backwards compatibility with existing JPEG decoders. This is very significant because any browser can immediately benefit from these improvements without having to adopt new image formats, such as WebP.

The JPEG format continues to evolve along with the Web, and mozjpeg 2.0 will make it easier than ever for users to enjoy the images they encounter there. Check out the Mozilla Research blog post for all the details.

Reconciling Mozilla’s Mission and the W3C EME

With most competing browsers and the content industry embracing the W3C EME specification, Mozilla has little choice but to implement EME as well so our users can continue to access all content they want to enjoy. Read on for some background on how we got here, and details of our implementation.

Digital Rights Management (DRM) is a tricky issue. On the one hand, content owners argue that they should have the technical ability to control how users share content in order to enforce copyright restrictions. On the other hand, the current generation of DRM is often overly burdensome for users and restricts lawful and reasonable use cases, such as buying content on one device and consuming it on another.

DRM and the Web are no strangers. Most desktop users have plugins such as Adobe Flash and Microsoft Silverlight installed. Both have contained DRM for many years, and websites traditionally use plugins to play restricted content.

In 2013 Google and Microsoft partnered with a number of content providers including Netflix to propose a “built-in” DRM extension for the Web: the W3C Encrypted Media Extensions (EME).

The W3C EME specification defines how to play back such content using the HTML5 <video> element, utilizing a Content Decryption Module (CDM) that implements the DRM functionality directly in the Web stack. The W3C EME specification only describes the JavaScript APIs used to access the CDM. The CDM itself is proprietary and is not specified in detail in the EME specification, an approach that has been widely criticized, including by Mozilla.
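
To give a sense of what those JavaScript APIs look like in practice, here is a minimal sketch of how a page might hook up an encrypted <video> element. It reflects the API shape as it eventually stabilized (names shifted between drafts), uses the Clear Key key system as a stand-in for a real DRM system, and fetchLicense() is a hypothetical helper that forwards the CDM’s request to a license server.

    const video = document.querySelector('video');

    navigator.requestMediaKeySystemAccess('org.w3.clearkey', [{
      initDataTypes: ['cenc'],
      videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }]
    }]).then(function (access) {
      return access.createMediaKeys();
    }).then(function (mediaKeys) {
      return video.setMediaKeys(mediaKeys);
    }).then(function () {
      video.addEventListener('encrypted', function (event) {
        // Ask the CDM to generate a license request for this piece of content.
        const session = video.mediaKeys.createSession();
        session.addEventListener('message', function (message) {
          // Forward the CDM's request to the license server, then hand the
          // license back to the CDM so it can decrypt the media.
          fetchLicense(message.message).then(function (license) {
            return session.update(license);
          });
        });
        session.generateRequest(event.initDataType, event.initData);
      });
    });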

Mozilla believes in an open Web that centers around the user and puts them in control of their online experience. Many traditional DRM schemes are challenging because they go against this principle, removing control from the user and yielding it to the content industry. Instead of DRM schemes that limit how users can access content they purchased across devices, we have long advocated for more modern approaches to managing content distribution, such as watermarking. Watermarking works by tagging the media stream with the user’s identity. This discourages copyright infringement without interfering with lawful sharing of content, for example between different devices of the same user.

Mozilla would have preferred to see the content industry move away from locking content to a specific device (so-called node-locking), and worked to provide alternatives.

Instead, this approach has now been enshrined in the W3C EME specification. With Google and Microsoft shipping W3C EME and content providers moving their content from plugins to W3C EME, Firefox users are at risk of not being able to access DRM-restricted content (e.g. Netflix, Amazon Video, Hulu), which can make up more than 30% of the downstream traffic in North America.

We have come to the point where Mozilla not implementing the W3C EME specification means that Firefox users have to switch to other browsers to watch content restricted by DRM.

This makes it difficult for Mozilla to ignore the ongoing changes in the DRM landscape. Firefox should help users get access to the content they want to enjoy, even if Mozilla philosophically opposes the restrictions certain content owners attach to their content.

As a result we have decided to implement the W3C EME specification in our products, starting with Firefox for Desktop. This is a difficult and uncomfortable step for us given our vision of a completely open Web, but it also gives us the opportunity to actually shape the DRM space and be an advocate for our users and their rights in this debate. The existing W3C EME systems Google and Microsoft are shipping are not open source and lack transparency for the user, two traits which we believe are essential to creating a trustworthy Web.

The W3C EME specification uses a Content Decryption Module (CDM) to facilitate the playback of restricted content. Since the purpose of the CDM is to defy scrutiny and modification by the user, the CDM cannot, by design, be open source in the EME architecture. For security, privacy and transparency reasons, this is deeply concerning.

From a security perspective, it is essential for Mozilla that all code in the browser is open so that users and security researchers can see and audit it. DRM systems explicitly rely on the source code not being available. In addition, DRM systems often have unfavorable privacy properties. To lock content to the device, DRM systems commonly use “fingerprinting” (collecting identifiable information about the user’s device), and with the poor transparency of proprietary native code it’s often hard to tell how much of this fingerprinting information is leaked to the server.

We have designed an implementation of the W3C EME specification that satisfies the requirements of the content industry while attempting to give users as much control and transparency as possible. Due to the architecture of the W3C EME specification we are forced to utilize a proprietary closed-source CDM as well. Mozilla selected Adobe to supply this CDM for Firefox because Adobe has contracts with major content providers that will allow Firefox to play restricted content via the Adobe CDM.

Firefox does not load this module directly. Instead, we wrap it in an open-source sandbox. In our implementation, the CDM will have no access to the user’s hard drive or the network. Instead, the sandbox will provide the CDM only with a communication channel to Firefox for receiving encrypted data and for displaying the results.

[Figure: CDM and sandbox architecture]

Traditionally, to implement node-locking, DRM systems collect identifiable information about the user’s device and refuse to play back the content if the content or the CDM is moved to a different device.

By contrast, in Firefox the sandbox prohibits the CDM from fingerprinting the user’s device. Instead, the CDM asks the sandbox to supply a per-device unique identifier. This sandbox-generated unique identifier allows the CDM to bind content to a single device, as the content industry insists, but it does so without revealing additional information about the user or the user’s device. In addition, we vary this unique identifier per site (each site is presented a different device identifier) to make it more difficult to track users across sites with this identifier.
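
To illustrate the property we are after (this is not Firefox’s actual derivation, just a sketch), such an identifier can be derived from a random per-device secret that never leaves the sandbox: the same site always sees the same value, but identifiers handed to different sites cannot be correlated with each other or traced back to the device.

    const crypto = require('crypto');

    // A random secret generated once per device/profile; it never leaves the sandbox.
    const deviceSecret = crypto.randomBytes(32);

    // Each origin gets a stable identifier, but identifiers for different
    // origins look unrelated and reveal nothing else about the device.
    function identifierForSite(origin) {
      return crypto.createHmac('sha256', deviceSecret).update(origin).digest('hex');
    }

    console.log(identifierForSite('https://streaming-site.example'));  // stable for this site
    console.log(identifierForSite('https://another-site.example'));    // unrelated value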

Adobe and the content industry can audit our sandbox (as it is open source) to assure themselves that we respect the restrictions they are imposing on us and our users, which include the handling of unique identifiers, limiting the output to streaming, and preventing users from saving the content. Mozilla will distribute the sandbox alongside Firefox, and we are working on deterministic builds that will allow developers to use a sandbox compiled on their own machine with the CDM as an alternative. As with plugins today, the CDM itself will be distributed by Adobe and will not be included in Firefox. The browser will download the CDM from Adobe and activate it based on user consent.

While we would much prefer a world and a Web without DRM, our users need it to access the content they want. Our integration with the Adobe CDM will let Firefox users access this content while trying to maximize transparency and user control within the limits of the restrictions imposed by the content industry.

There is also a silver lining to the W3C EME specification becoming ubiquitous. With direct support for DRM we are eliminating a major use case for plugins on the Web, and in the near future this should allow us to retire plugins altogether. The Web has evolved into a comprehensive and performant technology platform and no longer depends on native code extensions delivered through plugins.

While the W3C EME-based DRM world is likely to stay with us for a while, we believe that better systems such as watermarking will eventually prevail, because they offer more convenience for users, which is ultimately also good for business. Mozilla will continue to advance technology and standards to help bring about this change.

Technical Leadership at Mozilla

Today, I am starting my role as Mozilla’s new Chief Technology Officer. Mozilla is an unusual organization. We are not just a software company making a product. We are also a global community of people with a shared goal to build and further the Web, the world’s largest and fastest-growing technology ecosystem. My new responsibilities at Mozilla include identifying and enabling new technology ideas from across the project, leading technical decision making, and speaking for Mozilla’s vision of the Web.

I joined Mozilla almost six years ago to work with Brendan Eich and Mike Shaver on a just-in-time compiler for JavaScript based on my dissertation research (TraceMonkey). Originally, this was meant to be a three-month project to explore trace compilation in Firefox, but we quickly realized that we could rapidly bring this new technology to market. On August 23, 2008 Mozilla turned on the TraceMonkey compiler in Firefox, only days before Google launched its then-still-secret Chrome browser, and these two events spawned the JavaScript Performance Wars between Firefox, Chrome and Safari, massively advancing the state of the art in JavaScript performance. Today, JavaScript is one of the fastest dynamic languages in the world, even scaling to demanding use cases like immersive 3D gaming.

The work on TraceMonkey was an eye-opening experience for me. Through our products that are used by hundreds of millions of users, we can bring new technology to the Web at an unprecedented pace, changing the way people use and experience the Web.

Over the past almost six years I’ve enjoyed working for Mozilla as Director of Research and later as Vice President of Mobile and Research, co-founding many of Mozilla’s technology initiatives along the way, including Broadway.js (a video decoder in JavaScript and WebGL), PDF.js (a PDF viewer built with the Web), Shumway (a Flash player built with the Web), the rebooted native Firefox for Android, and of course Firefox OS.

For me, the open Web is a unique ecosystem because no one controls or owns it. No single browser vendor, not even Mozilla, controls the Web. We merely contribute to it. Every browser vendor can prototype new technologies for the Web. Once Mozilla led the way with Firefox, market pressures and open standards quickly forced competitors to implement successful technology as well. The result has been an unprecedented pace of innovation that has already displaced competing proprietary technology ecosystems on the desktop.

We are on the cusp of the same open Web revolution happening in mobile as well, and Mozilla’s goal is to accelerate the advance of mobile by tirelessly pushing the boundaries of what’s possible with the Web. Or, to use the language of Mozilla’s engineers and contributors:

“For Mozilla, anything that the Web can’t do, or anything that the Web is not faster and better at than native technologies, is a bug. We should file it in our Bugzilla system, so we can start writing a patch to fix it.”

Trust but Verify

Background

It is becoming increasingly difficult to trust the privacy properties of software and services we rely on to use the Internet. Governments, companies, groups and individuals may be surveilling us without our knowledge. This is particularly troubling when such surveillance is done by governments under statutes that provide limited court oversight and almost no room for public scrutiny.

As a result of laws in the US and elsewhere, prudent users must interact with Internet services knowing that despite how much any cloud-service company wants to protect privacy, at the end of the day most big companies must comply with the law. The government can legally access user data in ways that might violate the privacy expectations of law-abiding users. Worse, the government may force service operators to enable surveillance (something that seems to have happened in the Lavabit case).

Worst of all, the government can do all of this without users ever finding out about it, due to gag orders.

Implications for Browsers

This creates a significant predicament for privacy and security on the Open Web. Every major browser today is distributed by an organization within reach of surveillance laws. As the Lavabit case suggests, the government may request that browser vendors secretly inject surveillance code into the browsers they distribute to users. We have no information that any browser vendor has ever received such a directive. However, if that were to happen, the public would likely not find out due to gag orders.

The unfortunate consequence is that software vendors — including browser vendors — must not be blindly trusted. Not because such vendors don’t want to protect user privacy. Rather, because a law might force vendors to secretly violate their own principles and do things they don’t want to do.

Why Mozilla is different

Mozilla has one critical advantage over all other browser vendors. Our products are truly open source. Internet Explorer is fully closed source, and while the WebKit and Blink (Chromium) rendering engines are open source, the Safari and Chrome browsers that use them are not fully open source. Both contain significant fractions of closed-source code.

Mozilla Firefox, in contrast, is 100% open source [1]. As Anthony Jones from our New Zealand office pointed out the other month, security researchers can use this fact to verify the executable bits contained in the browsers Mozilla distributes, by building Firefox from source and comparing the built bits with our official distribution.
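
A minimal sketch of the comparison step, assuming deterministic builds so that compiling the published source of a release yields byte-identical binaries (the file paths are placeholders):

    const crypto = require('crypto');
    const fs = require('fs');

    function sha256(path) {
      return crypto.createHash('sha256').update(fs.readFileSync(path)).digest('hex');
    }

    // A binary built locally from the public release source vs. the official download.
    const local = sha256('local-build/firefox/libxul.so');
    const official = sha256('official-release/firefox/libxul.so');

    console.log(local === official
      ? 'Builds match: the shipped bits correspond to the public source.'
      : 'Builds differ: raise an alert and investigate.');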

This will be most effective on platforms where we already use open-source compilers to produce the executable, to avoid compiler-level attacks of the kind shown by Ken Thompson in 1984.

Call to Action

To ensure that no one can inject undetected surveillance code into Firefox, security researchers and organizations should:

  • regularly audit Mozilla source and verified builds by all effective means;
  • establish automated systems to verify official Mozilla builds from source; and
  • raise an alert if the verified bits differ from official bits.

In the best case, we will establish such a verification system at a global scale, with participants from many different geographic regions and political and strategic interests and affiliations.

Security is never “done” — it is a process, not a final rest-state. No silver bullets. All methods have limits. However, open-source auditability cleanly beats the lack of ability to audit source vs. binary.

Through international collaboration of independent entities we can give users the confidence that Firefox cannot be subverted without the world noticing, and offer a browser that verifiably meets users’ privacy expectations.

See bug 885777 to track our work on verifiable builds.

End-to-End Trust

Beyond this first step, can we use such audited browsers as trust anchors, to authenticate fully-audited open-source Internet services? This seems possible in theory. No one has built such a system to our knowledge, but we welcome precedent citations and experience reports, and encourage researchers to collaborate with us.

Brendan Eich, CTO and SVP Engineering, Mozilla
Andreas Gal, VP Mobile and R&D, Mozilla

[1] Firefox on Linux is the best case, because the C/C++ compiler, runtime libraries, and OS kernel are all free and open source software. Note that even on Linux, certain hardware-vendor-supplied system software, e.g., OpenGL drivers, may be closed source.

On management, and culling occluded layers.

My day job as VP Engineering for Mobile at Mozilla is to manage an engineering organization of some 130 engineers. Most of my time is spent working with people and helping them be effective at building software. That means a lot of meetings, a lot of planning, and a lot of HR-type busy work.

My job description today is very different from when I joined Mozilla a little more than 5 years ago to build what was then the first available commercial-grade JavaScript compiler (TraceMonkey). After 18 months in that job I became one of the top 3 overall committers to the Mozilla project. My job was to design and build stuff, and I enjoyed that quite a bit.

As much as I miss putting my PhD in Computer Science to use these days, having such a large engineering organization behind me has its perks. The amount of stuff “I” can get done by finding the right things for so many engineers to work on and removing any roadblocks they face is really amazing, and it makes up for the lost hacking time.

Plane hacks

Still, to avoid feeling like a useless paper pusher, I usually have a hacking project going on the side. I can’t really justify spending time hacking at work. There are simply too many meetings to attend, emails to reply to, and problems to solve. However, whenever I sit on a plane without WiFi, I can’t really do my VP job effectively, since that job is all about communication. Time on the plane is hacking time, and as it happens, I travel a lot: I obtained Global Services status with United two years in a row, almost exclusively flying Economy (that must be some sort of world record).

When I pick a project to hack on, I tend to pick the hardest problems I come across. It’s usually a problem an engineering team is stuck on, with no easy fix. Some of my favorite projects last year were PhoneNumber.js, a library to format international phone numbers, and predict.js, a predictive text engine for the Firefox OS keyboard. Both projects have long since graduated into production. JavaScript hacks aside, the other area I am very interested in right now is graphics and our layers system. Layers are used in browsers to accelerate animations and scrolling using the GPU. Not a lot of people work on this particular code, and it’s fairly hairy and complex. The perfect playground for a closet engineer.

What’s a layer, anyway?

In Gecko, our rendering engine, we try to detect when frames (a "div" is a frame, for example) are animated, and if so, we put those frames into their own layers. Each layer can be rendered to an independent texture, forming a tree of layers representing the visible part of the document, and we use a compositor to draw those layers into the frame buffer ("frame buffer" is the technical term used by OpenGL; on most systems it means the window in practice).

Gecko has a couple of different kinds of layers. Color layers consist of a single color. The body element of a document is usually white, and we use a color layer to draw that opaque, white rectangle. Image layers hold a single image and are a special case of content layers (internally called Thebes layers, for historical reasons). Content layers are where we render arbitrary content (text and so on).

When rendering the visible part of a document we already try to skip invisible frames, but when frames are animated (are moving around), we often end up with a layer tree where multiple layers are painted on top of each other, partially hiding each other. The compositor draws these layers in Z order, so the result is correct, but we sometimes composite pixels that are guaranteed to be occluded by layers painted right on top of them. On desktop this is wasteful from a power consumption perspective, but in practice usually not a big deal. On mobile, on the other hand, it can cause significant performance problems. Mobile systems often have unified memory (texture data and the frame buffer share memory with the CPU) with fairly low memory bandwidth. Overcompositing (drawing pixels that aren’t visible in the end) wastes precious memory bandwidth. In extreme cases this can cause the frame rate to drop below our target frame rate of 60 frames per second for animations.
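
To put a rough number on it (using a hypothetical 1280×800 screen, since the exact figures vary by device): at 4 bytes per pixel, one fully occluded full-screen layer adds about 4 MB of writes per composited frame, or roughly 240 MB/s at 60 frames per second, before counting the reads of the source textures. On a mobile SoC with limited, shared memory bandwidth, that is a noticeable cost for pixels nobody ever sees.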

Flatfish

Flatfish is a tablet we have ported Firefox OS onto. It has a high-resolution screen and a comparatively weak GPU. As a result, overcompositing can cause the frame rate to drop. In the case of the home screen, for example, we were compositing a color layer (blue in the image below) that was completely hidden by a content layer (yellow star). Setting each pixel in the frame buffer to black before copying the actual content over it caused us to miss the 60 FPS target for homescreen animations.

To solve this problem, I wrote a little patch (bug 911471) for the layers system that walks the children of a container layer in reverse Z order and accumulates a region of pixels that are guaranteed to be covered by opaque pixels (some layers might be transparent; those are not added to this region). As we make our way through the list of layers, we don’t have to composite any pixel that is covered by a layer we paint later (remember, we are walking in reverse Z order); it would be overwritten by an opaque pixel anyway. We use this information to shrink the scissor rectangle the compositor uses to composite each layer. The scissor rectangle describes the bounds of the OpenGL draw operation we use to composite.
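
Here is an illustrative sketch of the culling pass in JavaScript. The real patch lives in Gecko’s C++ layers code and accumulates a proper pixel region; the sketch simplifies that to single rectangles and only skips layers that are completely covered.

    function contains(outer, inner) {
      return inner.x >= outer.x && inner.y >= outer.y &&
             inner.x + inner.w <= outer.x + outer.w &&
             inner.y + inner.h <= outer.y + outer.h;
    }

    // children are ordered back-to-front; each has { bounds: {x, y, w, h}, opaque }.
    function computeScissorRects(children) {
      const opaqueRects = [];    // bounds of opaque layers painted later (on top)
      const scissors = new Map();

      for (let i = children.length - 1; i >= 0; i--) {   // reverse Z order
        const layer = children[i];
        // If an opaque layer on top fully covers this layer, skip compositing it.
        const hidden = opaqueRects.some(function (r) { return contains(r, layer.bounds); });
        scissors.set(layer, hidden ? null : layer.bounds);
        if (layer.opaque) {
          opaqueRects.push(layer.bounds);
        }
      }
      return scissors;   // rectangle to composite for each layer, or null to skip
    }

    // Example: an opaque content layer completely covering the color layer below it.
    const color   = { bounds: { x: 0, y: 0, w: 768, h: 1024 }, opaque: true };
    const content = { bounds: { x: 0, y: 0, w: 768, h: 1024 }, opaque: true };
    console.log(computeScissorRects([color, content]).get(color));   // null: culled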

[Figure: Occluded color layer]

Not a perfect solution, yet

This approach is not optimal, because the scissor rectangle is just a rectangle, and a layer might only be partially occluded. Such partial occlusion is properly described by the region we are accumulating (regions can consist of multiple rectangles), but when setting the scissor rectangle I have to take the bounds of the region to paint (since GL doesn’t support a scissor region). This can still cause overcompositing. However, in essentially every test case I have seen this doesn’t matter: layers tend to be occluded by exactly one other layer, not by a set of layers that each partially occlude it.

It is possible to precisely solve this problem by splitting the actual draw operation into multiple draws with different scissor rects. This might be slow, however, since it writes to the GPU pipeline multiple times. A faster approach is probably to split the draw into multiple quads and draw all of them with one GPU call. Since this is a rare case to begin with, I am not sure we will need this additional optimization. We can always add it later.