New Adventure

It is with mixed emotions that I end my almost seven-year journey with Mozilla next week. Working with this team has been one of the peak experiences of my professional life.

I am also extremely excited about the next chapter in that life. I am departing Mozilla to create a new venture in the Internet of Things space, an open field that presents many of the types of challenges and opportunities that drive our passion for the Web.

I feel deeply humbled and honored that I had the chance to be part of such an amazing and passionate group of people for the last many years, building together the Web we want. I leave with fond memories and great respect for this organization and the people who build it each day. It has been a great honor to be your colleague and friend.

Data is at the heart of search. But who has access to it?

In my February 23 blog post, I gave a brief overview of how search engines have evolved over the years and how today’s search engines learn from past searches to anticipate which results will be most relevant to a given query. This means that who succeeds in the $50 billion search business and who doesn’t mostly depends on who has access to search data. In this blog post, I will explore how search engines have obtained queries in the past and how (and why) that’s changing.

For some 90% of searches, a modern search engine analyzes and learns from past queries, rather than searching the Web itself, to deliver the most relevant results. Most of the time, this approach yields better results than full text search. The Web has become so vast that searches often find millions or billions of result pages that are difficult to rank algorithmically.

One important way a search engine obtains data about past queries is by logging and retaining search results from its own users. For a search engine with many users, there’s enough data to learn from and make informed predictions. It’s a different story for a search engine that wants to enter a new market (and thus has no past search data!) or compete in a market where one search engine is very dominant.

In Germany, for example, where Google has over 95% market share, competing search engines don’t have access to adequate past search data to deliver search results that are as relevant as Google’s. And, because their search results aren’t as relevant as Google’s, it’s difficult for them to attract new users. You could call it a vicious circle.

Search engines with small user bases can acquire search traffic by working with large Internet service providers (ISPs; think Comcast, Verizon, etc.) to capture searches that go from users’ browsers to competing search engines. This is one option that was available in the past to Google’s competitors such as Yahoo and Bing as they attempted to become competitive with Google’s results.

In an effort to improve privacy, Google began using encrypted connections to make searches unintelligible to ISPs. One side effect was that an important avenue was blocked for competing search engines to obtain data that would improve their products.

An alternative to working with ISPs is to work with popular content sites to track where visitors are coming from. In Web lingo this is called a “referer header.” When a user clicks on a link, the browser tells the target site where the user was before (what site “referred” the user). If the user was referred by a search result page, that address contains the query string, making it possible to associate the original search with the result link. Because the vast majority of Web traffic goes to a few thousand top sites, it is possible to reconstruct a pretty good model of what people frequently search for and what results they follow.
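As a rough illustration of the mechanism described above, here is a minimal Python sketch of how a target site could recover the search query from a referer URL. The function name `query_from_referer`, the example URLs, and the assumption that the query sits in a URL parameter called `q` are all hypothetical, chosen only for illustration.

```python
from urllib.parse import urlparse, parse_qs


def query_from_referer(referer, param="q"):
    """Return the search query embedded in a referer URL, or None.

    Assumes (hypothetically) that the search engine places the query
    in the URL parameter named by `param`.
    """
    parsed = urlparse(referer)
    values = parse_qs(parsed.query).get(param)
    return values[0] if values else None


# A made-up referer that still carries the query string.
print(query_from_referer("http://www.example-search.com/search?q=wooden+door&hl=en"))

# A made-up referer without a query string: nothing can be recovered.
print(query_from_referer("https://www.example-search.com/"))
```

Aggregating such recovered queries across many visits is what would let a site, or a search engine working with it, associate searches with the results people followed.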

Until late 2011, that is, when Google began encrypting the query in the referer header. Today, it’s no longer possible for the target site to reconstruct the user’s original query. This is of course good for user privacy—the target site knows only that a user was referred from Google after searching for something. At the same time, though, query encryption also locked out everyone (except Google) from accessing the underlying query data.

This chain of events has led to a “winner take all” situation in search, as a commenter on my previous blog post noted: a successful search engine is likely to get more and more successful, leaving in the dust the competitors who lack access to vital data.

These days, the search box in the browser is essentially the last remaining place where Google’s competitors can access a large volume of search queries. In 2011, Google famously accused Microsoft’s Bing search engine of doing exactly that: logging Google search traffic in Microsoft’s own Internet Explorer browser in order to improve the quality of Bing results. Chrome’s market share has almost tripled since then, so this is something Google has to worry much less about in the future. Its competitors will not be able to use Chrome’s search box to obtain data the way Microsoft did with Internet Explorer in the past.

So, if you have ever wondered why, in most markets, Google’s search results are so much better than their competitors’, don’t assume it’s because Google has a better search engine. The real reason is that Google has access to so much more search data. And, the company has worked diligently over the past few years to make sure it stays that way.

Search works differently than you may think

Search is the main way we all navigate the Web, but it works very differently than you may think. In this blog post I will try to explain how it worked in the past, why it works differently today and what role you play in the process.

The services you use for searching, like Google, Yahoo and Bing, are called search engines. The very name suggests that they go through a huge index of Web pages to find every one that contains the words you are searching for. Twenty years ago, search engines indeed worked this way. They would “crawl” the Web and index it, making the content available for text searches.

As the Web grew larger, searches would often find the same word or phrase on more and more pages. This was starting to make search results less and less useful because humans don’t like to read through huge lists to manually find the page that best matches their search. A search for the word “door” on Google, for example, gives you more than 1.9 billion results. It’s impractical, even impossible, for anyone to look through all of them to find the most relevant page.


Google finds about 1.9 billion results for the search query “door”.

To help navigate the ever growing Web, search engines introduced algorithms to rank results by their relevance. In 1996, two Stanford graduate students, Larry Page and Sergey Brin, discovered a way to use the information available on the Web itself to rank results. They called it PageRank.

Pages on the Web are connected by links. Each link contains anchor text that explains to readers why they should follow the link. The link itself points to another page that the author of the source page felt was relevant to the anchor text. Page and Brin discovered that they could rank results by analyzing the incoming links to a page and treating each one as a vote for its quality. A result is more likely to be relevant if many links point to it using anchor text that is similar to the search terms. Page and Brin founded a search engine company in 1998 to commercialize the idea: Google.
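The idea of treating incoming links as votes can be sketched in a few lines of Python. This is a simplified, textbook-style power iteration, not Google’s implementation: it ignores anchor text entirely, and the four-page link graph, the page names, and the damping value of 0.85 are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Every page must appear as a key (possibly with an empty list).
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                # Each outgoing link passes on an equal share of the page's rank.
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical four-page Web: three pages link to "door-shop".
web = {
    "blog": ["door-shop"],
    "forum": ["door-shop", "blog"],
    "news": ["door-shop"],
    "door-shop": [],
}
ranks = pagerank(web)
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this toy graph the page with the most incoming links ends up ranked first, which is the "vote" intuition in miniature.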

PageRank worked so well that it completely changed the way people interact with search results. Because PageRank correctly offered the most relevant results at the top of the page, users started to pay less attention to anything below that. This also meant that pages that didn’t appear on top of the results page essentially started to become “invisible”: users stopped finding and visiting them.

To experience the “invisible Web” for yourself, head over to Google and try to look through more than just the first page of results. So few users ever wander beyond the first page that Google doesn’t even bother displaying all the 1.9 billion search results it claims to have found for “door.” Instead, the list just stops at page 63, more than 100 million pages short of what you would have expected.

Despite reporting over 1.9 billion results, in reality Google’s search results for “door” are quite finite and end at page 63.

With publishers and online commerce sites competing for that small number of top search results, a new business was born: search engine optimization (or SEO). There are many different methods of SEO, but the principal goal is to game the PageRank algorithm in your favor by increasing the number of incoming links to your own page and tuning the anchor text. With sites competing for visitors — and billions in online revenue at stake — PageRank eventually lost this arms race. Today, links and anchor text are no longer useful to determine the most relevant results and, as a result, the importance of PageRank has dramatically decreased.

Search engines have since evolved to use machine learning to rank results. People perform 1.2 trillion searches a year on Google alone; that’s about 3 billion a day and 40,000 a second. Each search becomes part of this massive query stream as the search engine simultaneously “sees” what billions of people are searching for all over the world. For each search, it offers a range of results and remembers which one you considered most relevant. It then uses these past searches to learn what’s most relevant to the average user to provide the most relevant results for future searches.

Machine learning has made text search all but obsolete. Search engines can answer 90% or so of searches by looking at previous search terms and results. They no longer search the Web in most cases — they instead search past searches and respond based on the preferred result of previous users.
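To make "searching past searches" concrete, here is a deliberately naive Python sketch that answers a query purely from a log of earlier queries and the results users chose. Real search engines use far more sophisticated machine learning; the function names, the click log, and the URLs below are all made up for illustration.

```python
from collections import Counter, defaultdict


def build_click_model(click_log):
    """Count, per normalized query, how often each result was chosen."""
    model = defaultdict(Counter)
    for query, clicked_url in click_log:
        model[query.lower().strip()][clicked_url] += 1
    return model


def answer_from_past_searches(model, query, top_n=3):
    """Rank results for `query` by past clicks; empty list if never seen."""
    counts = model.get(query.lower().strip())
    if not counts:
        return []  # a real engine would fall back to text search here
    return [url for url, _ in counts.most_common(top_n)]


# Made-up click log: (query, result the user chose).
log = [
    ("door", "doors.example/buy"),
    ("door", "wiki.example/Door"),
    ("door", "doors.example/buy"),
    ("Door ", "doors.example/buy"),
]
model = build_click_model(log)
print(answer_from_past_searches(model, "door"))
print(answer_from_past_searches(model, "window"))
```

The second lookup returns nothing, which mirrors the point made earlier: without past search data for a query, this approach has nothing to learn from.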

This shift from PageRank to machine learning also changed your role in the process. Without your searches — and your choice of results — a search engine couldn’t learn and provide future answers to others. Every time you use a search engine, the search engine uses you to rank its results on a massive scale. That makes you its most important asset.

WebVR is coming to Firefox Nightly

In 2014 Mozilla started working on adding VR capabilities to the Web. Our VR team proposed a number of new Web APIs and made available an experimental VR build of Firefox that supports rendering Web-based VR content to Oculus Rift headsets.

Consumer VR products are still in a nascent state, but clearly there is great promise for this technology. We have enough confidence in the new APIs we have proposed that we are today taking the step of integrating them into our regular nightly Firefox builds. Head over to MozVR for all the details, and if you own an Oculus Rift headset or mobile VR-capable hardware we support, give it a spin!


It takes many to build the Web we want

Mozilla is announcing today the creation of a WebRTC competency center jointly with Telenor.

Mozilla’s purpose is to build the Web. We do so by building Firefox and Firefox OS. The Web is pretty unusual when it comes to interoperable technology stacks, because it is not built by standards bodies. Instead, the Web is built by browser vendors, whose browsers implement the Web and in the end pretty much define what the Web is.

The Web adds new technologies whenever a majority of browser vendors agree to extend it in an interoperable way. Standards bodies merely help coordinate this process. Very rarely do new Web capabilities originate in a standards body. New Web capabilities merely end up there eventually, once there is sufficient interest from multiple browser vendors to warrant standardization.

Mozilla doesn’t, and can’t, build the Web alone. What makes the Web unique is that it is owned by no one, and cannot be held back by anyone. It doesn’t take unanimous consent to extend the Web. A mere majority of browser vendors can popularize a new Web capability, forcing the rest of the browser vendors to eventually come along.

While several browser vendors build the Web, Mozilla has a unique vision for the Web that is driven by our mission as a non-profit foundation. Whereas all other browser vendors are for-profit corporations, advancing the Web in the interest of their shareholders, Mozilla advances the Web for users.

The primary browser vendors today are Google, Apple, Microsoft and Mozilla. These four organizations have a direct path to bring new technologies to the Web. While many other technology companies have a strong interest in the Web, they lack the ability to directly move the Web ahead because only these four browser vendors develop a rendering engine that implements the Web stack.

There is one more aspect that sets Mozilla apart from its browser vendor competitors. We are several orders of magnitude smaller than our peers. While this might appear to be a market disadvantage at first, combined with our neutral and non-profit status it actually creates a unique opportunity. Many more technology companies have an interest in working on the Web, but if you aren’t Google, Apple, or Microsoft, it’s very difficult to contribute core technologies to the Web. These three companies have direct control over a rendering engine. No other technology company can equally influence the Web. Mozilla is looking to change that.

Jointly with Telenor we are launching a new initiative that will allow parties with a strong technology interest in WebRTC to participate as equals in the development process of the WebRTC standard. Since standards are really just a result of delivering new Web technologies in a rendering engine, Telenor will assign Telenor engineering staff to work on Mozilla’s implementation of WebRTC in Firefox and Firefox OS.

The goal of this new center is to implement WebRTC with a broad, neutral vision that captures the technology needs of many, not just the technology needs of individual browser vendors.

Mozilla is an open source project where every opinion and technical contribution matters. The WebRTC Competency Center will accelerate the development of WebRTC, and ensure that WebRTC serves the diverse technology interests of many. If you would like to see WebRTC (or any other part of the Web) grow capabilities that are important to you, join us.

Yahoo and Mozilla Form Strategic Partnership

SUNNYVALE, Calif. and MOUNTAIN VIEW, Calif., Wednesday, November 19, 2014 – Yahoo Inc. (NASDAQ: YHOO) and Mozilla Corporation today announced a strategic five-year partnership that makes Yahoo the default search experience for Firefox in the United States on mobile and desktop. The agreement also provides a framework for exploring future product integrations and distribution opportunities to other markets.

The deal represents the most significant partnership for Yahoo in five years. As part of this partnership, Yahoo will introduce an enhanced search experience for U.S. Firefox users which is scheduled to launch in December 2014. It features a clean, modern and immersive design that reflects input from the Mozilla team.

“We’re thrilled to partner with Mozilla. Mozilla is an inspirational industry leader who puts users first and focuses on building forward-leaning, compelling experiences. We’re so proud that they’ve chosen us as their long-term partner in search, and I can’t wait to see what innovations we build together,” said Marissa Mayer, Yahoo CEO. “At Yahoo, we believe deeply in search – it’s an area of investment, opportunity and growth for us. This partnership helps to expand our reach in search and also gives us an opportunity to work closely with Mozilla to find ways to innovate more broadly in search, communications, and digital content.”

“Search is a core part of the online experience for everyone, with Firefox users alone searching the Web more than 100 billion times per year globally,” said Chris Beard, Mozilla CEO. “Our new search strategy doubles down on our commitment to make Firefox a browser for everyone, with more choice and opportunity for innovation. We are excited to partner with Yahoo to bring a new, re-imagined Yahoo search experience to Firefox users in the U.S. featuring the best of the Web, and to explore new innovative search and content experiences together.”

To learn more about this, please visit the Yahoo Corporate Tumblr and the Mozilla blog.

About Yahoo

Yahoo is focused on making the world’s daily habits inspiring and entertaining. By creating highly personalized experiences for our users, we keep people connected to what matters most to them, across devices and around the world. In turn, we create value for advertisers by connecting them with the audiences that build their businesses. Yahoo is headquartered in Sunnyvale, California, and has offices located throughout the Americas, Asia Pacific (APAC) and the Europe, Middle East and Africa (EMEA) regions. For more information, visit the pressroom or the Company’s blog.

About Mozilla

Mozilla has been a pioneer and advocate for the Web for more than a decade. We create and promote open standards that enable innovation and advance the Web as a platform for all. Today, hundreds of millions of people worldwide use Mozilla Firefox to discover, experience and connect to the Web on computers, tablets and mobile phones. For more information please visit

Yahoo is a registered trademark of Yahoo! Inc. All other names are trademarks and/or registered trademarks of their respective owners.

Firefox and Cisco’s Project Squared

Yesterday I was at Cisco’s Collaboration Summit where Cisco’s CTO for Collaboration Jonathan Rosenberg and I showed Cisco’s new WebRTC-based Project Squared collaboration service running in Firefox, talking to a Cisco Collaboration Desktop endpoint without requiring transcoding.

This demo is the culmination of a year-long collaboration between Cisco and Mozilla in the WebRTC space. WebRTC enables voice and video communication directly from within the browser. This means that anyone can build a video conferencing service just using WebRTC and HTML5 standards, without the need for the user to download a plugin or a native application.

Cisco is not only developing WebRTC-based services that run on the Web. They have also joined a growing number of organizations and companies helping Mozilla to build a better Web. Over the last year Cisco has contributed numerous technical improvements to Mozilla’s WebRTC implementation, including support for screen sharing and the H.264 video codec. These features are now shipping in Firefox. We intend to use them in the future in Mozilla’s own Hello communication service that we are bringing to Firefox.

Cisco’s contributions to the Web go beyond just advancing Firefox. For the last three years the IETF, the standards body defining the networking protocols for WebRTC, has been unable to agree on a mandatory video codec for WebRTC, putting ubiquitous interoperability in doubt.

One of the major blockers to coming to a consensus was that H.264 is subject to royalty-bearing patents, which made it problematic for open source projects such as Firefox to deploy it. To break this logjam, Cisco open-sourced its H.264 code base and made it available in plugin form. Any product, not just Firefox, can download the plugin and use it to enable H.264 without paying any royalties.

This collaboration between Mozilla and Cisco enabled Firefox to add support for H.264 in WebRTC, and also played a significant role in the compromise reached at the last IETF meeting to adopt both H.264 and VP8 as mandatory video codecs for WebRTC in browsers. As a result of this compromise, in the future all browsers should match the capabilities already available in Firefox.

Mozilla will continue to work on advancing Firefox and the Web, and we are excited to have strong partners like Cisco who share our commitment to the open Web as a shared technology platform.