Get ready for machines to make mistakes

In 1994, Intel acknowledged a bug in its Pentium processor that produced incorrect results in some rare circumstances. Intel calculated that the average user would hit this bug once every 27,000 years. Consumers reacted quite negatively to this news, and Intel was eventually forced to recall processors worth some 500 million USD.

Ever since computers were invented we have marveled their perfection. Machines are good at what human’s aren’t: reproducibly repeating billions and billions of calculations, never once making a mistake. We have come to embrace this expectation as a fundamentally held belief: machines are flawless. Intel violated this expectation through its now infamous FDIV bug, and it reaped customers’ wrath for it.

AI is about to do what Intel failed to do in 1994: reshape our expectations in computers.

The past of computers belongs to software and software is deterministic. If you send data through a program, there is a certain output we expect, and we can reason over why that output was produced, or why not.

AI is inherently non-deterministic. One of the strength of AI is that it allows us to solve problems we simply don’t know how to write algorithms for. For decades researchers struggled to write algorithms that recognize handwriting, for example. The advent of modern machine learning solved this problem without actually solving it. We still have no clue how to design an algorithm that recognizes handwriting, but we successfully used machine learning to create models that do so.

The caveat is that AI recognizes handwriting probabilistically. Even a perfectly well shaped letter or number may at any time be completely misread by a model. Well trained models are mostly right, most the time, and well … sometimes randomly wrong.

And thats not a bug. Its simply a fundamental property of AI as a probabilistic data-driven system. The key strength of AI is that it can produce often correct results for data it has never seen before. And this key strength is also its key weakness. There is simply no guarantee that the results will be correct for any particular input. At best we can make probabilistic predictions regarding accuracy.

So are consumers going to revolt against AI making mistakes the way they revolted against the FDIV bug 20 years ago? I don’t think so. The more “human-level” problems (voice recognition, image recognition) AI solves, the easier it becomes for us humans to relate to machines making mistakes. We make similar mistakes ourselves, after all. If Siri misunderstands my voice command in a noisy environment, its hard to get upset about that. Humans sometimes also have to ask again if there is a lot of background noise. Machines solving problems that humans naturally solve and sometimes fail at on a daily basis will help us adapt to this new world where machines make mistakes. To use psychology terminology, its a lot easier to have empathy for AI than for an algorithm.

AI silicon vs AI software

There is a lot of activity in the neural network hardware space. Intel just bought Nervana for $400m and Movidius for an undisclosed amount. Both make dedicated silicon to run and train neural networks. Most other chipset vendors I have talked to are similarly interested in adding direct support for neural networks to future chips. I think there is some risk in this approach. Here is why.

Most of the time executing a neural network is spent in massive matrix operations such as convolution and matrix multiplication. The state of the art is to use GPUs to do these operations because GPUs have a lot of ALUs and are well optimized for massively data parallel tasks. If you spent time optimizing neural networks for GPUs (we do!), you probably know that a well optimized engine achieves about 40-90% efficiency (ALU utilization). Dedicated neural network silicon aims to raise that to 100%, which is great, but in the end a 2x speedup only.

The problematic part is that chipset changes have a long lead time (2-3 years), and you have to commit today to the architectures of tomorrow. And thats where things get tricky. A paper published earlier this year showed that neural networks can be binarized, which reduces the precision of weights to 1 bit (-1, 1). Slow and energy inefficient floating point math turns into very efficient binary math (XNOR), which speeds up the network on existing commodity silicon by 400% with a very small loss in precision. Commodity GPUs support this because they are relatively general purpose computers. Dedicated neural network silicon is much more likely designed for a specific compute mode.

In the last 12 months or so alone we have seen dramatic advances in our understanding how to train and evaluate neural networks. Binarization is just one of them, and its fair to expect that over the next 2-3 years similarly impactful advances will be published. Committing to hardware designs based on today’s understanding of neural networks is ill advised in my opinion. My recommendation to chipset vendors is to beef up their GPUs and DSPs, and support a wide range of fixed point and floating point resolutions in a fairly generic manner. Its more likely that that will cover what we’ll want from silicon over the next 2-3 years.

Brendan is back to save the Web

Brendan is back, and he has a plan to save the Web. Its a big and bold plan, and it may just work. I am pretty excited about this. If you have 5 minutes to read along I’ll explain why I think you should be as well.

The Web is broken

Lets face it, the Web today is a mess. Everywhere we go online we are constantly inundated with annoying ads. Often pages are more ads than content, and the more ads the industry throws at us, the more we ignore them, the more obnoxious ads get, trying to catch our attention. As Brendan explains in his blog post, the browser used to be on the user’s side—we call browsers the user agent for a reason. Part of the early success of Firefox was that it blocked popup ads. But somewhere over the last 10 years of modern Web browsers, browsers lost their way and stopped being the user’s agent alone. Why?

Browsers aren’t free

Making a modern Web browser is not free. It takes hundreds of engineers to make a competitive modern browser engine. Someone has to pay for that, and that someone needs to have a reason to pay for it. Google doesn’t make Chrome for the good of mankind. Google makes Chrome so you can consume more Web and along with it, more Google ads. Each time you click on one, Google makes more money. Chrome is a billion dollar business for Google. And the same is true for pretty much every other browser. Every major browser out there is funded through advertisement. No browser maker can escape this dilemma. Maybe now you understand why no major browser ships with a builtin enabled by default ad-blocker, even though ad-blockers are by far the most popular add-ons.

Our privacy is at stake

It’s not just the unregulated flood of advertisement that needs a solution. Every ad you see is often selected based on sensitive private information advertisement networks have extracted from your browsing behavior through tracking. Remember how the FBI used to track what books Americans read at the library, and it was a big scandal? Today the Googles and Facebooks of the world know almost every site you visit, everything you buy online, and they use this data to target you with advertisement. I am often puzzled why people are so afraid of the NSA spying on us but show so little concern about all the deeply personal data Google and Facebook are amassing about everyone.

Blocking alone doesn’t scale

I wish the solution was as easy as just blocking all ads. There is a lot of great Web content out there: news, entertainment, educational content. It’s not free to make all this content, but we have gotten used to consuming it “for free”. Banning all ads without an alternative mechanism would break the economic backbone of the Web. This dilemma has existed for many years, and the big browser vendors seem to have given up on it. It’s hard to blame them. How do you disrupt the status quo without sawing off the (ad revenue) branch you are sitting on?

It takes an newcomer to fix this mess

I think its unlikely that the incumbent browser vendors will make any bold moves to solve this mess. There is too much money at stake. I am excited to see a startup take a swipe at this problem, because they have little to lose (seed money aside). Brave is getting the user agent back into the game. Browsers have intentionally remained silent onlookers to the ad industry invading users’ privacy. With Brave, Brendan makes the user agent step up and fight for the user as it was always intended to do.

Brave basically consists of two parts: part one blocks third party ad content and tracking signals. Instead of these Brave inserts alternative ad content. Sites can sign up to get a fair share of any ads that Brave displays for them. The big change in comparison to the status quo is that the Brave user agent is in control and can regulate what you see. It’s like a speed limit for advertisement on the Web, with the goal to restore balance and give sites a fair way to monetize while giving the user control through the user agent.

Making money with a better Web

The ironic part of Brave is that its for-profit. Brave can make money by reducing obnoxious ads and protecting your privacy at the same time. If Brave succeeds, it’s going to drain money away from the crappy privacy-invasive obnoxious advertisement world we have today, and publishers and sites will start transacting in the new Brave world that is regulated by the user agent. Brave will take a cut of these transactions. And I think this is key. It aligns the incentives right. The current funding structure of major browsers encourages them to keep things as they are. Brave’s incentive is to bring down the whole diseased temple and usher in a better Web. Exciting.

Quick update: I had a chance to look over the Brave GitHub repo. It looks like the Brave Desktop browser is based on Chromium, not Gecko. Yes, you read that right. Brave is using Google’s rendering engine, not Mozilla’s. Much to write about this one, but it will definitely help Brave “hide” better in the large volume of Chrome users, making it harder for sites to identify and block Brave users. Brave for iOS seems to be a fork of Firefox for iOS, but it manages to block ads (Mozilla says they can’t).

Oracle sinks its claws into Android

This is my first blog post since leaving my role as Mozilla’s CTO 6 months ago. As you may have read in the press, a good chunk of the original Firefox OS founding team has moved on from mobile and we created a startup to work on some cool products and technologies for the Internet of Things. You’ll hear more about what we are up to next month.

While I am no longer working directly on mobile, a curious event got my attention: A commit appeared in the Android code base that indicates that Google is abandoning its own re-implementation of Java in favor of Oracle’s original Java implementation. I’ll try to explain why I think this is a huge change and will have far-reaching implications for Android and the Android ecosystem.

Why did Google create its own Java clone?

To run a Java app, you need a runtime library written in Java called the Java standard classes. This library implements basic language constructs like hash tables or strings.

Since the early days, Android didn’t use Sun’s version of the Java standard classes. Instead, the Android team enhanced the open source Apache Harmony Java standard libraries. Harmony is an independent “clean room” open-source implementation of the Java standard libraries maintained by the Apache Foundation.

There is basically no technical advantage in using Harmony. Its a strictly less complete and less correct version of Sun’s original implementation. Why did Android invest all this effort to duplicate Sun’s open source Java standard classes?

Apache vs GPL

Over the course of Android’s meteoric rise, the powers behind Android demonstrated a deep strategic understanding of different classes of open source licenses, their strength, and their weaknesses. Android has from its early days successfully used open source licenses to enable proprietary technology. Sounds counter-intuitive, but it explains why Google rewrote so much open technology for Android.

Java is actually not the only major open technology piece Google reinvented. Since Android 1.0 Google uses bionic as its standard C library. There were very few strong technical reasons to use bionic over open source alternatives such as the GNU libc. Quite to the contrary, at Mozilla in the earlier days of Android we had to constantly fight deficiencies in bionic in comparison to existing open source standard C libraries. Famously, bionic was not thread safe in many cases, crashing multi-threaded applications.

Writing a standard C library from scratch is crazy. Its one of the most commoditized pieces of software. Its almost impossible to do it significantly better than existing implementations, and it costs a ton of time and money and compatibility is a huge pain. Why did Google do it anyway? There is a simple answer: Licensing.

Bionic (as Google’s Java implementation) is licensed under the non-viral Apache 2 (APL) license. You can use and modify APL code without having to publish the changes. In other words, you can make proprietary changes and improvements. This is not possible with the GNU libc, which is under the LGPL. I am pretty sure I know why Google thought that this is important, because as part of launching Firefox OS I talked to many of the same chipset vendors and OEMs that Google works with. Silicon vendors and OEMs like to differentiate at the software level, trying to improve Android code all over the stack. Especially silicon vendors often modify library code to take advantage of their proprietary silicon, and they don’t want to share these changes with the world. Its their competitive moat–their proprietary advantage. Google rewrote bionic — and Java standard classes — because silicon vendors and OEMs probably demanded that most parts of Android are open (ironic use of the word in this context) to this sort of proprietary approach.

UpdateBob Lee who worked for Google at the time commented below that OpenJDK didn’t exist yet when Android 1.0 launched (2007), so Google couldn’t use OpenJDK back then. GNU Classpath (LGPL) did exist since 2004, however. Google still chose Harmony, and stuck with it even after Apache abandoned the Harmony project in 2011. The switch now is clearly driven by the Oracle vs Google lawsuit.


OpenJDK is the name for Oracle’s Java that you can obtain under the GPL2, a viral open source license with strong protections. Any changes to OpenJDK have to be published as source code (except if you are Oracle).

Because Oracle has means to control Java beyond source code, OpenJDK is about as open as a prison. You can vote on how high the walls are, and you can even help build the walls, but if you are ever forced to walk into it, Oracle alone will decided when and whether you can leave. Oracle owns much of the roadmap of OpenJDK, and via compatibility requirements, trademarks, existing agreements, and API copyright lawsuits (Oracle vs Google) Oracle is pretty much in full control where OpenJDK is headed. What does this mean for Android?

In short: there is a new sheriff in town. The app ecosystem is at the heart of every mobile OS. Its what made Android and iOS successful, and its what made Firefox OS struggle. The app ecosystem rests on the app stack, in Android’s case Harmony in the past, and going forward OpenJDK. In other words, Oracle has now at least one hand at the steering wheel as well.

Its anyone’s guess what Oracle will do with it, but Google and Oracle have a long history of not getting along, so its going to be quite curious to watch. Java itself aims to be a platform, and it is similarly vast in scope as Android. Java includes its own user interface (UI) library Swing, for example. Google has of course its own Android UI framework. Swing will now sit on every Android phone, using up resources. Its unlikely that Oracle will try to force Google to actually use Swing, but Google has to make sure it works and is present and apps can use it. And, Oracle can easily force Google to include pretty much any other code or service that pleases Oracle. How about Java Push Notifications, specified and operated by Oracle? All Oracle has to do is add it to OpenJDK, and it will make its way into Android. Google is now on Oracle’s Hamster wheel.

A rough year ahead

In the short term Google’s biggest challenge will be to rip out Harmony and replace it with OpenJDK. They actually have been working on this for a while. It seems this project started in secret already 11 months ago, and is now being merged into the public repository.

All this code and technology churn will have massive implications for Android at a tactical level. Literally millions of lines of code are changing, and the new OpenJDK implementation will often have subtly different correctness or performance behavior than the Harmony code Google used previously. Here you can see Google updating a test for a specific boundary condition in date handling. Harmony had a different behavior than Oracle’s OpenJDK, and the test had to be fixed.

The app ecosystem runs on top of these shifting sands. The Android app store has millions of apps that rely on the Java standard classes, and just as tests have to be fixed, apps will randomly break due to the subtle changes the OpenJDK transition brings. Breakage will not be limited to correctness. Performance variances will be even harder to track down. Past performance workarounds will be obsolete but it will be hard to tell which ones, and entirely new performance problems will pop up. Fun.

And of course, licensing is changing as you can see here. The core of Android’s ecosystem runtime is now powered by GPL2 library code, copyright Oracle.

I also have a very hard time imagining Android N coming out on time with this magnitude of change happening behind the scenes. Google is changing engines mid-flight. The top priority will be to not crash. They won’t have much time to worry about arriving on schedule.

The winner

No matter how you look at this, this is a huge victory for Oracle. Oracle never had much of a mobile game, and all the sudden Oracle gained a good amount of roadmap and technology influence over the most important mobile ecosystem by scale. Oracle is a mobile titan now. I didn’t see that one coming.

The losers

Google, and silicon vendors. The entire middle part of the Android stack will be subject to proprietary Oracle control. Google calls this “reduced fragmentation” in their press release. That’s true, kind of. There will be less fragmentation because Oracle will control anything Java, including Android.

Silicon vendors will be still allowed to do proprietary enhancements if they obtain the same library code under a difference (non-viral, non-GPL2) license from Oracle — for a fee. Oracle has actually already a history of up-selling OpenJDK. They are offering certain components of the Java VM only for royalty payments. You get a basic garbage collector for free. If you want the really good one, it’ll cost you. I expect Oracle to attempt to monetize in similar ways the billions of mobile users it just stumbled upon.

New Adventure

It is with mixed emotions that I end my almost 7 year long journey with Mozilla next week. Working with this team has been one of the peak experiences of my professional life.

I am also extremely excited about the next chapter in that life. I am departing Mozilla to create a new venture in the Internet of Things space, an open field that presents many of the types of challenges and opportunities that drive our passion for the Web.

I feel deeply humbled and honored that I had the chance to be part of such an amazing and passionate group of people for the last many years, building together the Web we want. I leave with fond memories and great respect for this organization and the people who build it each day. It has been a great honor to be your colleague and friend.

Data is at the heart of search. But who has access to it?

In my February 23 blog post, I gave a brief overview of how search engines have evolved over the years and how today’s search engines learn from past searches to anticipate which results will be most relevant to a given query. This means that who succeeds in the $50 billion search business and who doesn’t mostly depends on who has access to search data. In this blog post, I will explore how search engines have obtained queries in the past and how (and why) that’s changing.

For some 90% of searches, a modern search engine analyzes and learns from past queries, rather than searching the Web itself, to deliver the most relevant results. Most the time, this approach yields better results than full text search. The Web has become so vast that searches often find millions or billions of result pages that are difficult to rank algorithmically.

One important way a search engine obtains data about past queries is by logging and retaining search results from its own users. For a search engine with many users, there’s enough data to learn from and make informed predictions. It’s a different story for a search engine that wants to enter a new market (and thus has no past search data!) or compete in a market where one search engine is very dominant.

In Germany, for example, where Google has over 95% market share, competing search engines don’t have access to adequate past search data to deliver search results that are as relevant as Google’s. And, because their search results aren’t as relevant as Google’s, it’s difficult for them to attract new users. You could call it a vicious circle.

Search engines with small user bases can acquire search traffic by working with large Internet Service providers (also called ISPs, think Comcast, Verizon, etc.) to capture searches that go from users’ browsers to competing search engines. This is one option that was available in the past to Google’s competitors such as Yahoo and Bing as they attempted to become competitive with Google’s results.

In an effort to improve privacy, Google began using encrypted connections to make searches unintelligible to ISPs. One side effect was that an important avenue was blocked for competing search engines to obtain data that would improve their products.

An alternative to working with ISPs is to work with popular content sites to track where visitors are coming from. In Web lingo this is called a “referer header.” When a user clicks on a link, the browser tells the target site where the user was before (what site “referred” the user). If the user was referred by a search result page, that address contains the query string, making it possible to associate the original search with the result link. Because the vast majority of Web traffic goes to a few thousand top sites, it is possible to reconstruct a pretty good model of what people frequently search for and what results they follow.

Until late 2011, that is, when Google began encrypting the query in the referer header. Today, it’s no longer possible for the target site to reconstruct the user’s original query. This is of course good for user privacy—the target site knows only that a user was referred from Google after searching for something. At the same time, though, query encryption also locked out everyone (except Google) from accessing the underlying query data.

This chain of events has led to a “winner take all” situation in search, as a commenter on my previous blog post noted: a successful search engine is likely to get more and more successful, leaving in the dust the competitors who lack access to vital data.

These days, the search box in the browser is essentially the last remaining place where Google’s competitors can access a large volume of search queries. In 2011, Google famously accused Microsoft’s Bing search engine of doing exactly that: logging Google search traffic in Microsoft’s own Internet Explorer browser in order to improve the quality of Bing results. Having almost tripled the market share of Chrome since then, this is something Google has to worry much less about in the future. Its competitors will not be able to use Chrome’s search box to obtain data the way Microsoft did with Internet Explorer in the past.

So, if you have ever wondered why, in most markets, Google’s search results are so much better than their competitors’, don’t assume it’s because Google has a better search engine. The real reason is that Google has access to so much more search data. And, the company has worked diligently over the past few years to make sure it stays that way.

Search works differently than you may think

Search is the main way we all navigate the Web, but it works very differently than you may think. In this blog post I will try to explain how it worked in the past, why it works differently today and what role you play in the process.

The services you use for searching, like Google, Yahoo and Bing, are called a search engines. The very name suggests that they go through a huge index of Web pages to find every one that contains the words you are searching for. 20 years ago search engines indeed worked this way. They would “crawl” the Web and index it, making the content available for text searches.

As the Web grew larger, searches would often find the same word or phrase on more and more pages. This was starting to make search results less and less useful because humans don’t like to read through huge lists to manually find the page that best matches their search. A search for the word “door” on Google, for example, gives you more than 1.9 billion results. It’s impractical — even impossible — for anyone look through all of them to find the most relevant page.


Google finds about 1.9 billion results for the search query “door”.

To help navigate the ever growing Web, search engines introduced algorithms to rank results by their relevance. In 1996, two Stanford graduate students, Larry Page and Sergey Brin, discovered a way to use the information available on the Web itself to rank results. They called it PageRank.

Pages on the Web are connected by links. Each link contains anchor text that explains to readers why they should follow the link. The link itself points to another page that the author of the source page felt was relevant to the anchor text. Page and Brin discovered that they could rank results by analyzing the incoming links to a page and treating each one as a vote for its quality. A result is more likely to be relevant if many links point to it using anchor text that is similar to the search terms. Page and Brin founded a search engine company in 1998 to commercialize the idea: Google.

PageRank worked so well that it completely changed the way people interact with search results. Because PageRank correctly offered the most relevant results at the top of the page, users started to pay less attention to anything below that. This also meant that pages that didn’t appear on top of the results page essentially started to become “invisible”: users stopped finding and visiting them.

To experience the “invisible Web” for yourself, head over to Google and try to look through more than just the first page of results. So few users ever wander beyond the first page that Google doesn’t even bother displaying all the 1.9 billion search results it claims to have found for “door.” Instead, the list just stops at page 63, about a 100 million pages short of what you would have expected.

Despite reporting over 1.9 billion results, in reality Google’s search results for “door” are quite finite and end at page 63.

With publishers and online commerce sites competing for that small number of top search results, a new business was born: search engine optimization (or SEO). There are many different methods of SEO, but the principal goal is to game the PageRank algorithm in your favor by increasing the number of incoming links to your own page and tuning the anchor text. With sites competing for visitors — and billions in online revenue at stake — PageRank eventually lost this arms race. Today, links and anchor text are no longer useful to determine the most relevant results and, as a result, the importance of PageRank has dramatically decreased.

Search engines have since evolved to use machine learning to rank results. People perform 1.2 trillion searches a year on Google alone  — that’s about 3 billion a day and 40,000 a second. Each search becomes part of this massive query stream as the search engine simultaneously “sees” what billions of people are searching for all over the world. For each search, it offers a range of results and remembers which one you considered most relevant. It then uses these past searches to learn what’s most relevant to the average user to provide the most relevant results for future searches.

Machine learning has made text search all but obsolete. Search engines can answer 90% or so of searches by looking at previous search terms and results. They no longer search the Web in most cases — they instead search past searches and respond based on the preferred result of previous users.

This shift from PageRank to machine learning also changed your role in the process. Without your searches — and your choice of results — a search engine couldn’t learn and provide future answers to others. Every time you use a search engine, the search engine uses you to rank its results on a massive scale. That makes you its most important asset.