HTML5 EME is not a DRM standard.

Probably the hottest thing the W3C is working on right now is its Encrypted Media Extensions Working Draft. The EME draft is widely talked about as "the DRM standard for HTML5", but this is not truly what its content covers. I'll look at what it is, why it's not a great idea, and some implications of its approval, were it to be approved.

It's possible to tell what's actually covered by the W3C's draft by carefully examining its title; the second word - "media" - is the key. Your average layman might assume that this refers to media in the general sense (and its oh-so-natural wont for encryption), whereas in fact it refers very specifically to HTML5's media elements. You might know them as <audio> and <video>. The standard specifies some funky extensions to their DOM/JavaScript API, based around cryptographic key management. There is almost nothing here about interesting DRM technology, but there are some warnings that it will introduce yet another way for advertisers to track you. Huzzah.

So where's the DRM?

Good question. It is all hidden away in a little black box called a CDM, or Content Decryption Module. Now keep in mind that according to sources like Ars Technica: "[EME] ...will allow the delivery of DRM-protected media through the browser without the use of plugins such as Flash or Silverlight." (emphasis mine). Here's the definition of a CDM from the draft (emphasis mine):

The Content Decryption Module (CDM) is a generic term for a part of or add-on to the user agent that provides functionality for one or more Key Systems. Implementations may or may not separate the implementations of CDMs and may or may not treat them as separate from the user agent. This is transparent to the API and application. A user agent may support one or more CDMs.

A "user agent" here basically refers to your browser.

Now forgive me for being incredulous, but these things sound an awful lot like plugins. And frankly, that's not at all surprising. You can't have a successful DRM system without a closed part. If you want to hide or control something on a user's system, you cannot start off with a blank slate - you need something hidden or obfuscated already there. HTML5 EME can be summarised as laying out an API of JavaScript callbacks for retrieving a key. Implementation-wise, a key (or license) is pretty much just a bit of JSON. The following then roughly occurs:

  1. The key is passed into a compatible CDM.
  2. The CDM decrypts your media element's source.
  3. It is sent for playback - either to your browser, or through the operating system directly (already starting to toss cross-platform compatibility to the wind).
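
To make the division of labour concrete, here is a toy model of those three steps. It is a sketch under loud assumptions: `createToyCdm`, its methods and the license shape (`keyId`, `key`) are hypothetical names invented for illustration, and XOR stands in for a real cipher. Nothing here comes from the draft itself.

```javascript
// Toy model only: createToyCdm, its methods and the license shape are
// hypothetical, and XOR stands in for a real cipher such as AES.
function createToyCdm() {
  const keys = new Map(); // key id -> key bytes, invisible to the page

  return {
    // Step 1: the page hands the license (a bit of JSON) to the CDM.
    loadLicense(licenseJson) {
      const license = JSON.parse(licenseJson);
      keys.set(license.keyId, Uint8Array.from(license.key));
    },
    // Step 2: the CDM decrypts the media element's source.
    decrypt(keyId, encryptedBytes) {
      const key = keys.get(keyId);
      if (!key) throw new Error(`no key loaded for ${keyId}`);
      return encryptedBytes.map((byte, i) => byte ^ key[i % key.length]);
    },
  };
}

// Stand-in for whatever a content provider does before serving the media.
function xorBytes(bytes, key) {
  return bytes.map((byte, i) => byte ^ key[i % key.length]);
}

const key = [7, 42, 99];
const plain = new TextEncoder().encode("frame data");
const encrypted = xorBytes(plain, key);

const cdm = createToyCdm();
cdm.loadLicense(JSON.stringify({ keyId: "demo-key", key }));
const decrypted = cdm.decrypt("demo-key", encrypted);

// Step 3: the decrypted bytes are handed on for playback (here, just decoded).
const played = new TextDecoder().decode(decrypted);
console.log(played);
```

Everything the page touches is the JSON passed to `loadLicense`; everything inside `createToyCdm` is the part a real CDM keeps closed.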

The three steps above are the interesting ones. They're the start of how we could get a "DRM standard" that can be implemented by all, not just those platforms chosen by Microsoft, Adobe, or your friendly local content provider. Here's how EME details those steps:

Then a miracle occurs.

That's why it's not a standard about DRM. There's no actual standardization of DRM to be seen. If anyone still doubts that this amounts to 'DRM by plugin', here's a convenient Google+ post by François Beaufort, on his former company's CDM that Google acquired to implement EME with Netflix.

Take a look at the 'type' of Module it is in the post's picture: PPAPI. This is the same plugin API, initially implemented by Chromium, that Adobe now uses for Flash Player. It is also the one Mozilla declined to implement, meaning Firefox is no longer able to run the latest version of Flash on Linux. Google's CDM is therefore using the same plugin protocol on Linux which has caused a major part of the anger and movement away from Flash in the first place. Pluginless, cross-platform DRM is clearly an absolute fallacy.

But... I still want my Netflix!

Ok, sure. Where do you want your Netflix?

* Please make sure you're running a recent Ubuntu if you want this to be at all simple.

This isn't helping the web out either. Forgive me for asking, but why exactly do you want to swap out a native app experience for playing media in your browser? Is that really an improved experience? In addition to helping close down web standards and making a usable browser implementation even harder to craft and ship, the different browser vendors (both mobile and desktop) still have to:

How these CDMs would be bundled or distributed to everyone is also a completely open question. I wish Mozilla all the best of luck if they want to achieve all this openly and transparently. Are CDM providers going to want to integrate with small software and hardware startups? "Well, do they already have a large enough userbase to make the integration worthwhile?" is what I can hear them asking. Maybe, if there were a standard for all this...

It's open or closed.

There's a good reason that open standards like TCP, HTTP, HTML, XML, JSON, AES and many others are all wildly successful and have defined much of modern computing. Anyone can communicate using them, anyone can implement them, and anyone can consume them. Not only that, but any one implementation should work with any other. A Cisco router can interpret packets from a Broadcom NIC. A Microsoft browser can decode Node.js JSON responses. Practically everything can parse valid XML. HTML is the universal language of the web - renderable by everyone (well, except for those <audio> and <video> codec fights...). That is the point of open standards - everyone can use them to communicate with everyone, no matter who wrote the code.

EME is a standard that, when implemented, can only be assumed to work with one CDM implementation, on one browser, on one platform. It explicitly allows for <video> and <audio> to be broken for arbitrary users, no matter what codecs they use. Remember when the web was going to 'just work' no matter what plugins you were missing? Apparently not a great idea anymore.
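
One way to see this for yourself is to ask a browser which key systems it will accept. The sketch below assumes the `navigator.requestMediaKeySystemAccess` call from the EME API that browsers eventually shipped, which may differ from the working draft discussed here. The key system strings are real identifiers; `probeKeySystems`, `KEY_SYSTEMS` and `CONFIG` are names made up for this example.

```javascript
// Sketch only. The key system strings are real identifiers, but which of them
// a given browser accepts (if any) depends on the CDMs it ships with.
const KEY_SYSTEMS = ["org.w3.clearkey", "com.widevine.alpha", "com.microsoft.playready"];

// Minimal configuration: one common MP4/H.264 video capability.
const CONFIG = [{
  initDataTypes: ["cenc"],
  videoCapabilities: [{ contentType: 'video/mp4; codecs="avc1.42E01E"' }],
}];

// `nav` is the browser's `navigator`; it is a parameter so the function can
// also be exercised with a stub outside a browser.
async function probeKeySystems(nav, keySystems) {
  const supported = {};
  for (const keySystem of keySystems) {
    try {
      await nav.requestMediaKeySystemAccess(keySystem, CONFIG);
      supported[keySystem] = true;
    } catch (err) {
      supported[keySystem] = false; // rejected: no usable CDM for this key system
    }
  }
  return supported;
}

// Only runs where the API exists (i.e. in a browser that implements EME).
if (typeof navigator !== "undefined" && navigator.requestMediaKeySystemAccess) {
  probeKeySystems(navigator, KEY_SYSTEMS).then((result) => console.log(result));
}
```

Run it in a few different browsers on a few different platforms and compare the answers; the same page gets a different set of working key systems depending on where it lands.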

ChromeOS laptops already use it - but if you want to get technical, you should be a lot more specific than that. You can view Netflix on your laptop if you're using the TPM, if your hardware is running the commercial ChromeOS distribution, if you're in the Chrome browser (not Chromium), and if you're using Google's CDM. Otherwise I'm sure a whole lot of code signatures won't match and therefore you can't be trusted to watch House of Cards.

I haven't even begun to touch on what implementing this standard means for the viability of open platforms and software to continue to be useful to the average user in the future. Open source software has been pushing technology forwards at an alarming pace, and the EME draft implements technology opposed to that. It's saddening that open distributions like Debian won't be able to take part in helping their users play the very media that Debian servers are distributing. All take and no give may seem like the status quo these days, but it is not in the spirit of the web.

Don't forget the cookie monster!

There is one last juicy bit of detail that, surprisingly, the draft's authors themselves bring up - another method of unavoidably tracking users. Keys aren't the only thing EME is designed to negotiate - user licenses for content are also included. The protocol allows for the browser to communicate both ways: not only ask a server for a key, but also allow the CDM to transmit back a session id. This makes sense in context as you want to be able to track a user's license to content, but it's also easily implemented as another variant of Local Shared Objects (Flash cookies). I wouldn't bank on CDMs making sessions easily accessible, manageable or removable for the user, though - after all, these could effectively be single-user licenses: accessible means shareable. If nothing else, I hope the W3C doesn't implement yet another way for companies to track browsing habits with this draft. At the very least it shows some direct pitfalls of writing black boxes into your specification.
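
The worry can be illustrated with a purely hypothetical model. `createToyBrowser` and everything inside it are invented for this sketch and describe no real browser or CDM; the only assumption is the one made above, that session ids live inside the CDM where the usual "clear cookies" button doesn't reach.

```javascript
// Hypothetical model: no real browser or CDM is being described here.
function createToyBrowser() {
  const cookies = new Map();     // cleared by the user's "clear cookies" button
  const cdmSessions = new Map(); // kept inside the CDM, out of the user's reach
  let counter = 0;
  const newId = (prefix) => `${prefix}-${++counter}`;

  return {
    // Returns the identifiers a site would see on this visit.
    visit(site) {
      if (!cookies.has(site)) cookies.set(site, newId("cookie"));
      if (!cdmSessions.has(site)) cdmSessions.set(site, newId("session"));
      return { cookie: cookies.get(site), sessionId: cdmSessions.get(site) };
    },
    clearCookies() {
      cookies.clear();
    },
  };
}

const browser = createToyBrowser();
const before = browser.visit("video.example");
browser.clearCookies();
const after = browser.visit("video.example");

console.log(before.cookie === after.cookie);       // false: the cookie was reset
console.log(before.sessionId === after.sessionId); // true: the session id survived
```

If a CDM behaved like `cdmSessions` here, the session id would identify a returning user just as well as the cookie they thought they had deleted.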


- Matt.