2. Understanding HTML5 Security

Why do we need video DRM and content security? Why can’t we just watch content without protection? And why does it need to be part of HTML5?

This has been an ongoing debate, especially within the W3C, to the degree that Tim Berners-Lee, the inventor of the World Wide Web, recently shared his thoughts on supporting DRM in HTML5.

The web has to be universal, to function at all. It has to be capable of holding crazy ideas of the moment, but also the well polished ideas of the century. It must be able to handle any language and culture. It must be able to include information of all types, and media of many genres. Included in that universality is that it must be able to support free stuff and for-pay stuff, as they are all part of this world. This means that it is good for the web to be able to include movies, and so for that, it is better for HTML5 to have EME than to not have it. (via Tim Berners-Lee / w3.org)

With EME, content security for HTML5 video is structured differently than it was with Flash.

  • The DRM is implemented in the browser itself and exposed to the page through HTML5 APIs, the Encrypted Media Extensions (EME)
  • The browser's own content security rules apply to video and are no longer controlled by Flash Player, for example CORS and mixed content restrictions
  • EME will require content to be served over SSL in the future

Here are the fundamentals.


2.1 DRM with Encrypted Media Extensions (EME)

The W3C defines EME as the following:

EME extends HTMLMediaElement with APIs to control playback of encrypted content.

  • The API supports use cases ranging from simple clear key decryption to high value video. License/key exchange is controlled by the application, facilitating the development of robust playback applications supporting a range of content decryption and protection technologies.
  • Content Decryption Module (CDM) is the client component that provides the functionality, including decryption, for one or more Key Systems.

Figure 3) Stack Overview EME (via W3C https://dvcs.w3.org/hg/html-media/raw-file/eme-v0.1/encrypted-media/encrypted-media.html)

Two functions worth remembering are keySystemAccess.createMediaKeys(), which creates the MediaKeys object, and HTMLMediaElement.setMediaKeys(), which associates those keys with the media element that MSE feeds.
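As a rough sketch of how these two calls fit together, the snippet below requests access to a key system, creates the MediaKeys, and attaches them to a video element. It assumes the W3C Clear Key key system ("org.w3.clearkey") and an H.264 fMP4 stream; the helper names buildKeySystemConfig and attachMediaKeys, the codec string, and the nav parameter are illustrative and not part of any player. License exchange (creating a session and answering its message event) is left out.

```javascript
// Build the configuration list passed to requestMediaKeySystemAccess().
// The content type is an assumption; use the one that matches your stream.
function buildKeySystemConfig(videoContentType) {
  return [{
    initDataTypes: ['cenc'],
    videoCapabilities: [{ contentType: videoContentType }]
  }];
}

// Request a key system, create MediaKeys, and attach them to the <video>.
// `nav` defaults to the browser's navigator; it is only a parameter so the
// function can be exercised with a stand-in object outside a browser.
function attachMediaKeys(video, keySystem, config, nav) {
  nav = nav || navigator;
  return nav.requestMediaKeySystemAccess(keySystem, config)
    .then(function (keySystemAccess) {
      return keySystemAccess.createMediaKeys();
    })
    .then(function (mediaKeys) {
      return video.setMediaKeys(mediaKeys).then(function () {
        return mediaKeys;
      });
    });
}

// Usage in a browser (hypothetical element lookup):
// var video = document.querySelector('video');
// attachMediaKeys(video, 'org.w3.clearkey',
//   buildKeySystemConfig('video/mp4; codecs="avc1.42E01E"'));
```

For a commercial DRM, the key system string would change, and the returned MediaKeys would be used to create sessions that carry license requests to the license server.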

EME is required for full-DRM content protection.


2.2 HLS AES 128 Encryption with Clear-Key 

When first looking at the transition to HTML5, using an existing format without any packaging and backend changes is desirable in terms of costs and reduction of complexity. Since iOS only supports HLS natively, content is already available in HLS. But does HLS work with HTML5 MSE?

MSE and EME were designed with MPEG-DASH in mind, but the functionality to parse manifests and video containers is implemented in JavaScript, which means JavaScript can also parse HLS playlists and TS segments. Combined with transmuxing segments from the TS container format to fMP4, it is possible to use HLS with MSE.
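Before a player can decide whether a segment needs transmuxing, it has to know which container it received. The sketch below is one simple way to do that from the first bytes of a segment; detectContainer and needsTransmux are hypothetical helper names, and a real player would usually also rely on the playlist or the response content type.

```javascript
// Identify the container of a media segment from its leading bytes.
// Returns 'ts', 'fmp4', or 'unknown'.
function detectContainer(bytes) {
  // MPEG-TS packets are 188 bytes long and each starts with sync byte 0x47.
  if (bytes.length >= 377 &&
      bytes[0] === 0x47 && bytes[188] === 0x47 && bytes[376] === 0x47) {
    return 'ts';
  }
  // ISO BMFF (MP4) boxes carry a four-character type at byte offset 4.
  if (bytes.length >= 8) {
    var type = String.fromCharCode(bytes[4], bytes[5], bytes[6], bytes[7]);
    if (['ftyp', 'styp', 'moof', 'sidx'].indexOf(type) !== -1) {
      return 'fmp4';
    }
  }
  return 'unknown';
}

// TS segments must be transmuxed to fMP4 before being appended through MSE.
function needsTransmux(bytes) {
  return detectContainer(bytes) === 'ts';
}
```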

This is different for EME and DRM-protected content, since EME, with the majority of browser-embedded DRMs, requires fMP4, and transmuxing DRM-protected content is not possible in JavaScript. In the next part, I am going to go deeper into the different DRM container formats and deployment options, since the topic deserves a section of its own.

HLS with AES-128 encryption and a clear key (standard HLS security, not DRM) cannot use EME, so decryption needs to be fully implemented in JavaScript. A good implementation can reach the feature level of the native iOS player, with the caveat that the security is not as strong as a DRM solution.
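To illustrate what "fully implemented in JavaScript" involves, here is a minimal sketch of the pieces a player needs: reading the #EXT-X-KEY tag, deriving the IV, and decrypting a segment. All function names are hypothetical. The decryption step assumes the browser's Web Crypto API (crypto.subtle) is available, and the IV fallback follows the HLS rule that a missing IV attribute means the segment's media sequence number is used as a 16-byte big-endian value.

```javascript
// Parse an #EXT-X-KEY line into an attribute map, for example:
// #EXT-X-KEY:METHOD=AES-128,URI="https://example.com/key",IV=0x...
function parseKeyTag(line) {
  var attrs = {};
  var body = line.slice(line.indexOf(':') + 1);
  var re = /([A-Z0-9-]+)=("[^"]*"|[^,]*)/g;
  var m;
  while ((m = re.exec(body)) !== null) {
    attrs[m[1]] = m[2].replace(/^"|"$/g, ''); // strip surrounding quotes
  }
  return attrs;
}

// Convert a hex string such as "0x00ff" into bytes.
function hexToBytes(hex) {
  var clean = hex.replace(/^0x/i, '');
  var out = new Uint8Array(clean.length / 2);
  for (var i = 0; i < out.length; i++) {
    out[i] = parseInt(clean.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}

// Fallback IV: the media sequence number as a 16-byte big-endian integer.
function ivFromSequenceNumber(seq) {
  var iv = new Uint8Array(16);
  for (var i = 15; i >= 0 && seq > 0; i--) {
    iv[i] = seq % 256;
    seq = Math.floor(seq / 256);
  }
  return iv;
}

// Decrypt one segment with AES-128-CBC using Web Crypto (assumed available).
// keyBytes is the 16-byte key fetched from the key URI.
function decryptSegment(keyBytes, iv, encrypted) {
  return crypto.subtle
    .importKey('raw', keyBytes, { name: 'AES-CBC' }, false, ['decrypt'])
    .then(function (key) {
      return crypto.subtle.decrypt({ name: 'AES-CBC', iv: iv }, key, encrypted);
    });
}
```

The decrypted bytes are a plain TS segment, which can then go through the same transmuxing path as unencrypted HLS. Because the key is fetched and handled in page JavaScript, it is visible to anyone inspecting the page, which is why this is weaker than DRM.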


2.3 CORS (Cross-origin resource sharing)

CORS is the equivalent of crossdomain.xml in Flash and is required to allow cross-domain communication. This is common for video, since the domain the player is served from is often different from the CDN used for the video content. It’s also important for advertising, which requires constant communication with third-party servers.

A difference between CORS and crossdomain.xml is that CORS is set in the HTTP response headers, while crossdomain.xml is a file on the server delivered over HTTP.

“The CORS standard describes new HTTP headers which provide browsers and servers a way to request remote URLs only when they have permission.

  1. The browser sends the OPTIONS request with an Origin HTTP header. The value of this header is the domain that served the parent page.
  2. The server at service.example.com may respond with:
    • An Access-Control-Allow-Origin (ACAO) header in its response indicating which origin sites are allowed. For example Access-Control-Allow-Origin: http://www.example.com
    • An error page if the server does not allow the cross-origin request
    • An Access-Control-Allow-Origin (ACAO) header with a wildcard that allows all domains: Access-Control-Allow-Origin: *”

(via Wikipedia)

When content doesn’t play, it’s worthwhile to check CORS first. The JavaScript console will display an error if content doesn’t play because of missing CORS headers.
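When debugging, it can help to reason through the check the browser applies to a response. The function below is a simplified sketch of that decision, not the browser's actual algorithm: it ignores preflight requests and method or header restrictions, and isCorsAllowed is a hypothetical name. Header names are assumed to be lowercased.

```javascript
// Simplified version of the browser's Access-Control-Allow-Origin check.
// pageOrigin: origin of the page running the player, e.g. "https://player.example.com"
// responseHeaders: object of response headers with lowercased names
// withCredentials: whether the request was sent with credentials
function isCorsAllowed(pageOrigin, responseHeaders, withCredentials) {
  var allowOrigin = responseHeaders['access-control-allow-origin'];
  if (!allowOrigin) {
    return false; // no CORS header at all: the response is blocked
  }
  if (withCredentials) {
    // Credentialed requests need an exact origin match (no wildcard)
    // plus Access-Control-Allow-Credentials: true.
    return allowOrigin === pageOrigin &&
      responseHeaders['access-control-allow-credentials'] === 'true';
  }
  return allowOrigin === '*' || allowOrigin === pageOrigin;
}
```

Comparing the headers shown in the browser's network panel against this logic usually explains why a manifest or segment request was rejected.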


2.4 Mixed Content Security (HTTPS / HTTP)

Browsers restrict pages served over HTTPS (SSL) from loading content over plain HTTP.

Depending on the browser and the type of content, there are three common responses:

  • The browser displays a warning or error to the user
  • The browser does not load the content
  • The browser tries to upgrade the request from HTTP to HTTPS (requires an additional tag to be set in Chrome).

This matters when operating a web player on an HTTPS (SSL) website and trying to load content from an unsecured HTTP server. Not all CDNs enable SSL for video content delivery by default, and enabling it can lead to higher content hosting costs.
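A player can catch this situation before making the request. The sketch below flags a stream URL that would be mixed content on the current page and, optionally, rewrites it to HTTPS. The function names are hypothetical, and the rewrite only helps if the CDN actually serves the same path over HTTPS, which is an assumption to verify for each host.

```javascript
// True when an HTTPS page would load the resource over plain HTTP.
// Relative and protocol-relative URLs are resolved against the page URL.
function isMixedContent(pageUrl, resourceUrl) {
  var page = new URL(pageUrl);
  var resource = new URL(resourceUrl, pageUrl);
  return page.protocol === 'https:' && resource.protocol === 'http:';
}

// Rewrite an http:// URL to https://, leaving other URLs unchanged.
function upgradeToHttps(resourceUrl) {
  var url = new URL(resourceUrl);
  if (url.protocol === 'http:') {
    url.protocol = 'https:';
  }
  return url.toString();
}

// Usage in a browser (hypothetical stream URL):
// var src = 'http://cdn.example.com/video/master.m3u8';
// if (isMixedContent(window.location.href, src)) {
//   src = upgradeToHttps(src);
// }
```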

To test the behavior of different browsers, Qualys SSL Labs has a good testing tool. Here are the results for Chrome 56:

The security rules around mixed content tend to become stricter over time, so it’s important to consider this when migrating to HTML5 video.


2.5 withCredentials

Another common method of content protection with streaming video is to pass additional information to servers with the request, for example an authentication token or cookie. HLS key servers often require this, but it can apply to any kind of request.

Manifests and segments are requested from JavaScript with XMLHttpRequest. By default, cross-origin requests do not include this information.

The XMLHttpRequest.withCredentials property indicates whether or not cross-site Access-Control requests should be made using credentials such as cookies, authorization headers or TLS client certificates.

var xhr = new XMLHttpRequest();
xhr.open('GET', 'http://example.com/', true);
// Include cookies and other credentials on this cross-origin request.
xhr.withCredentials = true;
xhr.send();


Cookies can be sent by enabling withCredentials, but the browser rejects the response if Access-Control-Allow-Origin is set to “*” instead of the value of the Origin request header.
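On the server side, this means the response has to echo the requesting origin rather than send a wildcard. The function below is a minimal sketch of how a key server or CDN edge might compute those headers; buildCorsHeaders and the allow list are hypothetical, and how the headers get attached depends on the server in use.

```javascript
// Compute CORS response headers for a request carrying an Origin header.
// Returns an empty object when the origin is not on the allow list.
function buildCorsHeaders(requestOrigin, allowedOrigins, allowCredentials) {
  if (allowedOrigins.indexOf(requestOrigin) === -1) {
    return {};
  }
  var headers = {
    // Echo the exact origin; "*" is rejected for credentialed requests.
    'Access-Control-Allow-Origin': requestOrigin,
    // Tell caches that the response differs per requesting origin.
    'Vary': 'Origin'
  };
  if (allowCredentials) {
    headers['Access-Control-Allow-Credentials'] = 'true';
  }
  return headers;
}
```

With an allow list of ['https://player.example.com'], a request from that origin gets its own origin echoed back together with Access-Control-Allow-Credentials: true, while any other origin gets no CORS headers at all.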


2.6 Summary

HTML5 video and content security are very different from Flash. They require an intelligent player and likely configuration changes to the infrastructure. In addition, they might require packaging changes. In the next part, I am going to describe formats, packaging, and the challenges and solutions that come with them.


Part 1: HTML5 Basics

Part 2: Understanding HTML5 Security

Part 3: HTML5 Deployment Best Practices – Multi-DRM, Ad Insertion, and Cross-Device Optimizations.