WebM: An Open Container for the Web
WebM is a multimedia container format designed for the web. It was released by Google in May 2010 as part of the WebM Project, an initiative to establish open, royalty-free video standards for HTML5. The format is based on the Matroska container (the same technology behind MKV files), adapted specifically for web delivery.
WebM files contain video encoded with VP8, VP9, or AV1 codecs and audio encoded with Vorbis or Opus. All of these codecs are open-source and royalty-free, which means anyone can encode, decode, and distribute WebM content without paying patent licensing fees.
Key point: WebM is a container, not a codec. It is the packaging that holds the actual video and audio data. The quality is determined by the codec inside (VP8, VP9, or AV1), not by the WebM container itself.
The History of WebM
WebM's story begins with On2 Technologies, a video codec company that developed the VP series of codecs. Google acquired On2 in February 2010 for $124.6 million, primarily for the VP8 codec. In May 2010, Google open-sourced VP8 under a BSD-style license and announced the WebM format at the Google I/O conference.
The timing was strategic. In 2010, the HTML5 <video> tag was being standardized, but the web had no royalty-free video codec that all browser vendors could agree on. H.264 was patent-encumbered through MPEG LA, making it impossible for free and open-source software to include it without licensing concerns. Google positioned WebM as the open alternative.
Chrome and Firefox adopted WebM support immediately. Opera followed. Microsoft's Internet Explorer and Apple's Safari held out, continuing to support only H.264. The "codec war" of 2010-2015 shaped the web video landscape we have today — and ultimately led to Google developing VP9 (2013) and co-developing AV1 (2018) with the Alliance for Open Media.
Codecs Inside WebM
A WebM file can contain one of three video codecs and one of two audio codecs:
| Codec | Type | Released | Key Feature |
|---|---|---|---|
| VP8 | Video | 2010 | First WebM codec; comparable to H.264 Baseline |
| VP9 | Video | 2013 | ~30% smaller than H.264 at same quality; used by YouTube for 4K |
| AV1 | Video | 2018 | ~30% smaller than VP9; next-generation open codec |
| Vorbis | Audio | 2000 | Open-source lossy audio; paired with VP8 |
| Opus | Audio | 2012 | Low-latency lossy audio; widely regarded as matching or beating AAC at most bitrates |
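The table above can be read as an allow-list. As a minimal sketch, the check below flags tracks whose codec falls outside it. It assumes tracks are identified by their Matroska codec ID strings (such as V_VP9 or A_OPUS); the function name `non_webm_codecs` is hypothetical and not part of any library.

```python
# Matroska codec IDs for the codecs listed in the table above.
WEBM_VIDEO_CODECS = {"V_VP8", "V_VP9", "V_AV1"}
WEBM_AUDIO_CODECS = {"A_VORBIS", "A_OPUS"}


def non_webm_codecs(track_codec_ids):
    """Return the codec IDs that are not in the WebM allow-list (hypothetical helper)."""
    allowed = WEBM_VIDEO_CODECS | WEBM_AUDIO_CODECS
    return [codec for codec in track_codec_ids if codec not in allowed]


# A VP9 + Opus file passes; an H.264 + AAC file belongs in MKV or MP4 instead.
print(non_webm_codecs(["V_VP9", "A_OPUS"]))
print(non_webm_codecs(["V_MPEG4/ISO/AVC", "A_AAC"]))
```

An empty result means every track uses a codec that WebM permits.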
VP9 is the most widely used video codec in WebM files today. YouTube adopted VP9 in 2014 and uses it to deliver the majority of its content, including all 4K streams. VP9 achieves approximately 30% better compression than H.264 at the same perceived quality, which translates to significant bandwidth savings at YouTube's scale.
AV1 is the newest generation, developed collaboratively by the Alliance for Open Media (AOM), a consortium that includes Google, Mozilla, Microsoft, Apple, Amazon, Netflix, and others. AV1 improves on VP9 by another ~30%, but encoding is significantly slower. Browser support for AV1 in WebM is growing rapidly: Chrome 70+ and Firefox 67+ support it, and recent versions of Edge do as well (earlier Chromium-based Edge releases needed Microsoft's separately installed AV1 Video Extension).
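To see how the two ~30% figures stack, here is a rough worked example. The 8,000 kbps H.264 baseline is a hypothetical number chosen for illustration, and real savings vary with content and encoder settings.

```python
def bitrate_after_savings(baseline_kbps, *savings):
    """Apply successive fractional savings (0.30 means 30% smaller) to a bitrate."""
    rate = baseline_kbps
    for saving in savings:
        rate *= 1.0 - saving
    return rate


h264_kbps = 8000.0  # hypothetical H.264 bitrate, for illustration only
vp9_kbps = bitrate_after_savings(h264_kbps, 0.30)        # one ~30% step
av1_kbps = bitrate_after_savings(h264_kbps, 0.30, 0.30)  # two ~30% steps

print(f"VP9: {vp9_kbps:.0f} kbps")                          # VP9: 5600 kbps
print(f"AV1: {av1_kbps:.0f} kbps")                          # AV1: 3920 kbps
print(f"AV1 vs H.264: {1 - av1_kbps / h264_kbps:.0%} smaller")  # 51% smaller
```

The savings multiply rather than add: two 30% steps leave 0.7 × 0.7 = 49% of the original bitrate, so the combined reduction is about 51%, not 60%.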
Opus is widely regarded as one of the best general-purpose lossy audio codecs. In published blind listening tests it has matched or outperformed AAC, MP3, and Vorbis at most bitrates. Opus handles everything from low-bitrate speech (6 kbps) to high-fidelity music (510 kbps) with a single codec, making it ideal for web video.
Where WebM Is Used
WebM has become deeply embedded in the web ecosystem:
- YouTube: serves the majority of its video content using VP9 in WebM containers. When you watch a YouTube video in Chrome or Firefox at 4K, you are almost certainly watching WebM.
- Wikipedia / Wikimedia Commons: requires WebM (or Ogg) for all video uploads. MP4 is explicitly not allowed due to patent licensing concerns.
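- WebRTC: the real-time communication standard used by Google Meet, Discord, and countless video conferencing apps requires browsers to implement VP8 (alongside H.264), and many implementations also support VP9. Live WebRTC media travels over RTP rather than inside a WebM container, but recordings of those streams are commonly saved as WebM.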
- Reddit / Imgur: many "GIF" alternatives on Reddit and Imgur are actually silent VP9 WebM videos, which are 10–50x smaller than actual animated GIFs.
- HTML5 <video>: WebM is a first-class format for the HTML5 video element, supported natively by Chrome, Firefox, and Opera since shortly after the format's 2010 launch, and by Edge since its move to Chromium.
- MediaRecorder API: when web applications record video through the browser (screen recording, webcam capture), most browsers default to the WebM format.
Scale perspective: YouTube alone serves over 1 billion hours of video per day. The majority of this is delivered as VP9 WebM. By viewing time, WebM is arguably the most-consumed video format in the world.
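For the silent GIF-replacement and general web-video cases listed above, encoding is usually done with ffmpeg. The sketch below only builds the argument list. It assumes an ffmpeg build that includes the libvpx-vp9 and libopus encoders; the function name, file names, and the default quality values are illustrative choices rather than recommendations from the WebM Project.

```python
def vp9_webm_command(src, dst, crf=32, audio_kbps=96):
    """Build an ffmpeg argument list for a VP9 WebM encode (hypothetical helper).

    Pass audio_kbps=None to drop the audio track, as for a GIF replacement.
    """
    # -crf with -b:v 0 selects libvpx-vp9's constant-quality mode.
    cmd = ["ffmpeg", "-i", src, "-c:v", "libvpx-vp9", "-crf", str(crf), "-b:v", "0"]
    if audio_kbps is None:
        cmd += ["-an"]  # no audio stream in the output
    else:
        cmd += ["-c:a", "libopus", "-b:a", f"{audio_kbps}k"]
    cmd.append(dst)
    return cmd


# Hypothetical file names; run the result with subprocess.run(cmd, check=True).
print(" ".join(vp9_webm_command("input.mp4", "output.webm")))
print(" ".join(vp9_webm_command("animation.gif", "animation.webm", audio_kbps=None)))
```

Returning a list instead of a shell string avoids quoting problems when file names contain spaces.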
Browser & Device Support
WebM support varies by browser and device:
| Platform | WebM VP9 Support | Notes |
|---|---|---|
| Chrome | Full | Since Chrome 29 (2013) |
| Firefox | Full | Since Firefox 28 (2014) |
| Edge | Full | Since Edge 79 (Chromium-based, 2020) |
| Opera | Full | Since Opera 16 (2013) |
| Safari (macOS) | Full | Since Safari 16.4 (March 2023, macOS Ventura) |
| Safari (iOS) | Full | Since iOS 16.4 (March 2023) |
| Android | Full | Since Android 4.4 (VP8 since 2.3) |
| Smart TVs | Partial | YouTube app uses VP9; native playback varies by manufacturer |
| VLC Player | Full | Cross-platform, bundles VP8/VP9/AV1 decoders |
As of 2026, WebM VP9 is supported by 97%+ of web browsers globally (source: Can I Use). The last major holdout was Safari, which added VP9 support in March 2023. For web developers, WebM is now a safe default format for HTML5 video.
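When a page offers WebM, the browser decides whether it can play a source from the MIME type and codecs parameter. The sketch below builds that string and a `<video>` element with WebM listed first. The MP4 fallback source and the helper names are assumptions added for illustration, and "clip" is a hypothetical file name.

```python
def webm_mime_type(video_codec, audio_codec=None):
    """Build a MIME type string such as: video/webm; codecs="vp9, opus"."""
    codecs = ", ".join(c for c in (video_codec, audio_codec) if c)
    return 'video/webm; codecs="{}"'.format(codecs)


def video_tag(base_name):
    """Return <video> markup listing the WebM source first, then an MP4 fallback."""
    webm_type = webm_mime_type("vp9", "opus")
    return "\n".join([
        "<video controls>",
        # Single-quoted attribute, because the MIME type itself contains double quotes.
        "  <source src=\"{}.webm\" type='{}'>".format(base_name, webm_type),
        '  <source src="{}.mp4" type="video/mp4">'.format(base_name),
        "</video>",
    ])


print(video_tag("clip"))
```

Browsers try the sources in order and play the first one they support, so listing WebM first makes it the preferred format.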
Technical Structure of WebM Files
WebM is a profile (subset) of the Matroska container format. Matroska (MKV) is a very flexible, open-source container that supports virtually any codec combination. WebM restricts this flexibility to a specific set of open codecs while keeping Matroska's element structure. The main elements are:
- EBML Header: the first element in every WebM file, identifying it as a Matroska-based document with the WebM DocType. This is analogous to the ftyp atom in MP4.
- Segment: the root element containing all media data, organized into clusters.
- Tracks: metadata describing each track — video resolution, frame rate, codec ID, audio sample rate, and channel count.
- Clusters: groups of encoded video and audio frames with timestamps. Each cluster typically holds 1–5 seconds of media.
- Cues: a seek index for random access, similar to the moov atom's sample tables in MP4.
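The EBML Header described in the list above can be read with a few lines of code. This is a minimal sketch, not a full parser: it decodes EBML's variable-length integers, confirms the header ID (0x1A45DFA3), and returns the DocType string (element ID 0x4282). The function names are hypothetical, and the sample bytes are a hand-built header rather than a real file.

```python
import io

EBML_HEADER_ID = 0x1A45DFA3
DOCTYPE_ID = 0x4282


def read_vint(stream, keep_marker):
    """Read one EBML variable-length integer.

    The count of leading zero bits in the first byte gives the total length.
    Element IDs keep the length-marker bit; element sizes drop it.
    """
    first = stream.read(1)
    if not first:
        raise EOFError("unexpected end of data")
    byte, length, mask = first[0], 1, 0x80
    while length <= 8 and not (byte & mask):
        mask >>= 1
        length += 1
    if length > 8:
        raise ValueError("invalid variable-length integer")
    value = byte if keep_marker else byte & (mask - 1)
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError("truncated variable-length integer")
    for b in rest:
        value = (value << 8) | b
    return value


def read_doctype(data):
    """Return the DocType from the EBML header, e.g. 'webm' or 'matroska'."""
    stream = io.BytesIO(data)
    if read_vint(stream, keep_marker=True) != EBML_HEADER_ID:
        raise ValueError("not an EBML document")
    header_end = read_vint(stream, keep_marker=False) + stream.tell()
    while stream.tell() < header_end:
        element_id = read_vint(stream, keep_marker=True)
        size = read_vint(stream, keep_marker=False)
        payload = stream.read(size)
        if element_id == DOCTYPE_ID:
            return payload.decode("ascii").rstrip("\x00")
    raise ValueError("EBML header has no DocType element")


# Hand-built header: EBMLVersion = 1, DocType = "webm".
sample = bytes.fromhex("1A45DFA3 8B 42868101 42828477 65626D")
print(read_doctype(sample))  # webm
```

For a real file, passing the first few dozen bytes (for example `open(path, "rb").read(64)`) is normally enough, since the EBML Header is the first element.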
WebM files can be configured for streaming by placing the Cues element before the Clusters (analogous to faststart in MP4). This enables the browser to seek within the video without downloading the entire file first.
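Whether a file is laid out this way can be checked by walking the Segment's top-level children and noting whether Cues appears before the first Cluster. The following is a sketch under simplifying assumptions: the whole file (or at least everything up to the first Cluster) is in memory, and the walk stops at any element of unknown size, as found in live streams. The helper names are hypothetical and the test data is synthetic.

```python
import io

EBML_HEADER_ID = 0x1A45DFA3
SEGMENT_ID = 0x18538067
NAMES = {0x114D9B74: "SeekHead", 0x1549A966: "Info", 0x1654AE6B: "Tracks",
         0x1C53BB6B: "Cues", 0x1F43B675: "Cluster"}


def read_vint(stream, keep_marker):
    """Read an EBML variable-length integer; return (value, length_in_bytes)."""
    first = stream.read(1)
    if not first:
        raise EOFError("unexpected end of data")
    byte, length, mask = first[0], 1, 0x80
    while length <= 8 and not (byte & mask):
        mask >>= 1
        length += 1
    if length > 8:
        raise ValueError("invalid variable-length integer")
    value = byte if keep_marker else byte & (mask - 1)
    for b in stream.read(length - 1):
        value = (value << 8) | b
    return value, length


def read_element_header(stream):
    """Return (element_id, size); size is None when the size is 'unknown'."""
    element_id, _ = read_vint(stream, keep_marker=True)
    size, size_len = read_vint(stream, keep_marker=False)
    unknown = size == (1 << (7 * size_len)) - 1  # all value bits set
    return element_id, None if unknown else size


def segment_children(data):
    """List the names of the Segment's top-level children in file order."""
    stream = io.BytesIO(data)
    element_id, size = read_element_header(stream)
    if element_id != EBML_HEADER_ID or size is None:
        raise ValueError("not an EBML document")
    stream.seek(size, io.SEEK_CUR)  # skip the EBML header payload
    element_id, size = read_element_header(stream)
    if element_id != SEGMENT_ID:
        raise ValueError("Segment element not found")
    end = len(data) if size is None else min(len(data), stream.tell() + size)
    order = []
    while stream.tell() < end:
        element_id, size = read_element_header(stream)
        order.append(NAMES.get(element_id, hex(element_id)))
        if size is None:  # cannot skip an unknown-size element
            break
        stream.seek(size, io.SEEK_CUR)
    return order


def cues_before_clusters(data):
    """True if a Cues element precedes the first Cluster."""
    order = segment_children(data)
    return "Cues" in order and "Cluster" in order and order.index("Cues") < order.index("Cluster")


# Synthetic files with empty Cues and Cluster elements, in both orders.
header = bytes.fromhex("1A45DFA3 8B 42868101 42828477 65626D")
cues_first = header + bytes.fromhex("18538067 8A 1C53BB6B 80 1F43B675 80")
cues_last = header + bytes.fromhex("18538067 8A 1F43B675 80 1C53BB6B 80")
print(cues_before_clusters(cues_first), cues_before_clusters(cues_last))  # True False
```

A file that reports False here can still be seekable, since players can locate a trailing Cues element through the SeekHead, but they may need extra requests to do so.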