Internet-Draft MKA Stem February 2026
Whited Expires 1 September 2026
Workgroup: Internet Engineering Task Force
Internet-Draft: draft-swhited-mka-stems-03
Published: February 2026
Intended Status: Informational
Expires: 1 September 2026
Author: ssw. Whited, Ed.

Matroska Stem Files

Abstract

This document defines a multi-track profile of the Matroska container format for storing stems. Files that follow the profile remain backwards compatible with existing media players.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 1 September 2026.

1. Introduction

Stems are recordings of individual instruments, or groups of instruments, used by DJs and music producers for live mixing of music. Historically, stems have been stored as individual audio files or in patent-encumbered or vendor-specific proprietary container formats. The Matroska container format, formally specified in [RFC9559], is well suited as a container for stems. This specification documents a profile of the Matroska container format that allows it to store lossless or lossy stems, as well as metadata about the stems, for use in DJ applications.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Requirements

STEM files have a few basic requirements:

3. Track Layout

3.1. Audio Streams

Each stem file may contain an arbitrary number of audio tracks but MUST include at least three (the mixed audio and at least two stems). Each track SHOULD be encoded using the same codec with the same parameters, including bitrate, channel count, channel layout, and sample rate.
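The track-count and codec-consistency rules above could be checked roughly as in the following Python sketch. The AudioTrack class and its field names are hypothetical stand-ins for whatever a Matroska parser returns; they are not defined by this document or by any particular library. Bitrate and channel layout are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioTrack:
    """Hypothetical summary of one audio TrackEntry (not a real parser type)."""
    codec_id: str
    sample_rate: float
    channels: int


def check_audio_tracks(tracks):
    """Return (errors, warnings) for the track-count and codec rules."""
    errors, warnings = [], []
    # MUST: the mixed audio plus at least two stems.
    if len(tracks) < 3:
        errors.append("fewer than three audio tracks")
    # SHOULD: every track shares its codec and parameters with the first.
    for number, track in enumerate(tracks[1:], start=2):
        if track != tracks[0]:
            warnings.append(f"track {number} differs from track 1")
    return errors, warnings
```

A MUST violation is reported as an error and a SHOULD violation only as a warning, so a file with mixed codecs is still accepted.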

The first track containing audio data MUST be the final post-mix audio in the default language. All tracks containing the final post-mix audio, regardless of language, MUST have the Matroska "Default" flag set to "1" ([RFC9559], Sections 18.1 and 5.1.4.1.5). This helps preserve backwards compatibility with media players that do not support this format, which typically play the first audio stream found or select a stream based on the default flag. In addition, the "Enabled" flag for the main track MUST be set to "1" ([RFC9559], Section 5.1.4.1.4).

The remaining tracks are individual stems and MUST have the same effective length as the first track, such that playing all stem tracks together from the beginning would result in the same audio (excluding mastering) as the final mix present in the first track. For example, if the original track is 3 minutes long and the stem file includes a percussion track whose percussion does not start until 1 minute in, the percussion stem would still be 3 minutes long but would either contain a minute of silence at the start of the track or have a block timestamp ([RFC9559], Section 10) that starts it at 1 minute.
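The silence-padding option can be illustrated with a small Python sketch. It assumes a mono stem held as a list of float samples; the function names are hypothetical and chosen only for this example.

```python
def leading_silence_samples(start_seconds, sample_rate):
    """Number of silent samples needed before a stem that starts late."""
    if start_seconds < 0:
        raise ValueError("start time must not be negative")
    return round(start_seconds * sample_rate)


def pad_stem(samples, start_seconds, sample_rate, total_samples):
    """Pad a mono stem with silence so that it spans the whole mix."""
    lead = leading_silence_samples(start_seconds, sample_rate)
    tail = total_samples - lead - len(samples)
    if tail < 0:
        raise ValueError("stem runs past the end of the mix")
    return [0.0] * lead + list(samples) + [0.0] * tail
```

For a stem whose first sample falls 60 seconds into a 48 kHz mix, leading_silence_samples(60, 48000) gives 2880000 samples of silence. A writer that uses block timestamps instead would skip the padding and start the stem's first block at the 60-second offset.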

Each stem track MUST NOT have the Matroska "Default" flag ([RFC9559], Sections 18.1 and 5.1.4.1.5) set to "1" (it MAY be "0" or unset) and MUST have the "Enabled" flag ([RFC9559], Section 5.1.4.1.4) set to "0".
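The flag rules for the final mix and the stem tracks might be checked as in the sketch below. TrackFlags is a hypothetical structure, not a type from any Matroska library. The sketch compares only the values explicitly written in the file (None meaning the element is absent) and does not apply Matroska's element defaults. It also assumes that "the main track" means the first audio track.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackFlags:
    """Hypothetical view of the flags written for one audio track."""
    is_final_mix: bool
    flag_default: Optional[int]  # None when the element is absent
    flag_enabled: Optional[int]  # None when the element is absent


def flag_errors(tracks):
    """List violations of the Default and Enabled flag rules."""
    errors = []
    if not tracks or not tracks[0].is_final_mix:
        errors.append("first audio track is not the final mix")
    for number, track in enumerate(tracks, start=1):
        if track.is_final_mix:
            if track.flag_default != 1:
                errors.append(f"track {number}: final mix needs Default=1")
        else:
            if track.flag_default == 1:
                errors.append(f"track {number}: stem must not have Default=1")
            if track.flag_enabled != 0:
                errors.append(f"track {number}: stem needs Enabled=0")
    # Assumption: "the main track" is the first audio track.
    if tracks and tracks[0].flag_enabled != 1:
        errors.append("track 1: main track needs Enabled=1")
    return errors
```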

The stem tracks SHOULD NOT have any gain normalization applied. Instead, they should retain the levels they have in the final mix present in the first track so that, if all stems are played at unity gain, the resulting levels are equivalent to the final mix.
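The unity-gain property can be pictured as a plain sample-by-sample sum. The following Python sketch assumes mono stems of equal length held as lists of floats; the function name is hypothetical, and a real mixer would work on decoded multi-channel buffers.

```python
def mix_at_unity_gain(stems):
    """Sum equally long mono stems sample by sample with no gain change."""
    lengths = {len(stem) for stem in stems}
    if len(lengths) != 1:
        raise ValueError("stems must be non-empty in number and equally long")
    return [sum(frame) for frame in zip(*stems)]
```

If the stems keep their original levels, the result of this sum should match the pre-mastering final mix, which is why no per-stem normalization is wanted.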

Each stem track (i.e., every track that is not the first track) MUST set the value of the \Segment\Tracks\TrackEntry\Name field to a human-readable track name for the stem, for example "Percussion" or "Vocals".

For each stem track, a \Segment\Tags\Tag must also be set with its target set to the stem track. The tag MUST contain a SimpleTag element with the TagName field set to "STEM_COLOR" and the TagString field set to a color representing the track in RGB hex format (e.g., "#145374").
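A reader might validate and decode a STEM_COLOR value as sketched below. The sketch assumes that "RGB hex format" means a leading "#" followed by exactly six hexadecimal digits in either case; shorter forms such as "#abc" are rejected, which is an assumption of this example rather than a rule stated here.

```python
import re

# Assumption: "#" followed by exactly six hexadecimal digits.
_STEM_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def parse_stem_color(tag_string):
    """Return the (red, green, blue) components of a STEM_COLOR value."""
    if _STEM_COLOR.fullmatch(tag_string) is None:
        raise ValueError(f"not an RGB hex color: {tag_string!r}")
    return tuple(int(tag_string[i:i + 2], 16) for i in (1, 3, 5))
```

With the example value from the text, parse_stem_color("#145374") returns (20, 83, 116).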

4. Digital Signal Processor

Because mastering happens post-mix and the stems are pre-mix audio, the stem tracks SHOULD NOT have any mastering steps applied. Instead, metadata for configuring a compressor and limiter SHOULD be included in the file's global metadata as simple tags (see Section 5.1.8.1.2 of [RFC9559]). After mixing, playback applications MAY choose to feed the mix through a Digital Signal Processor (DSP) configured with the limiter and compressor settings read from the metadata.

Each binary setting for the compressor or limiter is stored as a floating-point number in the 32-bit or 64-bit binary interchange format defined in [IEEE_754_2019], with the additional restriction that values are limited to a minimum of 0.0 and a maximum of 1.0. Because different DSPs may use different ranges or scales for each value, playback software SHOULD interpret the 0.0-1.0 values as a linear scale and map them to the range and scale required by the DSP when configuring it for playback. This may result in a loss of fidelity on some DSPs, but this is deemed an acceptable trade-off for stem playback, which would not normally be able to have a mastering step at all.
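As an illustration, the Python sketch below encodes a setting, decodes it again, and maps it onto a DSP range. The text above does not state a byte order, so big-endian is assumed here to match the EBML float convention; that choice, the function names, and the -60 dB to 0 dB range in the usage note are all assumptions of this example.

```python
import struct


def encode_setting(value, double=False):
    """Pack a 0.0-1.0 setting as IEEE 754 binary32 or binary64.

    Assumption: big-endian byte order.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("setting must be within 0.0-1.0")
    return struct.pack(">d" if double else ">f", value)


def decode_setting(payload):
    """Unpack a 4-byte or 8-byte payload and check the 0.0-1.0 limit."""
    if len(payload) == 4:
        (value,) = struct.unpack(">f", payload)
    elif len(payload) == 8:
        (value,) = struct.unpack(">d", payload)
    else:
        raise ValueError("expected a 4-byte or 8-byte float")
    # NaN fails this comparison too, so it is rejected as well.
    if not 0.0 <= value <= 1.0:
        raise ValueError("setting outside 0.0-1.0")
    return value


def scale_linear(value, low, high):
    """Map a 0.0-1.0 setting linearly onto the DSP's own range."""
    return low + value * (high - low)
```

For a hypothetical DSP whose threshold runs from -60 dB to 0 dB, scale_linear(decode_setting(encode_setting(0.5)), -60.0, 0.0) yields -30.0.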

During production of a stem track, vendor-specific metadata MAY be embedded in the Matroska file to configure a specific DSP more accurately. If such metadata is included, the scaled values SHOULD also be present for those without access to the specific DSP used for the track, and the vendor-specific metadata MUST use tag names that do not conflict with the tag names defined for the generic compressor or limiter.
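A muxer could guard against tag-name conflicts with a simple set comparison, as in this Python sketch. The generic names are those listed in Sections 4.1 and 4.2; the function name and the "ACME_" vendor prefix in the usage note are hypothetical.

```python
GENERIC_TAG_NAMES = frozenset({
    "COMPRESSOR_ENABLED", "COMPRESSOR_RATIO", "COMPRESSOR_OUTPUT_GAIN",
    "COMPRESSOR_THRESHOLD", "COMPRESSOR_ATTACK", "COMPRESSOR_INPUT_GAIN",
    "COMPRESSOR_RELEASE", "COMPRESSOR_HP_CUTOFF", "COMPRESSOR_HP_DRY_WET",
    "LIMITER_ENABLED", "LIMITER_RELEASE", "LIMITER_THRESHOLD",
    "LIMITER_CEILING",
})


def conflicting_vendor_tags(vendor_tag_names):
    """Return the vendor tag names that collide with the generic ones."""
    return sorted(set(vendor_tag_names) & GENERIC_TAG_NAMES)
```

For example, conflicting_vendor_tags(["ACME_COMPRESSOR_KNEE", "LIMITER_CEILING"]) returns ["LIMITER_CEILING"], so only the second name would need to be changed.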

4.1. Compressor Metadata

Table 1
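+========================+========+===================+
| Tag Name               | Type   | Values            |
+========================+========+===================+
| COMPRESSOR_ENABLED     | UTF-8  | "TRUE" or "FALSE" |
| COMPRESSOR_RATIO       | binary | 0.0-1.0           |
| COMPRESSOR_OUTPUT_GAIN | binary | 0.0-1.0           |
| COMPRESSOR_THRESHOLD   | binary | 0.0-1.0           |
| COMPRESSOR_ATTACK      | binary | 0.0-1.0           |
| COMPRESSOR_INPUT_GAIN  | binary | 0.0-1.0           |
| COMPRESSOR_RELEASE     | binary | 0.0-1.0           |
| COMPRESSOR_HP_CUTOFF   | binary | 0.0-1.0           |
| COMPRESSOR_HP_DRY_WET  | binary | 0.0-1.0           |
+------------------------+--------+-------------------+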

4.2. Limiter Metadata

Table 2
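+===================+========+===================+
| Tag Name          | Type   | Values            |
+===================+========+===================+
| LIMITER_ENABLED   | UTF-8  | "TRUE" or "FALSE" |
| LIMITER_RELEASE   | binary | 0.0-1.0           |
| LIMITER_THRESHOLD | binary | 0.0-1.0           |
| LIMITER_CEILING   | binary | 0.0-1.0           |
+-------------------+--------+-------------------+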
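A playback application might collect the compressor or limiter tags into one settings dictionary along the lines of the following Python sketch. The tags argument is a hypothetical mapping from tag name to either a string (TagString) or bytes (TagBinary), as a parser might supply it. Big-endian byte order and treating a missing ENABLED tag as "not enabled" are assumptions of this example.

```python
import struct

BINARY_TAGS = {
    "COMPRESSOR": ("RATIO", "OUTPUT_GAIN", "THRESHOLD", "ATTACK",
                   "INPUT_GAIN", "RELEASE", "HP_CUTOFF", "HP_DRY_WET"),
    "LIMITER": ("RELEASE", "THRESHOLD", "CEILING"),
}


def read_dsp_settings(tags, prefix):
    """Build a settings dict for "COMPRESSOR" or "LIMITER" from tags."""
    # Assumption: anything other than the string "TRUE" means disabled.
    settings = {"enabled": tags.get(f"{prefix}_ENABLED") == "TRUE"}
    for suffix in BINARY_TAGS[prefix]:
        payload = tags.get(f"{prefix}_{suffix}")
        if payload is None:
            continue  # tag not present in the file
        if len(payload) not in (4, 8):
            raise ValueError(f"{prefix}_{suffix}: expected 4 or 8 bytes")
        # Assumption: big-endian IEEE 754 binary32 or binary64.
        (value,) = struct.unpack(">f" if len(payload) == 4 else ">d", payload)
        settings[suffix.lower()] = value
    return settings
```

Settings that are absent from the file are simply left out of the result, leaving the choice of fallback values to the application.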

5. IANA Considerations

This memo modifies the "Matroska Tag Names" registry to add the following values:

Table 3
+========================+==========+============================+
| Tag Name               | Tag Type | Reference                  |
+========================+==========+============================+
| STEM_COLOR             | UTF-8    | This document, Section 3.1 |
| COMPRESSOR_ENABLED     | UTF-8    | This document, Section 4.1 |
| COMPRESSOR_RATIO       | binary   | This document, Section 4.1 |
| COMPRESSOR_OUTPUT_GAIN | binary   | This document, Section 4.1 |
| COMPRESSOR_THRESHOLD   | binary   | This document, Section 4.1 |
| COMPRESSOR_ATTACK      | binary   | This document, Section 4.1 |
| COMPRESSOR_INPUT_GAIN  | binary   | This document, Section 4.1 |
| COMPRESSOR_RELEASE     | binary   | This document, Section 4.1 |
| COMPRESSOR_HP_CUTOFF   | binary   | This document, Section 4.1 |
| COMPRESSOR_HP_DRY_WET  | binary   | This document, Section 4.1 |
| LIMITER_ENABLED        | UTF-8    | This document, Section 4.2 |
| LIMITER_RELEASE        | binary   | This document, Section 4.2 |
| LIMITER_THRESHOLD      | binary   | This document, Section 4.2 |
| LIMITER_CEILING        | binary   | This document, Section 4.2 |
+------------------------+----------+----------------------------+

6. Security Considerations

This document should not affect the security of the Internet.

7. Normative References

[RFC9559]
Lhomme, S., Bunkus, M., and D. Rice, "Matroska Media Container Format Specification", RFC 9559, DOI 10.17487/RFC9559, <https://www.rfc-editor.org/info/rfc9559>.

8. Informative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[IEEE_754_2019]
IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE 754-2019, DOI 10.1109/IEEESTD.2019.8766229, <https://ieeexplore.ieee.org/document/8766229>.

Acknowledgements

Thanks to the members of #matroska on the libera.chat IRC network for patiently explaining the basics of the format to me. Also to the members of the IETF CELLAR working group, especially Steve Lhomme, for their feedback.

Author's Address

Sam Whited (editor)