A javascript implementation of the Webm Demuxer (matroska). View a demo of a dumux result here Better examples and reworked api coming soon.
live demo: https://jscodec.github.io/jswebm-demo/ https://github.com/jscodec/jswebm-demo
Top Level wrapper and interface.
videoPackets
: Array of demuxed video packetsaudioPackets
: Array of demuxed audio packetsdocType
: Document typeaudioCodec
: Audio codec for the audio trackeof
: Boolean, if the end of the file has been reached
queueData(buffer)
: queue an incoming chunck of data, must be sequentialdemux()
: Attempts to parse up to 1 new packet, maybe return promise and reject if current buffer runs out
const demuxer = new JsWebm();
demuxer.queueData(buffer);
while (!demuxer.eof) {
demuxer.demux();
}
console.log(demuxer);
console.log(`total video packets : ${demuxer.videoPackets.length}`);
console.log(`total audio packets : ${demuxer.audioPackets.length}`);
{
data: ArrayBuffer(3714) {},
isKeyframe: false,
keyframeTimestamp: 0,
timestamp: 0,
}
Running the demo
npm install
npm run test
- V0.0.3
- Working on ogv.js 1.3.1
- Added basic support for Matroska Files
- Added support for Tags Element
- Added Support for Fixed size, and EBML laced elements
The demuxer holds a queue of arrayBuffers which are sent in from the main player controller. The difficulty lies in the way the buffers come in. In order to achieve progressive downloading, we must parse the data as it comes in, but it is not possible to ensure that the elements will be completely contained in one chunk ie: the elements can be arbitrarily broken up across one ore more incoming buffers.
Main goal : To parse the incoming buffers without unnecessary rewrites. The only write will be the time the final frame buffer is made which will be sent off to the decoders.
queueData(data)
receives arrayBuffer chunks of arbitrary length, adds to queuedemux()
is called from main loop- Parse as much as possible then exit.
- Must pick up parsing where it left off.
- Not possible to know if enough data available to parse.
The matroska format uses the EBML principal, which is essentially a type of markdown language like xml which can be applied to binary files. The elements come in 2 basic types: container types, which contain sub elements called Master Elements, and 7 data type elements. All elements contain a 2 part header, plus their payload. The header contains an id, which can range in length from 1 to 4 bytes, and a size which ranges from 1 to 8 bytes. Vint or variable sized integers, used for the id and size contain the length of their respective integer in the first byte.
The algorithm will then work as follows:
- Read first byte
- Calculate byte width of Vint
- Test if there are enough bytes available in current buffer
- If yes, read entire Vint
- If not, use buffered read method saving state at each position (more overhead)
- At each stage check if there are remaining bytes
- If no, dequeue buffer
- If no more buffers, return null or false (can't decide yet)
- If no, dequeue buffer
- Upon next call to process, must pick up where it left off
Example of Element spread across 2 buffers
Closeup of Vint or Element ID
Coming Soon!