Fix Broken Mkv Samples Created With Mkvmerge

Re-Creating a truncated mkv Sample when it was created with the MKVmerge writing library.

Intro

I had a good mkv Sample that passed srs. I split it into 2 pieces: the 1st part, 20M, was kept
to use as a test of a truncated MKV file; the 2nd part was ignored. With this, I could test out
re-creation, and compare the final result to the good one.

Find tools

First, you have to use the exact same version of MKVmerge as was used to create the Sample. Any version
of MKVinfo from any release can tell you which one you need. In this case it was 2.4.0.
MediaInfo shows the Writing application too.
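
If neither tool is at hand, the version can usually be spotted in the file header itself. The following is a minimal sketch, assuming the writing application string starts with "mkvmerge v" followed by the version number and sits within the first 64 KiB of the file; the function names and the sample bytes are made up for illustration.

```python
import re

def find_mkvmerge_version(header: bytes):
    """Return the mkvmerge version string found in the header bytes, or None."""
    match = re.search(rb"mkvmerge v(\d+(?:\.\d+)+)", header)
    return match.group(1).decode("ascii") if match else None

def read_header(path, size=65536):
    """Read the first `size` bytes of a file (assumed to hold the segment info)."""
    with open(path, "rb") as f:
        return f.read(size)

# Synthetic bytes standing in for a real header, for illustration only.
fake_header = b"\x00\x01junk mkvmerge v2.4.0 more junk"
print(find_mkvmerge_version(fake_header))  # 2.4.0
```

On a real file this would be called as find_mkvmerge_version(read_header("bad.mkv")).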

All the versions are available here:
http://www.bunkus.org/videotools/mkvtoolnix/win32/

Some of the win32 versions are Windows Installer types (yuck). The installer also adds a permanent PATH
entry pointing to the install folder for all cmd prompts (yuck again). Install it, copy the installed
folder somewhere else, then use Control Panel / Add-Remove Applications to uninstall it. All the DLLs that
are needed are in the release folder, so no conflict occurs when different versions sit side by side.
I made a folder on a work drive, with a subfolder for each version (2.4.0 … 4.2.2, etc.), and put them
there. Each version is 5-7 MB.

  • Use 7-Zip Open Inside or Extract files… on the Nullsoft Install System installer to view and extract everything without running the setup.

Find location

The next thing to discuss is how MKVmerge handles splitting a file. Unlike its AVI counterparts in Nandub,
VirtualDub and VirtualDubMod, MKVmerge works on "keyframe only segments", so any time entered is rounded
UP to the next keyframe (which makes it easy to recreate a sample). This is also the behavior in MEncoder.
This means that you only need to find the approximate place in the movie where the sample was cut from:
MKVmerge will use the first NEXT keyframe as the starting location.
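
The rounding rule can be pictured with a small sketch. It assumes a sorted list of keyframe times in seconds and treats a requested time that lands exactly on a keyframe as already rounded; the timestamps and the function name are hypothetical, not read from any real file.

```python
from bisect import bisect_left

def next_keyframe(keyframes, requested):
    """Return the first keyframe time >= requested (seconds), or None if there is none."""
    i = bisect_left(keyframes, requested)
    return keyframes[i] if i < len(keyframes) else None

# Hypothetical keyframe timestamps in seconds, for illustration only.
keyframes = [480.0, 485.2, 493.9, 501.4]
print(next_keyframe(keyframes, 492.0))  # 493.9
```

Any requested time between 485.2 and 493.9 ends up at the same keyframe, which is why an approximate position is good enough.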

So, load up your truncated sample into your player, and play it, noting what is going on. The more you
have of the sample that isn't truncated, the easier it is to find it in the actual full movie. I use BSPlayer,
which displays the total time the sample should run (NOTE: so do VLC and MPC).

Now load up the movie, and try to find the same segment you just viewed. I use BSPlayer, and in it, the
left / right arrow keys automatically jump to the previous/next keyframe, which makes finding the
sample spot in the full movie a lot easier. Once you have done this, note the time in mins/secs where it
is (good idea to write these down in a txt file or similar).

My test case was:
Dragon.Ball.Z.Broly.The.Legendary.Super.Saiyan.1993.720p.BluRay.x264-CiNEFiLE
The truncated sample (20M) was named bad.mkv, and the full movie was extracted from the rars and named inp.mkv
(short names used to speed up typing).

It's important to know that not only is srs.exe a really good test for avi and mkv files (it can detect
errors that a lot of apps miss, including GSpot), it can also tell the byte count the sample mkv should be.
In this test case, running srs.exe bad.mkv produced:

Warning: File size does not appear to be correct!
         Expected: 53,110,586
         Found   : 20,000,000

So you know that your sample's size should be the first figure shown (for this test, it's 53,110,586 bytes).

Split mkv

I determined that the sample was 1 minute 45 seconds in length (BSPlayer, VLC and MPC will show you the full
time it is SUPPOSED to be, even when it's truncated), and was to be found in the full movie at approximately
8 minutes 14 seconds. The following was chosen as the cmdline to run:

mkvmerge -o out.mkv --split timecodes:00:08:12,00:09:57 inp.mkv

Since it will auto seek to the next keyframe, I made the start time 08:12 and the end time 09:57 (8:12+01:45).
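
The timecode arithmetic is easy to get wrong by hand, so here is a minimal sketch that adds a duration to a start time and builds the value passed to --split. The helper names are made up for illustration; only the timecodes:start,end format comes from the command line above.

```python
def to_seconds(timecode):
    """Convert HH:MM:SS to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timecode.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def to_timecode(total_seconds):
    """Convert a number of seconds to HH:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

def split_argument(start, duration):
    """Build the --split value for a cut that begins at `start` and lasts `duration`."""
    end = to_timecode(to_seconds(start) + to_seconds(duration))
    return f"timecodes:{start},{end}"

print(split_argument("00:08:12", "00:01:45"))  # timecodes:00:08:12,00:09:57
```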

The MKVmerge parameter "--split timecodes" split the file into 3 pieces and named them:

  • out-001.mkv ……….. Up to the start time (+seek to next keyframe) …….. ignore this file
  • out-002.mkv ……….. This is the sample cut you want
  • out-003.mkv ……….. the rest of the file till the end …….. ignore this file

(NOTE: whatever name is specified for the -o parameter has -001, -002 and -003 appended).

When I ran it, and did a dir list, this was the result:

51,768,122 out-002.mkv

which is too small based on what srs reported to me earlier. This is because the start was rounded up to
the NEXT keyframe while the end was still measured as 1 minute 45 seconds from the time originally specified
(08:12), which makes the result LESS than 1 minute 45 seconds.

I then ran it again with this cmd line:

mkvmerge -o out.mkv --split timecodes:00:08:12,00:09:59 inp.mkv

Which produced:

53,110,586 out-002.mkv

Which is the correct size as reported by srs.exe on the truncated sample.
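
The trial and error above can be automated. This is only a sketch under stated assumptions: it reuses the file names from this page, pushes the end time out one second at a time, and stops when the middle piece matches the size srs reported. The function names are hypothetical, and mkvmerge_size needs the matching mkvmerge version on the PATH (or its full path in place of "mkvmerge").

```python
import os
import subprocess

def to_timecode(total_seconds):
    """Convert a number of seconds to HH:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

def mkvmerge_size(start, end_seconds, source="inp.mkv", out="out.mkv"):
    """Split `source` at start/end and return the size of the middle piece."""
    result = subprocess.run(
        ["mkvmerge", "-o", out, "--split",
         f"timecodes:{start},{to_timecode(end_seconds)}", source])
    if result.returncode >= 2:  # mkvmerge exits with 1 for warnings, 2 for errors
        raise RuntimeError("mkvmerge failed")
    base, ext = os.path.splitext(out)
    return os.path.getsize(f"{base}-002{ext}")

def find_end_time(size_for_end, first_end, expected_size, max_steps=10):
    """Try successive end times (seconds) until the piece has the expected size."""
    for end in range(first_end, first_end + max_steps):
        size = size_for_end(end)
        if size == expected_size:
            return end
        if size > expected_size:
            break  # overshot: the start time is probably wrong, not the end
    return None
```

A call such as find_end_time(lambda end: mkvmerge_size("00:08:12", end), 597, 53110586) would start at 09:57 (597 seconds). Each try splits the whole movie again, so it is slow but hands-off.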

Fix differences

A sequence of bytes near the beginning of the newly created mkv always differs from the truncated sample,
because a date/time code is written into the mkv. mkvinfo shows it as:

ORIG TRUNCATED ..... | + Date: Sat Oct 25 13:29:04 2008 UTC
NEW RECREATED ...... | + Date: Thu Sep 08 21:03:02 2011 UTC
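
The date can also be read straight from the bytes. Matroska stores it as a signed 64-bit big-endian count of nanoseconds since 2001-01-01 00:00:00 UTC. The sketch below decodes the seven "original" bytes that the binary comparison further down lists at 000010B1 to 000010B7, assuming the eighth byte at 000010B8 (not listed, so identical in both files) is 00. The function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Matroska dates count from the start of the third millennium, UTC.
MATROSKA_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

def decode_matroska_date(raw):
    """Decode an 8-byte Matroska date value into an aware UTC datetime."""
    nanoseconds = int.from_bytes(raw, byteorder="big", signed=True)
    return MATROSKA_EPOCH + timedelta(microseconds=nanoseconds // 1000)

# Bytes from the truncated sample; the trailing 00 is an assumption (see above).
original = bytes.fromhex("036C38727E9A40") + b"\x00"
print(decode_matroska_date(original))  # 2008-10-25 13:29:04+00:00
```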

When I did a binary comparison of the truncated sample and the same number of bytes from the newly created
sample, these were the differences (23 bytes total):

Comparing files bad.mkv and OUT-002(20M).MKV
      000010B1: 03 04 000010B2: 6C AD 000010B3: 38 FF 000010B4: 72 0D 000010B5: 7E D5 000010B6: 9A 3F
      000010B7: 40 2C 000010BC: BD 9B 000010BD: 64 B3 000010BE: 3F AF 000010BF: B6 5B 000010C0: 10 2A
      000010C1: B4 64 000010C2: 76 F7 000010C3: AE E9 000010C4: 80 A2 000010C5: 62 68 000010C6: AD AC
      000010C7: 49 32 000010C8: CD CC 000010C9: 45 1E 000010CA: 75 A7 000010CB: 84 AC
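
If no binary compare tool is handy, the same kind of listing can be produced with a short script. This is a minimal sketch, not the tool used above: it compares two files only over the length they share (so a truncated file can be checked against a full one) and prints offset, old byte, new byte. The function name and the tiny demo files are made up for illustration.

```python
import os
import tempfile

def diff_common_prefix(path_a, path_b, chunk_size=1 << 20):
    """Return (offset, byte_a, byte_b) for every difference in the shared length."""
    diffs = []
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if n == 0:
                break  # at least one file has ended
            if a[:n] != b[:n]:
                diffs.extend((offset + i, a[i], b[i])
                             for i in range(n) if a[i] != b[i])
            offset += n
            if len(a) != len(b):
                break  # the shorter file ended inside this chunk
    return diffs

# Demo on two tiny throwaway files; the second one is longer.
with tempfile.TemporaryDirectory() as folder:
    short_path = os.path.join(folder, "short.bin")
    long_path = os.path.join(folder, "long.bin")
    with open(short_path, "wb") as f:
        f.write(b"\x00\x01\x02\x03")
    with open(long_path, "wb") as f:
        f.write(b"\x00\xff\x02\x04\x05\x06")
    for offset, old, new in diff_common_prefix(short_path, long_path):
        print(f"{offset:08X}: {old:02X} {new:02X}")
```

In the listing above, the first seven offsets fall inside the 8-byte date value. The 16 consecutive offsets from 000010BC to 000010CB are what one would expect if mkvmerge also writes a fresh random 16-byte segment UID on every run; that reading is an assumption and was not checked here.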

Using a technique I have long used for other projects, I split the truncated sample "bad.mkv" into 2 parts:
the first 5M and the rest of the bytes beyond 5M. The first 5M slice was named "orig" (the other part was
ignored). Then I did the same for the newly created file "out-002.mkv", except this time the 1st 5M was
ignored, and the rest of that file's data was named "newslice". Then using a cmd line binary copy …

copy /b orig+newslice try.mkv

This created a file named try.mkv which was:

  • the first 5M of the original truncated sample and
  • the rest of the data > 5M of the newly created sample.

NOTE: You will need some kind of file splitter to do this, or a hex editor. Google it!
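
Alternatively, the split and the join can be done in one go with a few lines of script. This is a minimal sketch of the same idea, assuming "5M" means 5 × 1024 × 1024 bytes; any cut point works as long as it lies past the differing header bytes and inside the intact part of the truncated sample. The function name is made up for illustration.

```python
import os
import shutil
import tempfile

def splice(original_path, recreated_path, out_path, cut):
    """Write the first `cut` bytes of the original, then the recreated file from `cut` on."""
    with open(out_path, "wb") as out:
        with open(original_path, "rb") as original:
            head = original.read(cut)
            if len(head) < cut:
                raise ValueError("original file is shorter than the cut point")
            out.write(head)
        with open(recreated_path, "rb") as recreated:
            recreated.seek(cut)
            shutil.copyfileobj(recreated, out)

# Demo on tiny throwaway files with a cut point of 4 bytes.
with tempfile.TemporaryDirectory() as folder:
    orig = os.path.join(folder, "orig.bin")
    new = os.path.join(folder, "new.bin")
    joined = os.path.join(folder, "try.bin")
    with open(orig, "wb") as f:
        f.write(b"HEADdata")          # stands in for the truncated sample
    with open(new, "wb") as f:
        f.write(b"headdata-and-the-rest")  # stands in for the recreated sample
    splice(orig, new, joined, cut=4)
    with open(joined, "rb") as f:
        print(f.read())  # b'HEADdata-and-the-rest'
```

For the files on this page the call would be splice("bad.mkv", "out-002.mkv", "try.mkv", 5 * 1024 * 1024).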

This time, when I did a binary comparison of the truncated sample and the same number of bytes from the
"joined" mkv (try.mkv), there were no differences.

Test

As a final test, since I did have the actual original sample, I compared it to my final result:

Comparing files dragon.ball.z.broly.1993.720p.bluray.x264.sample-cinefile.mkv and try.mkv
        FC: no differences encountered

Epilogue

The final word: I believe it is possible to fully recreate a truncated mkv sample from the original source
mkv if it was created by mkvmerge, as long as you use the same mkvmerge version, and the same time and size
parameters. This also applies to a mkv sample that is corrupted (bad bytes), as long as the corruption
does not occur very early in the mkv.
