Lavrresample always performs resampling process, leading to non-bitexact output #4716

Closed
macdavis opened this issue Aug 5, 2017 · 19 comments

Comments

@macdavis

macdavis commented Aug 5, 2017

mpv version and platform

mpv git 56742ec
MacOS 10.12.6 (16G29)

Reproduction steps

Problem 1

  1. Play this DTS-HD MA sample with -ao=pcm -no-config -ao-pcm-waveheader=yes -ao-pcm-file=/Volumes/RamDisk/dts_hd_ma_mpv.wav
  2. Convert the sample to FLAC with FFmpeg (ffmpeg -i 16_48_2.0.dtshd -acodec flac dts_ha_ma_ffmpeg.flac)
  3. Compare WAV (generated by mpv) with FLAC (generated by FFmpeg) in Adobe Audition

Problem 2
Play this 8-channel FLAC sample with -ao=pcm -no-config -ao-pcm-waveheader=yes -audio-channels=stereo -ao-pcm-file=/Volumes/RamDisk/stereo_downmix_mpv.wav

Expected behavior

Problem 1
In Adobe Audition, Amplitude Statistics should be identical.
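
For readers without Adobe Audition, a minimal sketch of a stricter check than comparing statistics: test whether two files carry identical PCM data. It assumes both outputs have first been decoded to plain PCM WAV files that Python's `wave` module can read (some Python versions reject WAV files with an extensible header); the helper names are hypothetical.

```python
import wave

def read_pcm(path):
    """Return (channels, bytes per sample, sample rate, raw PCM bytes) of a WAV file."""
    with wave.open(path, "rb") as w:
        return (w.getnchannels(), w.getsampwidth(), w.getframerate(),
                w.readframes(w.getnframes()))

def is_bitexact(path_a, path_b):
    """True only if both files share the same format and identical PCM bytes."""
    return read_pcm(path_a) == read_pcm(path_b)
```

Identical amplitude statistics are necessary but not sufficient for bit-exactness, so a byte-level comparison like this is the more direct test.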

Problem 2
The original FLAC is 24-bit. mpv should also produce 24-bit WAV.

Actual behavior

Problem 1
Amplitude Statistics are different.

WAV (mpv)

Statistic	Left	Right
Peak Amplitude:	0.00 dB	0.00 dB
True Peak Amplitude:	0.75 dBTP	0.37 dBTP
Maximum Sample Value:	32767	32767
Minimum Sample Value:	-32768	-32768
Possibly Clipped Samples:	471	2767
Total RMS Amplitude:	-11.66 dB	-10.73 dB
Maximum RMS Amplitude:	-5.74 dB	-4.15 dB
Minimum RMS Amplitude:	-57.47 dB	-71.95 dB
Average RMS Amplitude:	-15.03 dB	-16.66 dB
DC Offset:	-1.05 %	-0.80 %
Measured Bit Depth:	16	16
Dynamic Range:	51.74 dB	67.80 dB
Dynamic Range Used:	50.90 dB	63.55 dB
Loudness (Legacy):	-11.05 dB	-6.16 dB
Perceived Loudness (Legacy):	-5.13 dB	-3.98 dB
ITU-R BS.1770-3 Loudness: -8.21 LUFS

FLAC (FFmpeg)

Statistic	Channel 1	Channel 2
Peak Amplitude:	0.00 dB	0.00 dB
True Peak Amplitude:	0.73 dBTP	0.69 dBTP
Maximum Sample Value:	32767	32767
Minimum Sample Value:	-32768	-32768
Possibly Clipped Samples:	806	4520
Total RMS Amplitude:	-11.67 dB	-10.74 dB
Maximum RMS Amplitude:	-5.71 dB	-4.14 dB
Minimum RMS Amplitude:	-57.61 dB	-70.44 dB
Average RMS Amplitude:	-15.10 dB	-16.72 dB
DC Offset:	-1.05 %	-0.80 %
Measured Bit Depth:	16	16
Dynamic Range:	51.90 dB	66.30 dB
Dynamic Range Used:	51.10 dB	62.70 dB
Loudness (Legacy):	-10.57 dB	-6.20 dB
Perceived Loudness (Legacy):	-5.14 dB	-4.03 dB
ITU-R BS.1770-3 Loudness: -8.33 LUFS

Problem 2
The original FLAC is 24-bit, but mpv produces a 32-bit WAV.

Log file

Problem 1
Log file: http://sprunge.us/CBDC

Problem 2
Log: http://sprunge.us/VLER

Sample files

Problem 1
DTS-HD MA sample

Problem 2
8 channels FLAC sample

@kevmitch
Member

kevmitch commented Aug 5, 2017

There appear to be two problems:

  1. mpv cuts off the first 1024 samples of the dtshd sample
  2. even after offsetting for this, the signal is still not lossless

The output of --ao=lavc is identical to that of --ao=pcm, so it doesn't look like the problem is in writing the file.
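
A minimal sketch of the "offset, then compare" check described above, assuming both signals are already available as flat lists of integer samples for one channel. The function name is hypothetical, and the default of 1024 is simply the number of cut-off samples reported in this comment.

```python
def count_mismatches(reference, candidate, skipped=1024):
    """Count differing samples after dropping `skipped` leading reference samples.

    `reference` is the full decode (e.g. from FFmpeg); `candidate` is the
    decode that is missing its first `skipped` samples.
    """
    aligned = reference[skipped:]
    n = min(len(aligned), len(candidate))
    return sum(1 for i in range(n) if aligned[i] != candidate[i])
```

A result of 0 means the overlapping part is bit-exact; any other value means the signal was altered beyond the missing samples.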

@macdavis
Author

macdavis commented Aug 6, 2017

@kevmitch

Thanks for testing. Could you please also try this sample (24-bit / 96 kHz / 7.1, <DTS_DELAY> = 0)? Perhaps the cut-off is due to the file itself: the original sample has <DTS_DELAY> : 1024 in its metadata. The sample uploaded here should have no delay (Foobar 2000 shows <DTS_DELAY> : 0). Note that playing this 24-bit DTS sample results in a 32-bit WAV (instead of 24-bit).

I also tried TrueHD: no problem there. (Amplitude Statistics shows that the WAV is 24-bit and bitexact.)

@kkkrackpot
Contributor

Tried the second sample on Linux:

mpv 0.26.0-95-ga680c643e-dirty (C) 2000-2017 mpv/MPlayer/mplayer2 projects
 built on Sat Aug  5 22:24:53 MSK 2017
ffmpeg library versions:
   libavutil       55.68.100
   libavcodec      57.102.100
   libavformat     57.76.100
   libswscale      4.7.101
   libavfilter     6.95.100
   libswresample   2.8.100
ffmpeg version: N-86848-g03a9e6ff30

mpv's WAV https://0x0.st/dHD.wav
ffmpeg's FLAC https://0x0.st/dHk.flac
Maybe it can help somehow.
Unfortunately, I don't have any Adobe Audition or similar...

@kevmitch
Member

kevmitch commented Aug 6, 2017

Yeah, the 24/96/7.1 sample is now correctly aligned, but the samples still differ. Looking at mpv's audio filter chain, I see that lavrresample is inserted in order to convert from planar (which comes out of ffmpeg's decoder) to interleaved for output. There's no reason why that shouldn't be lossless, but maybe something strange is going on.

I guess I'll have to try dumping the raw data at various stages in the code to see where it's getting altered.
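
To illustrate why planar-to-interleaved conversion should be lossless, here is a minimal sketch in plain Python (not mpv or FFmpeg code; the function names are hypothetical). The conversion only reorders samples and never computes new values, so a round trip must return the input unchanged.

```python
def interleave(planes):
    """[[L0, L1, ...], [R0, R1, ...]] -> [L0, R0, L1, R1, ...]"""
    return [sample for frame in zip(*planes) for sample in frame]

def deinterleave(samples, channels):
    """[L0, R0, L1, R1, ...] -> [[L0, L1, ...], [R0, R1, ...]]"""
    return [samples[c::channels] for c in range(channels)]
```

If a converter's output differs from this kind of pure reordering, some other processing step (such as resampling) must be involved.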

@macdavis
Author

macdavis commented Aug 6, 2017

@kevmitch It's quite possible that mpv's planar to interleaved conversion is not lossless.
Here is what happens:

I converted this FLAC sample to NUT (pcm_s16le_planar) and NUT (pcm_s16le). Playing pcm_s16le (Log: http://sprunge.us/ACXc) is bitexact while playing pcm_s16le_planar (Log: http://sprunge.us/XJIX) is not.

Update:
Workaround: Using af=lavfi="aformat=sample_fmts=s16|s32:channel_layouts=stereo"

@kevmitch
Member

kevmitch commented Aug 6, 2017

If I change the lavresample option cutoff, the output changes. This suggests that the signal is needlessly getting resampled to the same rate. I'll have to look in the ffmpeg code to see how they manage to deplanarize without resampling.

@macdavis macdavis changed the title from "DTS-HD MA decoding is not bitexact (different from FFmpeg)" to "Lavrresample always performs resampling process, leading to non-bitexact output" on Aug 7, 2017
@kkkrackpot
Contributor

Audiophiles will kill someone for it...

@macdavis Does your workaround fix the issue? Does it work with multichannel too? On my system I send multichannel PCM to a soundbar that seems to accept (almost) all formats.

@macdavis
Author

@fhlfibh
You can first check mpv's log. If there is no Lavrresample inserted anywhere, you don't need this workaround because mpv's output is still bitexact.
Yes, the workaround fixes this issue and works with multichannel as well. If your hardware only accepts interleaved formats, just use af=lavfi="aformat=sample_fmts=s16|s32" for multichannel. In my case, my hardware only accepts stereo in an interleaved format, so I need both format conversion and downmixing done by libavfilter to bypass mpv's internal conversion.

@roberth1990

@macdavis

Could you post your libavfilter configuration?

@macdavis
Author

@roberth1990
I use af=lavfi="aformat=sample_fmts=s16|s32:channel_layouts=stereo". Note that during downmixing, libavfilter adds additional headroom by default to prevent clipping, so it will sound quieter.
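
A rough sketch of why a normalized downmix is quieter. This is an illustration only, not libavfilter's actual mixing matrix: the function name and the coefficients used in the example are assumptions chosen to make the arithmetic easy to follow.

```python
def mix_frame(frame, coeffs, normalize):
    """Mix one multichannel frame (floats in [-1.0, 1.0]) into one output sample.

    With normalize=True the coefficients are scaled so their absolute values
    sum to 1.0, which guarantees the result cannot exceed full scale.
    """
    if normalize:
        total = sum(abs(c) for c in coeffs)
        coeffs = [c / total for c in coeffs]
    return sum(c * s for c, s in zip(coeffs, frame))
```

With three full-scale inputs and the assumed coefficients `[1.0, 0.5, 0.5]`, the unnormalized mix reaches 2.0 (which would clip), while the normalized mix stays at 1.0. The price is that every ordinary sample is also scaled down, hence the quieter output.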

@roberth1990

@macdavis
Thank you! That setup works much better than any other setup I have tried to avoid distortion/clipping on some difficult audio tracks without losing too much dynamic range.

@kkkrackpot
Contributor

@macdavis Tried your workaround, but it looks like it just added more mess http://sprunge.us/KSeW
It seems on my system it always inserts lavrresample, for one reason or another...
Anyway, it's better to have lavrresample itself working correctly.

@ghost

ghost commented Aug 11, 2017

I'm not sure what's going on here (and I didn't read most of the issue), but mpv code in af_lavrresample clamps float values to range. This is for making sure non-normalized downmixing does not output out of range values, which in turn could lead to unpredictable behavior in AOs. On the other hand it sounds like floats are not involved?

Also, I guess avresample_set_compensation() might force reinit to resampling unnecessarily.
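
For reference, a minimal sketch of the kind of clamping mentioned above. This is not the af_lavrresample code: the function name is hypothetical and the nominal float range of [-1.0, 1.0] is an assumption.

```python
def clamp_samples(samples, lo=-1.0, hi=1.0):
    """Limit every float sample to the range [lo, hi]."""
    return [min(hi, max(lo, s)) for s in samples]
```

Clamping leaves in-range samples untouched, so on its own it would not explain differences in a signal that never exceeds full scale.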

@macdavis
Author

@fhlfibh
The log says the surround configuration of the source audio is side left and side right, but that of your hardware is back left and back right. That's why lavrresample is inserted (Remix: 5.1(side) -> 5.1 Fudge: sl-sr -> bl-br).

Try af=lavfi="aformat=sample_fmts=s16|s32:channel_layouts=5.1". That may solve your problem.

@macdavis
Author

@wm4

Also, I guess avresample_set_compensation() might force reinit to resampling unnecessarily.

Yeah, that's the issue I wanna report. Unnecessary resampling process deteriorates the sound quality. Downmixing and planar to interleaved conversion shouldn't have triggered resampling.

On the other hand it sounds like floats are not involved?

No, it's not about floating point issue. I didn't test floating point.

@ghost

ghost commented Aug 12, 2017

I would have expected that swr_set_compensation() (which avresample_set_compensation is defined as) does not enable resampling when it's not needed. But I guess swr doesn't agree.

@ghost ghost closed this as completed in baead23 Aug 12, 2017
@kevmitch
Member

Thanks @wm4 that is now bit-exact for --ao=pcm.

@macdavis I see you've altered the original issue to talk about getting s32. You should probably have opened a separate issue for this as significant editing of posts is generally frowned upon since people receive only the initial post via email.

In any case, this is expected since neither ffmpeg nor mpv has an internal representation for packed s24. Instead, s32 is used with the least significant bits set to 0. Unfortunately, there is currently no way for mpv to differentiate between true s32 and s24 in s32, so --ao=pcm just outputs the samples exactly as they're stored. This is still lossless.
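
A small sketch of why storing s24 in s32 is lossless, using plain Python integers (hypothetical helper names, not ffmpeg or mpv code). The 24-bit value is shifted into the upper 24 bits, and shifting back recovers it exactly.

```python
def s24_to_s32(sample):
    """Place a signed 24-bit sample in the upper 24 bits of a 32-bit value."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample does not fit in 24 bits")
    return sample << 8  # the low 8 bits become zero

def s32_to_s24(sample):
    """Recover the 24-bit sample; the arithmetic shift keeps the sign."""
    return sample >> 8

def looks_like_s24_in_s32(samples):
    """Heuristic: every sample has its low 8 bits set to zero."""
    return all(s & 0xFF == 0 for s in samples)
```

The last helper is only a heuristic: a genuine s32 stream could also happen to have zero low bytes, which is the ambiguity described above.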

@roberth1990 what you want is --audio-normalize-downmix=yes. This currently defaults to no in mpv, because people constantly complained that yes was quieter than VLC.

@macdavis
Author

@kevmitch Thanks for your detailed explanation and sorry for the confusion. Next time, I will open a separate issue instead.

@macdavis
Author

macdavis commented Jun 8, 2019

@kevmitch I am a bit confused about the alignment on MacOS.

Core Audio defines that, in the unpacked case, the 24 bits are aligned high within the 4-byte field, so that a parser can treat the value as if it were a 32-bit integer (with the lowest, or least significant, 8 bits all zero). On disk, the little-endian version of this data format looks like this:
00 LL XX MM
where MM is the most significant byte and LL is the least significant.
A big-endian version of 24-bit PCM audio in 4 bytes looks like this:
MM XX LL 00

On MacOS, the 24-bit aligned-high format matches mpv's s32 format. However, my DAC's format (which is also the AO format) is 24 Bit Signed Integer Aligned Low in 32 Bit. For my DAC, are the least significant 8 bits or the most significant 8 bits zeroes? I am worried about inconsistent high/low alignment between mpv and my DAC, which would mean that truncation discards 8 bits containing valid information instead of discarding zeroes.
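
To make the two layouts concrete, here is a small sketch that prints nothing and answers nothing about any particular DAC or about Core Audio's own conversion; it only shows the little-endian byte order of each alignment. The helper names are hypothetical, and sign-extending the unused top byte in the aligned-low case is an assumption.

```python
def aligned_high(sample24):
    """24-bit sample in the upper 3 bytes; little-endian bytes: 00 LL XX MM."""
    return (sample24 << 8).to_bytes(4, "little", signed=True)

def aligned_low(sample24):
    """24-bit sample in the lower 3 bytes; top byte is sign extension (assumed)."""
    return sample24.to_bytes(4, "little", signed=True)

def high_to_low(raw):
    """Convert an aligned-high field to aligned-low by dropping the zero low byte."""
    value = int.from_bytes(raw, "little", signed=True)
    return (value >> 8).to_bytes(4, "little", signed=True)
```

For the sample 0x123456, `aligned_high` gives the bytes 00 56 34 12 and `aligned_low` gives 56 34 12 00. A conversion like `high_to_low` discards only the zero byte, so no audio information is lost as long as the converter knows which alignment the input uses.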

This issue was closed.