Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade makemhr workflow to build and upload utils #1026

Merged
merged 11 commits into from
Aug 8, 2024
Merged

Conversation

ThreeDeeJay
Copy link
Contributor

  • Includes makemhr, sofa-info, alrecord, altonegen, openal-info and allafplay Windows executables in utils.zip
  • Uses my libmysofa fork that fixed the Invalid format error by removing a filesize/density limit, so that makemhr is able to convert larger SOFA files (like EAC HRTFs with 5° elevations/azimuths)
    v1.3.2 (latest upstream)
    image
    My fork (just a two-liner so let me know if you'd rather fork and remove limit yourself or if I should revert to the limited official branch)
    image

I've had this branch for a while and was reminded of it here and then I recently saw 3b10c6f and wanted to try it out so I figured it'd be a good time to share this new workflow. Which by the way, is playback supposed to work or is it still a WIP? When I try the LAF files I posted here, I just get a channel breakdown and it immediately exits the process without playing any audio 🤔
image

- Includes makemhr, sofa-info, alrecord, altonegen, openal-info and allafplay Windows executables
- Uses my libmysofa fork that removes a filesize/density limit, so that makemhr is able to convert larger SOFA files
@kcat
Copy link
Owner

kcat commented Aug 4, 2024

  • Includes makemhr, sofa-info, alrecord, altonegen, openal-info and allafplay Windows executables in utils.zip

    • Uses my libmysofa fork that fixed the Invalid format error by removing a filesize/density limit, so that makemhr is able to convert larger SOFA files (like EAC HRTFs with 5° elevations/azimuths)
      v1.3.2 (latest upstream)
      ...
      My fork (just a two-liner so let me know if you'd rather fork and remove limit yourself or if I should revert to the limited official branch)
      ...

It's understandable to want to use a version that doesn't arbitrarily limit the supported size, but that particular change seems questionable. It's removing checks for the sizes being negative, and with larger values, may need overflow checks in various other places. Improperly constructed SOFA files may be erroneously detected as valid, and cause misbehavior without overflow checks.

Which by the way, is playback supposed to work or is it still a WIP? When I try the LAF files I posted here, I just get a channel breakdown and it immediately exits the process without playing any audio 🤔

It should work, yes. Odd that it seems to fail reading the channel header data (it's all 0s, when the E values should be -0 or -nan), despite correctly reading the header markers and the quality, mode, and track count values, which in turn causes it to fail to read the sample rate and sample count, aborting playback. I don't see why, the file is opened in binary mode and those files play for me; if there was a problem reading binary/non-text data through std::ifstream on Windows, I'd have expected a similar problem with OpenAL Soft itself loading mhr files, which it obviously does without issue.

Unrelated, but it also seems like my trick of calling std::terminate in a catch(...) block doesn't seem to actually report anything on Windows? No dialog box about an abnormal termination, nor a message about why it terminated?

@kcat kcat mentioned this pull request Aug 4, 2024
@ThreeDeeJay
Copy link
Contributor Author

ThreeDeeJay commented Aug 5, 2024

It's understandable to want to use a version that doesn't arbitrarily limit the supported size, but that particular change seems questionable. It's removing checks for the sizes being negative, and with larger values, may need overflow checks in various other places. Improperly constructed SOFA files may be erroneously detected as valid, and cause misbehavior without overflow checks.

Makes sense. I reverted it to the official repo though I updated it to v1.3.2 which has a slightly more generous limit.
So what if we just keep the negative size check, but expand the limit to around 4GB?
For reference a 1 degree azimuth/elevation EAC HRTF is >2GB so 96Khz might be double that, and I've seen other SOFAs that are GBs in size.
explorer_LceHBLufWm

It should work, yes. Odd that it seems to fail reading the channel header data (it's all 0s, when the E values should be -0 or -nan), despite correctly reading the header markers and the quality, mode, and track count values, which in turn causes it to fail to read the sample rate and sample count, aborting playback. I don't see why, the file is opened in binary mode and those files play for me; if there was a problem reading binary/non-text data through std::ifstream on Windows, I'd have expected a similar problem with OpenAL Soft itself loading mhr files, which it obviously does without issue.

Looks like 7a7350a did the trick because this time I did get audio playback 👍
image
Dolby's Universe Fury demo had never sounded this good to me with my HRTF, except, that there are really loud static noises at seemingly random times, which I just noticed also happen in Cavern when playing the LAF, but when playing the original file directly, I instead get a suppressed pop/clip. So there might be a conversion bug or the Cavern decoder is yet to address some data getting corrupted 🤔

EDIT: Welp, apparently the static noise is a known issue with DD+ JOC VoidXH/Cavern#17
But on the bright side, playing the LAF converted from ADM sounds pretty much flawless

Log
C:\Programs\Cavern\GUI>allafplay.exe "Nature's Fury.laf"
Opened "OpenAL Soft on Speakers (SB X-Fi Surround 5.1)"
Filename: "Nature's Fury.laf"
quality: 16-bit int
mode: objects
track count: 126
Track 0: E=-0, A=0 (LFE: 0)
Track 1: E=-0, A=0 (LFE: 0)
Track 2: E=-0, A=0 (LFE: 0)
Track 3: E=-0, A=0 (LFE: 1)
Track 4: E=-0, A=0 (LFE: 0)
Track 5: E=-0, A=0 (LFE: 0)
Track 6: E=-0, A=0 (LFE: 0)
Track 7: E=-0, A=0 (LFE: 0)
Track 8: E=-0, A=0 (LFE: 0)
Track 9: E=-0, A=0 (LFE: 0)
Track 10: E=-0, A=0 (LFE: 0)
Track 11: E=-0, A=0 (LFE: 0)
Track 12: E=-0, A=0 (LFE: 0)
Track 13: E=-0, A=0 (LFE: 0)
Track 14: E=-0, A=0 (LFE: 0)
Track 15: E=-0, A=0 (LFE: 0)
Track 16: E=-0, A=0 (LFE: 0)
Track 17: E=-0, A=0 (LFE: 0)
Track 18: E=-0, A=0 (LFE: 0)
Track 19: E=-0, A=0 (LFE: 0)
Track 20: E=-0, A=0 (LFE: 0)
Track 21: E=-0, A=0 (LFE: 0)
Track 22: E=-0, A=0 (LFE: 0)
Track 23: E=-0, A=0 (LFE: 0)
Track 24: E=-0, A=0 (LFE: 0)
Track 25: E=-0, A=0 (LFE: 0)
Track 26: E=-0, A=0 (LFE: 0)
Track 27: E=-0, A=0 (LFE: 0)
Track 28: E=-0, A=0 (LFE: 0)
Track 29: E=-0, A=0 (LFE: 0)
Track 30: E=-0, A=0 (LFE: 0)
Track 31: E=-0, A=0 (LFE: 0)
Track 32: E=-0, A=0 (LFE: 0)
Track 33: E=-0, A=0 (LFE: 0)
Track 34: E=-0, A=0 (LFE: 0)
Track 35: E=-0, A=0 (LFE: 0)
Track 36: E=-0, A=0 (LFE: 0)
Track 37: E=-0, A=0 (LFE: 0)
Track 38: E=-0, A=0 (LFE: 0)
Track 39: E=-0, A=0 (LFE: 0)
Track 40: E=-0, A=0 (LFE: 0)
Track 41: E=-0, A=0 (LFE: 0)
Track 42: E=-0, A=0 (LFE: 0)
Track 43: E=-0, A=0 (LFE: 0)
Track 44: E=-0, A=0 (LFE: 0)
Track 45: E=-0, A=0 (LFE: 0)
Track 46: E=-0, A=0 (LFE: 0)
Track 47: E=-0, A=0 (LFE: 0)
Track 48: E=-0, A=0 (LFE: 0)
Track 49: E=-0, A=0 (LFE: 0)
Track 50: E=-0, A=0 (LFE: 0)
Track 51: E=-0, A=0 (LFE: 0)
Track 52: E=-0, A=0 (LFE: 0)
Track 53: E=-0, A=0 (LFE: 0)
Track 54: E=-0, A=0 (LFE: 0)
Track 55: E=-0, A=0 (LFE: 0)
Track 56: E=-0, A=0 (LFE: 0)
Track 57: E=-0, A=0 (LFE: 0)
Track 58: E=-0, A=0 (LFE: 0)
Track 59: E=-0, A=0 (LFE: 0)
Track 60: E=-0, A=0 (LFE: 0)
Track 61: E=-0, A=0 (LFE: 0)
Track 62: E=-0, A=0 (LFE: 0)
Track 63: E=-0, A=0 (LFE: 0)
Track 64: E=-0, A=0 (LFE: 0)
Track 65: E=-0, A=0 (LFE: 0)
Track 66: E=-0, A=0 (LFE: 0)
Track 67: E=-0, A=0 (LFE: 0)
Track 68: E=-0, A=0 (LFE: 0)
Track 69: E=-0, A=0 (LFE: 0)
Track 70: E=-0, A=0 (LFE: 0)
Track 71: E=-0, A=0 (LFE: 0)
Track 72: E=-0, A=0 (LFE: 0)
Track 73: E=-0, A=0 (LFE: 0)
Track 74: E=-0, A=0 (LFE: 0)
Track 75: E=-0, A=0 (LFE: 0)
Track 76: E=-0, A=0 (LFE: 0)
Track 77: E=-0, A=0 (LFE: 0)
Track 78: E=-0, A=0 (LFE: 0)
Track 79: E=-0, A=0 (LFE: 0)
Track 80: E=-0, A=0 (LFE: 0)
Track 81: E=-0, A=0 (LFE: 0)
Track 82: E=-0, A=0 (LFE: 0)
Track 83: E=-0, A=0 (LFE: 0)
Track 84: E=-0, A=0 (LFE: 0)
Track 85: E=-0, A=0 (LFE: 0)
Track 86: E=-0, A=0 (LFE: 0)
Track 87: E=-0, A=0 (LFE: 0)
Track 88: E=-0, A=0 (LFE: 0)
Track 89: E=-0, A=0 (LFE: 0)
Track 90: E=-0, A=0 (LFE: 0)
Track 91: E=-0, A=0 (LFE: 0)
Track 92: E=-0, A=0 (LFE: 0)
Track 93: E=-0, A=0 (LFE: 0)
Track 94: E=-0, A=0 (LFE: 0)
Track 95: E=-0, A=0 (LFE: 0)
Track 96: E=-0, A=0 (LFE: 0)
Track 97: E=-0, A=0 (LFE: 0)
Track 98: E=-0, A=0 (LFE: 0)
Track 99: E=-0, A=0 (LFE: 0)
Track 100: E=-0, A=0 (LFE: 0)
Track 101: E=-0, A=0 (LFE: 0)
Track 102: E=-0, A=0 (LFE: 0)
Track 103: E=-0, A=0 (LFE: 0)
Track 104: E=-0, A=0 (LFE: 0)
Track 105: E=-0, A=0 (LFE: 0)
Track 106: E=-0, A=0 (LFE: 0)
Track 107: E=-0, A=0 (LFE: 0)
Track 108: E=-0, A=0 (LFE: 0)
Track 109: E=-0, A=0 (LFE: 0)
Track 110: E=-0, A=0 (LFE: 0)
Track 111: E=-0, A=0 (LFE: 0)
Track 112: E=-0, A=0 (LFE: 0)
Track 113: E=-0, A=0 (LFE: 0)
Track 114: E=-0, A=0 (LFE: 0)
Track 115: E=-0, A=0 (LFE: 0)
Track 116: E=-0, A=0 (LFE: 0)
Track 117: E=-0, A=0 (LFE: 0)
Track 118: E=-nan(ind), A=0 (LFE: 0)
Track 119: E=-nan(ind), A=0 (LFE: 0)
Track 120: E=-nan(ind), A=0 (LFE: 0)
Track 121: E=-nan(ind), A=0 (LFE: 0)
Track 122: E=-nan(ind), A=0 (LFE: 0)
Track 123: E=-nan(ind), A=0 (LFE: 0)
Track 124: E=-nan(ind), A=0 (LFE: 0)
Track 125: E=-nan(ind), A=0 (LFE: 0)
Channels: 118
Sample rate: 48000
Length: 5120000 samples (106.667 sec)

C:\Programs\Cavern\GUI>

Unrelated, but it also seems like my trick of calling std::terminate in a catch(...) block doesn't seem to actually report anything on Windows? No dialog box about an abnormal termination, nor a message about why it terminated?

Nope, nada.

@kcat
Copy link
Owner

kcat commented Aug 5, 2024

So what if we just keep the negative size check, but expand the limit to around 4GB? For reference a 1 degree azimuth/elevation EAC HRTF is >2GB so 96Khz might be double that, and I've seen other SOFAs that are GBs in size.

Looks like it may need an overflow check too. Something like

if(elements <= 0 || size <= 0 || INT_MAX/size > elements)
    ... error ...

I don't know how this check in specific relates to the file or dataset size, but at least the result of elements * size can't go far above 2.1 billion, or else it'll overflow and become undefined behavior. Expanding that would need more work to avoid signed ints, where those kinds of changes tend to need farther reaching changes since those variables may interact with other variables that may not be happy when they're different types.

Looks like 7a7350a did the trick because this time I did get audio playback 👍

That's quite odd. It shouldn't have affected reading, basically just replacing terminating asserts with exceptions that skip just the current file.

EDIT: Welp, apparently the static noise is a known issue with DD+ JOC VoidXH/Cavern#17 So I wonder how this person managed to get a clean sound

This demo was converted from DD+ E-AC-3 JOC to LAF with Cavern and then rendered using OpenAL Soft's Default HRTF

I may not know what I'm talking about, or misunderstanding what I thought I read, but IIRC, support for pure DD+ JOC was a problem, while E-AC-3 JOC worked better. Or maybe that was TrueHD with Atmos that didn't work, while DD+/E-AC-3 with Atmos did.

On the bright side, playing the LAF converted from ADM (invite) works beautifully

That's the kind of thing I expected to see from LAF files converted from Atmos. Rather than the other LAF file that had bursts of static, which seems to just be a 7.1.4 mix (that doesn't even mark the LFE channel properly), that one there uses a fixed channel bed (with a properly marked LFE channel) and several dozen dynamic objects. It does still seem to use left-handed coordinates though (+Z is front), which isn't correct according to the LAF documentation.

It does make me think the LAF player should try to be more efficient with source playback, stopping sources that are disabled in chunk headers, and starting them when enabled with more audio. Doing that would make it more efficient when multiple channels are disabled/silent most of the time. It would require the AL_SOFT_source_start_delay extension to make sure sources are restarted at the proper time synchronized with other sources (currently no extensions are required, outside of float32 and int32 for files that use float32 or int24 samples respectively), and if there's any resampling going on, synchronization will be extra tricky to account for sub-sample offsets.

ThreeDeeJay added a commit to ThreeDeeJay/libmysofa that referenced this pull request Aug 8, 2024
@ThreeDeeJay
Copy link
Contributor Author

Looks like it may need an overflow check too.

Done ThreeDeeJay@f3d7487, though the =<5° EAC SOFA still fails to convert with the limit.
Also, the CI now clones a specific commit, for stability reasons, and I added commit date, count and hash to the workflow artifact to avoid duplicate confusion when debugging

I may not know what I'm talking about, or misunderstanding what I thought I read, but IIRC, support for pure DD+ JOC was a problem, while E-AC-3 JOC worked better. Or maybe that was TrueHD with Atmos that didn't work, while DD+/E-AC-3 with Atmos did.

Yeah, what I do know is that TrueHD can't be implemented at the moment.
So I've been looking into the process to at least mix to 9.1.6 so maybe I can convert it to ADM then into LAF, since source ADM tracks are extremely scarce. And apparently coarse JOC is what causes decoding artifacts:

There are two types of JOC coding, coarse and sparse. Coarse, which is widely used by movies, works without issues, sparse (found in some versions of amaze) doesn't, as the documentation is incorrect about it
VoidXH/Cavern#17 (comment)

@kcat
Copy link
Owner

kcat commented Aug 8, 2024

Wait, sorry. That check should be

if(elements <= 0 || size <= 0 || elements > INT_MAX/size)
  return MYSOFA_INVALID_FORMAT;

@ThreeDeeJay
Copy link
Contributor Author

Thanks, that did the trick! 👌
By the way why does it sometimes use less than the total IRs? 🤔
image

@kcat
Copy link
Owner

kcat commented Aug 8, 2024

By the way why does it sometimes use less than the total IRs? 🤔

It ignores IRs that don't align with the detected uniform layout. The IR layout for the mhr file needs to have fixed stepping along the elevation (starting at +90 degrees and stepping down to -90 degrees), and fixed stepping along the azimuth for each valid elevation (starting at 0 degrees, directly in front, and going around a full 360 degrees).

@kcat kcat merged commit 91a4a0d into kcat:master Aug 8, 2024
11 checks passed
@ThreeDeeJay
Copy link
Contributor Author

Nice 👍
By the way, you might wanna delete https://github.com/kcat/openal-soft/releases/tag/makemhr since that release is now obsolete.

@kcat
Copy link
Owner

kcat commented Aug 9, 2024

I may not know what I'm talking about, or misunderstanding what I thought I read, but IIRC, support for pure DD+ JOC was a problem, while E-AC-3 JOC worked better. Or maybe that was TrueHD with Atmos that didn't work, while DD+/E-AC-3 with Atmos did.

Yeah, what I do know is that TrueHD can't be implemented at the moment. So I've been looking into the process to at least mix to 9.1.6 so maybe I can convert it to ADM then into LAF, since source ADM tracks are extremely scarce. And apparently coarse JOC is what causes decoding artifacts:

There are two types of JOC coding, coarse and sparse. Coarse, which is widely used by movies, works without issues, sparse (found in some versions of amaze) doesn't, as the documentation is incorrect about it
VoidXH/Cavern#17 (comment)

I'm curious if ffmpeg's code could help at all. It seems to handle decoding TrueHD tracks just fine, "truehd (Dolby TrueHD + Dolby Atmos)" even. "eac3 (Dolby Digital Plus + Dolby Atmos)" also plays fine, without any glitching. Interestingly, it reports and provides such tracks as 7.1, which seem to have everything in the mix. I can't do a deeper look to see if it's 7.1(.4) premixed or something that it's giving with a simple downmix, or if it has dynamic objects being moved and mixed in real-time by its decoder.

But on the bright side, playing the LAF converted from ADM sounds pretty much flawless

Also interestingly, this LAF doesn't seem too well optimized. Each chunk always has all tracks enabled, when I'm sure many of them are silent for long stretches of time. I also wonder if there's room to optimize track utilization (if two or more tracks don't play sound at the same time, they can be on the same track; you'd only need as many tracks as are played simultaneously). Those would help cut down on the LAF file size pretty well, I think.

@ThreeDeeJay
Copy link
Contributor Author

I'm curious if ffmpeg's code could help at all. It seems to handle decoding TrueHD tracks just fine, "truehd (Dolby TrueHD + Dolby Atmos)" even. "eac3 (Dolby Digital Plus + Dolby Atmos)" also plays fine, without any glitching. Interestingly, it reports and provides such tracks as 7.1, which seem to have everything in the mix. I can't do a deeper look to see if it's 7.1(.4) premixed or something that it's giving with a simple downmix, or if it has dynamic objects being moved and mixed in real-time by its decoder.

Last I checked, ffmpeg could just mux/identify the TrueHD Atmos track and decode the 7.1 "core" surround, but not heights or objects (which I think are encoded into the 7.1 track with metadata), though some figured out how to render up to 9.1.6 using a script or Music Media Helper, both of which rely on Dolby Reference Player/Encoder

According to a user in our server, DDPA was implementable in FOSS since there is an open spec (ETSI EAC object carriage). Decoding THDA is not implementable in FOSS since there is no open spec, unless you get the proprietary software DRP to do the job, which is what the script does.
https://cavern.sbence.hu/cavern/doc.php?p=Cavernize#cd click on "Codec disclaimer"

Also interestingly, this LAF doesn't seem too well optimized. Each chunk always has all tracks enabled, when I'm sure many of them are silent for long stretches of time. I also wonder if there's room to optimize track utilization (if two or more tracks don't play sound at the same time, they can be on the same track; you'd only need as many tracks as are played simultaneously). Those would help cut down on the LAF file size pretty well, I think.

I forwarded it to the dev since tbh I'm not really familiar with the Cavern codebase (I'm just the CI guy there too lol), so feel free to join the conversation here, or via a new issue/PR 👀👌

@VoidXH
Copy link

VoidXH commented Aug 10, 2024

Also interestingly, this LAF doesn't seem too well optimized. Each chunk always has all tracks enabled, when I'm sure many of them are silent for long stretches of time. I also wonder if there's room to optimize track utilization (if two or more tracks don't play sound at the same time, they can be on the same track; you'd only need as many tracks as are played simultaneously). Those would help cut down on the LAF file size pretty well, I think.

Thanks for the implementation! When a LAF is exported by converting an E-AC-3 file, even if noise is present, the track will be active. One track is one object from the source. You could optimize it by clustering or gating the PCM signals, but that wouldn't be lossless.

hoene pushed a commit to hoene/libmysofa that referenced this pull request Sep 17, 2024
* Remove file size limit

* Restore limit with an overflow check

Suggested in kcat/openal-soft#1026 (comment)

* Fixed SOFA limit

* Restore original intendation

* Restore coding style spacing

* Fix "undeclared identifier" error

Error:
"libmysofa/src/hdf/btree.c:269:48: note: each undeclared identifier is reported only once for each function it appears in
269 |   if (elements <= 0 || size <= 0 || elements > INT_MAX/size)
This is probably fixable by adding #include <limits.h>"

* Create build.yml

* Remove workflow belonging to a different branch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants