Apple Music: Fetch original file format #80

Closed
ROpdebee opened this issue Oct 16, 2021 · 6 comments · Fixed by #142 or #327

@ROpdebee
Owner

IMU maximises all Apple Music/iTunes covers to a PNG version. However, the source image might not be a PNG. We should add an exception to the maximisation so that we don't unnecessarily consume more bandwidth and storage for a "lossless" version of a lossy source.

For Apple Music, we should be able to determine whether the source image is a JPEG through its URL, see https://beta.musicbrainz.org/edit/81901201. However, old iTunes-style cover art URLs don't contain that filename. The only way to still get those links, AFAIK, is through the old iTunes API. However, a-tisket still uses this old iTunes API, so seeding covers from there will always lead to a PNG unless we skip maximising altogether, which doesn't seem like the right solution. I also doubt we can determine the correct Apple Music URL from an iTunes cover URL. Instead, we should let the a-tisket seeder seed the Apple Music/iTunes release URL which can get processed by the Apple Music provider to determine the Apple Music cover art URL, which then has the original filename through which we can identify the original file format.
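As a rough illustration of the filename check described above, here is a minimal Python sketch. It assumes that Apple Music thumbnail URLs end in `<original filename>/<size spec>.<ext>`, so the original filename is the second-to-last path segment; that layout, the function names, and the example URLs are assumptions for illustration, not taken from the userscript itself.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

JPEG_SUFFIXES = {".jpg", ".jpeg"}


def original_filename(thumb_url: str) -> Optional[str]:
    """Return the original filename embedded in a thumbnail URL, if any.

    Assumed layout: .../<original filename>/<size spec>.<ext>.
    Returns None when the second-to-last segment has no file extension,
    which is what old iTunes-style URLs are expected to look like.
    """
    segments = [s for s in urlparse(thumb_url).path.split("/") if s]
    if len(segments) < 2:
        return None
    candidate = segments[-2]
    return candidate if PurePosixPath(candidate).suffix else None


def source_is_jpeg(thumb_url: str) -> Optional[bool]:
    """True or False when the source format is known, None when it is not."""
    name = original_filename(thumb_url)
    if name is None:
        return None
    return PurePosixPath(name).suffix.lower() in JPEG_SUFFIXES
```

Returning `None` for URLs without a filename keeps the "unknown" case separate from "not a JPEG", so a caller can choose its own default for old iTunes-style links.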

All of that is possible, but we should be very careful with cached a-tisket pages. If we seed the release URL itself, the provider will grab the current cover art, even though the cached page may display old cover art which was replaced. So for cached pages, we should fall back to seeding the original cover art URL.

For any direct iTunes cover art links where we cannot or should not use the Apple Music release page to find the Apple Music cover art URL, we should maybe use JPEG images by default instead of PNG.
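The seeding rule proposed in this comment could be sketched roughly as follows. This is only an illustration of the decision, not the a-tisket seeder's actual code; the `SeedInput` type, its fields, and the `example.com` URLs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeedInput:
    """Hypothetical description of what the seeder knows about a release."""
    release_url: Optional[str]  # Apple Music/iTunes release page, if known
    cover_url: str              # direct cover art URL shown on the page
    page_is_cached: bool        # True when the a-tisket page is a cached copy


def url_to_seed(info: SeedInput) -> str:
    """Pick the URL to seed.

    Cached pages may show cover art that has since been replaced, so they
    fall back to the direct cover art URL. Otherwise the release URL is
    preferred so the provider can look up the original filename.
    """
    if info.page_is_cached or info.release_url is None:
        return info.cover_url
    return info.release_url
```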

@ROpdebee ROpdebee added enhancement New feature or request mb_enhanced_cover_art_uploads help wanted Extra attention is needed labels Oct 16, 2021
@ROpdebee
Owner Author

Actually, on closer inspection, I think we should still use PNG. The visual differences are negligible, but they're still there, apparently: qsniyg/maxurl#393.

Taking this image, for example, of which the source image is supposedly JPEG according to the file name, here's the diff between the PNG and the JPEG. It's mostly minute color differences, but JPEG compression at quality level 100 still reduces color information, so I'm guessing the JPEG image we get from Apple Music is recompressed from an already compressed image, leading to rounding errors and slight loss of color information. The PNG is (should be) lossless, so it is (should be) identical to the source image. As long as we cannot get the actual original image from Apple Music, we should use the PNG to make sure we get at least an accurate representation of the original.

I'm no expert on image encoding at all, so I'll leave this open for now. If someone can chime in with additional information, that'd be great.

@david-russo

david-russo commented Oct 18, 2021

You pretty much got it all correct, although at level '100' and above (e.g. 99999x0w-100.jpg) Apple's thumbnail generator doesn't subsample the source image's colour/chrominance information, but stores it at full resolution (i.e. 4:4:4 — which may well be higher than the source JPEG, since subsampling is very common, if not standard), so the differences between 99999x0w-100.jpg and 99999x0w-100.png should just be due to rounding.
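To make the rounding point concrete, the sketch below converts 8-bit RGB colours to 8-bit YCbCr and back using the JFIF conversion formulas, then counts how many colours change. It covers only the colour conversion step, not the DCT or quantisation, and whether Apple's encoder rounds in exactly this way is an assumption.

```python
import math


def clamp8(x: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, math.floor(x + 0.5)))


def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple:
    """JFIF full-range RGB -> YCbCr, rounded to 8 bits per channel."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr_to_rgb(y: int, cb: int, cr: int) -> tuple:
    """JFIF full-range YCbCr -> RGB, rounded to 8 bits per channel."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


def round_trip_stats(step: int = 17) -> tuple:
    """Return (total, changed, max channel error) over a grid of RGB colours."""
    total = changed = max_err = 0
    for r in range(0, 256, step):
        for g in range(0, 256, step):
            for b in range(0, 256, step):
                back = ycbcr_to_rgb(*rgb_to_ycbcr(r, g, b))
                err = max(abs(r - back[0]), abs(g - back[1]), abs(b - back[2]))
                total += 1
                changed += err > 0
                max_err = max(max_err, err)
    return total, changed, max_err


if __name__ == "__main__":
    total, changed, max_err = round_trip_stats()
    print(f"{changed} of {total} colours changed, max channel error {max_err}")
```

Even with full-resolution chroma, some colours do not survive the round trip, but the per-channel error stays very small, which is consistent with the differences being described as rounding.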

As you noted, the differences are extremely subtle and, even for someone who knows what to look for, are only noticeable upon very close inspection (and generally with the aid of image manipulation software). If preserving that practically imperceptible difference is considered worth the considerable increase in resources (storage, backup, network) that across-the-board lossless compression entails, then PNG is the way to go.

Personally, I'm a fan of lossless compression if it's lossless from the source, but if we're talking about an image that may have been a 14-MB PNG at one point but was exported and given to Apple as a 6-MB JPEG to use as their source — with the 8 MBs of permanent information loss which that entails — then I don't see a lot of benefit in redoubling its resource footprint just to preserve the previous rounding errors.

There is certainly something to be said here about generational loss (where rounding errors can compound over multiple lossy generations as one gets rounding errors of rounding errors), but if we're gathering these images to permanently store in an archive to then be the single source for any future image generations/derivatives, then the question is still just 'Is this one extra generation of (arguably imperceptible) rounding errors acceptable?'

@ROpdebee
Owner Author

Many thanks for the detailed write-up!

> Is this one extra generation of (arguably imperceptible) rounding errors acceptable?

Personally, I'm not sure. The extreme digital archivist in me says "unacceptable", but the realist says that the costs vastly outweigh the benefits. I've seen plenty of huge PNGs served by Apple, some going up to 30 MB. Uploading those takes quite a while even on a decent-ish connection, and they take up a lot of unnecessary storage space at the Internet Archive/Cover Art Archive. If it were a question of "original or smaller with imperceptible differences", I'd probably prefer the original, but since we don't have access to the original (as far as I know), using the smaller JPEG seems like the better option if the source is also a JPEG.

@david-russo

david-russo commented Oct 18, 2021

Certainly, if we could get the original and avoid an extra generation entirely (lossy or lossless), that would be the ideal from a quality and storage perspective. I'm super sympathetic to Extreme Digital Archiving, but the cost of the diminished returns in this case seems unjustifiably high. (Yet this doesn't stop my inner archivist from also trying to make me feel guilty about it.)

@ROpdebee ROpdebee self-assigned this Oct 20, 2021
ROpdebee added a commit that referenced this issue Oct 20, 2021
When the source image on Apple Music is JPEG, return the maximised
image as JPEG. If it's not JPEG, always use PNG. See #80 for
discussion.
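The behaviour described in this commit message might look roughly like the Python sketch below. It is not the code from the referenced commits; the function name and the assumed URL layout (`.../<original filename>/<size spec>.<ext>`) are illustrative, and only the `99999x0w-100` size spec is taken from the discussion above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse, urlunparse

MAX_SIZE_SPEC = "99999x0w-100"  # size/quality spec quoted in the thread


def maximised_url(thumb_url: str) -> str:
    """Rewrite the final path segment to the maximised size.

    Uses .jpg when the original filename (assumed to be the second-to-last
    path segment) is a JPEG, and .png in every other case.
    """
    parts = urlparse(thumb_url)
    segments = parts.path.split("/")
    if len(segments) < 2:
        raise ValueError(f"unexpected thumbnail URL: {thumb_url}")
    suffix = PurePosixPath(segments[-2]).suffix.lower()
    extension = "jpg" if suffix in {".jpg", ".jpeg"} else "png"
    segments[-1] = f"{MAX_SIZE_SPEC}.{extension}"
    return urlunparse(parts._replace(path="/".join(segments)))
```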
ROpdebee added a commit that referenced this issue Oct 21, 2021
When the source image on Apple Music is JPEG, return the maximised
image as JPEG. If it's not JPEG, always use PNG. See #80 for
discussion.
@kellnerd
Collaborator

qsniyg/maxurl#962

Apparently someone has found a way to get the Apple source images. I checked a few albums with https://artwork.themoshcrypt.net/, and the source JPEGs (HD Source) were smaller than the blown-up 99999x0w-100.jpg JPEGs that we are currently using. These are probably indeed the untouched original images.

@ROpdebee
Owner Author

ROpdebee commented Dec 22, 2021

URL transformation
Looking at the EXIF data of some of these supposedly original images, I'd say they are indeed originals. Some even contain historical data (timestamps etc) inserted by Photoshop.

It also serves identical images for iTunes API URLs and Apple Music URLs, e.g.:
https://a1.mzstatic.com/us/r1000/063/Music126/v4/2d/8b/e7/2d8be7fe-2bd9-ae50-49a9-b3845c37c12b/source (from iTunes API through a-tisket)
https://a1.mzstatic.com/us/r1000/063/Music126/v4/48/4f/49/484f49a5-fb52-37b3-f3c6-244e20f74b7c/5052075509815.png (from Apple Music through ECAU)
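Judging from the two example URLs above, the transformation could be sketched in Python as follows. The input layout (`/image/thumb/<asset path>/<size spec>.<ext>`) is an assumption, and so is the idea that the `us/r1000/063` prefix works for every image; it is simply copied from the examples. This is not the maxurl implementation.

```python
from urllib.parse import urlparse

THUMB_PREFIX = "/image/thumb/"
# Host and path prefix copied from the example URLs; assumed, not verified,
# to be valid for all images.
SOURCE_BASE = "https://a1.mzstatic.com/us/r1000/063/"


def to_source_url(thumb_url: str) -> str:
    """Map a thumbnail URL to the presumed original-image URL.

    Drops the trailing size segment and moves the remaining asset path
    under SOURCE_BASE.
    """
    path = urlparse(thumb_url).path
    if not path.startswith(THUMB_PREFIX):
        raise ValueError(f"not a thumbnail URL: {thumb_url}")
    asset_path, _, _size_spec = path[len(THUMB_PREFIX):].rpartition("/")
    if not asset_path:
        raise ValueError(f"no asset path in URL: {thumb_url}")
    return SOURCE_BASE + asset_path
```

The same function handles old iTunes-style paths ending in `source`, since it only strips the final size segment.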
