@kevinbackhouse
Collaborator

Simple but hacky workaround for the Sony preview issue. (Just an idea.)

@codecov

codecov bot commented Dec 1, 2021

Codecov Report

Merging #2013 (f592f97) into main (fde8ed0) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2013      +/-   ##
==========================================
+ Coverage   61.16%   61.18%   +0.02%     
==========================================
  Files          96       96              
  Lines       19255    19256       +1     
  Branches     9862     9860       -2     
==========================================
+ Hits        11777    11782       +5     
+ Misses       5135     5134       -1     
+ Partials     2343     2340       -3     
Impacted Files             Coverage Δ
src/tiffvisitor_int.cpp    76.25% <100.00%> (+0.02%) ⬆️
src/jpgimage.cpp           70.48% <0.00%> (+0.56%) ⬆️

@clanmills
Collaborator

clanmills commented Dec 1, 2021

@kevinbackhouse I think we need to do a couple of things here:

  1. Test your code with the ExifTool Sony Sample Images.
    I've documented where to get the test images here: https://clanmills.com/exiv2/book/#8-6

Can you test your code when the file is rewritten? Does 0x2001 survive when the new file is much longer than the original after adding a huge block of Comment, XMP, IPTC, ICC data? Or much smaller, after deleting Comment, XMP, IPTC, ICC data?

  2. Totally understand Sony tag 0x2001.
    I have a copy of Phil's test images and am currently looking at SonyILCE-6600.jpg. At first sniff (with tvisitor), I don't understand tag 0x2001.

I'd like to analyse tag 0x2001 and document it in my book. You'll recall that @dhoulder did very good work on CR3 (BMFF) Native Previews and I updated the book to explain that: https://clanmills.com/exiv2/book/#5-3. I'm offering to do something similar for these Sony Previews.

I know that Nikon JPGs store embedded previews in segments that follow the JPEG/EOI of the main image. These are correctly preserved by writeMetadata(), however they are unknown to the Exiv2 Preview Manager. I should document and explain this in the book.

The ascii typeId might not be appropriate as it is intended for 7-bit ascii with a \0 terminator. Exiv2 accepts 8-bit ascii, however I think it respects the null-terminator. What about using typeId undefined (which is intended for binary bytes)?
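
To illustrate the concern about the null terminator, here is a minimal Python sketch. It is not Exiv2 code: `read_as_c_string` is a hypothetical helper that mimics a reader which stops at the first `\0`, and the sample bytes are made up to resemble the start of a JPEG preview (SOI followed by an APP1 segment whose length field contains a zero byte).

```python
def read_as_c_string(raw: bytes) -> bytes:
    """Hypothetical helper: return the bytes before the first NUL, as a C string reader would."""
    return raw.split(b"\0", 1)[0]


# Made-up sample: SOI (ff d8), APP1 marker (ff e1), a 2-byte length 0x00a0, then "Exif\0\0".
preview_head = bytes([0xFF, 0xD8, 0xFF, 0xE1, 0x00, 0xA0]) + b"Exif\0\0"

as_ascii = read_as_c_string(preview_head)   # truncated at the NUL inside the length field
as_undefined = preview_head                 # a binary type keeps every byte

print(len(as_ascii), len(as_undefined))
```

If a reader honours the terminator, only the first four bytes survive, which is why a binary type looks like the safer choice for preview data.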

I suspect your idea to introduce a typeId for previews is really good. The Exiv2 preview manager could search the metadata for previewId without knowing how previews were discovered by readMetadata().

@clanmills
Collaborator

clanmills commented Dec 1, 2021

I think Phil has "doctored" his test images to remove the image and previews to save space. All 742 of his Sony Images are 100k or less.

I've googled around to find an unmodified Sony image with previews and found this 7mb image: https://www.magezinepublishing.com/equipment/images/equipment/Alpha-a6500-6282/highres/Sony-Alpha-A6500-Lucy-Portrait-without-flash-DSC09132_1486548974.jpg

Sniffing at it with tvisitor, it has a main image (4000x6000) and two previews:

  1. 120x160 in the Exif Metadata.
  2. 1080x1616 which has a 0x2001 and 0xb02c record in Exif metadata
989 rmills@rmillsmm-local:~/Downloads $ tvisitor -pRU Sony-Alpha-A6500-Lucy-Portrait-without-flash-DSC09132_1486548974.jpg | grep -e 2001 -e b02c -e ' 0xff'
       0 | 0xffd8 SOI  
       2 | 0xffe1 APP1  |   46055 | Exif__II*_.___._..._ ___.___..._.___.___
          1296 | 0xb02c Exif.Sony.0xb02c                 |      LONG |        2 |      2966 | 1080 1616
          1752 | 0x2001 Exif.Sony.0x2001                 | UNDEFINED |        0 |           | 
           0 | 0xffd8 SOI  
           2 | 0xffdb DQT   |     132 | _.......................................
         136 | 0xffc4 DHT   |     418 | __........________............_.........
         556 | 0xffc0 SOF0  |      17 | ._x_...!_........ = h,w = 120,160
         575 | 0xffda SOS  
        7113 | 0xffd9 EOI  
   46059 | 0xffe2 APP2  |     531 | MPF_II*_.___.__.._.___0100..._.___.___..
   46592 | 0xffdb DQT   |     132 | _.......................................
   46726 | 0xffc4 DHT   |     418 | __........________............_.........
   47146 | 0xffc0 SOF0  |      17 | ....p..!_........ = h,w = 4000,6000
   47165 | 0xffda SOS  
 6575565 | 0xffd9 EOI  
 6575616 | 0xffd8 SOI  
 6575618 | 0xffe1 APP1  |     160 | Exif__II*_.___._..._.___>___..._.___F___
 6575780 | 0xffdb DQT   |     132 | _.......................................
 6575914 | 0xffc4 DHT   |     418 | __........________............_.........
 6576334 | 0xffc0 SOF0  |      17 | ..8.P..!_........ = h,w = 1080,1616
 6576353 | 0xffda SOS  
 7111084 | 0xffd9 EOI  
990 rmills@rmillsmm-local:~/Downloads $ ls -l Sony-Alpha-A6500-Lucy-Portrait-without-flash-DSC09132_1486548974.jpg 
-rw-r--r--@ 1 rmills  staff  7143424  1 Dec 20:10 Sony-Alpha-A6500-Lucy-Portrait-without-flash-DSC09132_1486548974.jpg
991 rmills@rmillsmm-local:~/Downloads $ exiv2 -g preview/i Sony-Alpha-A6500-Lucy-Portrait-without-flash-DSC09132_1486548974.jpg 
Exif.Sony1.PreviewImageSize                  Long        2  1080 x 1616
Exif.Sony1.PreviewImage                      Undefined   0  
995 rmills@rmillsmm-local:~/Downloads $ 

Curious that tvisitor and exiv2 agree that 0x2001 is an undefined of count==0. That looks like a flag and not an offset to the preview! This is using the same strategy as Nikon to store previews after the EOI for the main image. JpegBase::writeMetadata() will handle this correctly during a file rewrite.

The 1080x1616 preview is invisible to the Exiv2 preview manager in the same way as Nikon previews.

1003 rmills@rmillsmm-local:~/Downloads $ exiv2 -pp Sony-Alpha-A6500-Lucy-Portrait-without-flash-DSC09132_1486548974.jpg 
Preview 1: image/jpeg, 160x120 pixels, 7115 bytes
1004 rmills@rmillsmm-local:~/Downloads $ $ exiv2 -pp ~/Stonehenge.jpg 
Preview 1: image/jpeg, 160x120 pixels, 10837 bytes
1005 rmills@rmillsmm-local:~/Downloads $ 

@clanmills
Collaborator

I've done more work on this today. There are 741 Sony images in the ExifTool test files. 65 use the Exif.Sony1.PreviewImage tag.

1058 rmills@rmillsmm-local:/Users/Shared/Jenkins/Home/userContent/testfiles/ExifTool/Sony $ exiv2 -g Preview * 2>/dev/null | cut -c -100
SonyDSC-HX10V.jpg     Exif.Sony1.PreviewImageSize                  Long        2  1080 x 1440
SonyDSC-HX200V.jpg    Exif.Sony1.PreviewImageSize                  Long        2  1080 x 1440
...
SonyDSC-WX70.jpg      Exif.Sony1.PreviewImageSize                  Long        2  1080 x 1440
SonyDSC-WX80.jpg      Exif.Sony1.PreviewImageSize                  Long        2  1080 x 1440
SonyDSLR-A200.jpg     Exif.Sony1.PreviewImage                      Undefined 695912  0 0 0 0 0 0 0 0
SonyDSLR-A230.jpg     Exif.Sony1.PreviewImage                      Undefined 612044  0 0 0 0 0 0 0 0
..
SonyILCE-6500.jpg     Exif.Sony1.PreviewImage                      Undefined   0  
1059 rmills@rmillsmm-local:/Users/Shared/Jenkins/Home/userContent/testfiles/ExifTool/Sony $

There are two common values for 0x2001:

1066 rmills@rmillsmm-local:/Users/Shared/Jenkins/Home/userContent/testfiles/ExifTool/Sony $ tvisitor -pRU SonyILCE-6500.jpg | grep 2001
          1752 | 0x2001 Exif.Sony.0x2001                 | UNDEFINED |        0 |           | 
1067 rmills@rmillsmm-local:/Users/Shared/Jenkins/Home/userContent/testfiles/ExifTool/Sony $ tvisitor -pRU SonyDSLR-A200.jpg | grep 2001
          1098 | 0x2001 Exif.Sony.0x2001                 | UNDEFINED |   695912 |   3342327 | ____________________________________ +++
1068 rmills@rmillsmm-local:/Users/Shared/Jenkins/Home/userContent/testfiles/ExifTool/Sony $ 

The A200 file has been "doctored" (confess @boardhead). The image is 8x8 and the previews have been removed.

1068 rmills@rmillsmm-local:/Users/Shared/Jenkins/Home/userContent/testfiles/ExifTool/Sony $ tvisitor -pS SonyDSLR-A200.jpg 
STRUCTURE OF JPEG FILE (II): SonyDSLR-A200.jpg
 address | marker       |  length | signature
       0 | 0xffd8 SOI  
       2 | 0xffe1 APP1  |   53681 | Exif__II*_.___._..._.___.___..._.___.___
   53685 | 0xffdb DQT   |     132 | _..........2...2.......55555.DAAAAAADDDD
   53819 | 0xffc0 SOF0  |      17 | ._._...._........ = h,w = 8,8
   53838 | 0xffc4 DHT   |      75 | _.._______________...________________.._
   53915 | 0xffda SOS  
   53932 | 0xffd9 EOI  
END: SonyDSLR-A200.jpg
1069 rmills@rmillsmm-local:/Users/Shared/Jenkins/Home/userContent/testfiles/ExifTool/Sony $ 

So, I hunted on the internet for a Sony A200 image and found this: https://www.imaging-resource.com/PRODS/AA200/FULLRES/AA200hMULTI1920.HTM

This file has not been "doctored".

1073 rmills@rmillsmm-local:~/Downloads $ tvisitor -pS AA200hMULTI1920.JPG 
STRUCTURE OF JPEG FILE (II): AA200hMULTI1920.JPG
 address | marker       |  length | signature
       0 | 0xffd8 SOI  
       2 | 0xffe1 APP1  |   53681 | Exif__II*_.___._..._.___.___..._.___.___
   53685 | 0xffdb DQT   |     132 | _.......................................
   53819 | 0xffc4 DHT   |     418 | __........________............_.........
   54239 | 0xffc0 SOF0  |      17 | .._....!_........ = h,w = 1280,1920
   54258 | 0xffda SOS  
  809583 | 0xffd9 EOI  
  819237 | 0xffdb DQT   |     132 | _.......................................
  819371 | 0xffc4 DHT   |     418 | __........________............_.........
  819791 | 0xffc0 SOF0  |      17 | ..8.P..!_........ = h,w = 1080,1616
  819810 | 0xffda SOS  
 1346880 | 0xffd9 EOI  
END: AA200hMULTI1920.JPG
1074 rmills@rmillsmm-local:~/Downloads $ 

The main image is 1280x1920 and there's a 1080x1616 preview following EOI of the main image.
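
The "preview after EOI" layout can be located without reading tag 0x2001 at all. The following is a rough Python sketch of that idea, assuming a well-formed baseline or progressive JPEG. It is not how tvisitor or Exiv2 is implemented; `end_of_jpeg`, `next_marker`, `trailing_jpegs` and `tiny_jpeg` are hypothetical names, and the demo data is synthetic.

```python
import struct

SOI = b"\xff\xd8"


def next_marker(data: bytes, pos: int) -> int:
    """Skip entropy-coded data: return the offset of the next 0xFF that starts a real marker."""
    while True:
        pos = data.find(b"\xff", pos)
        if pos < 0 or pos + 1 >= len(data):
            return len(data)
        nxt = data[pos + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:   # stuffed byte or restart marker
            pos += 2
        elif nxt == 0xFF:                        # fill byte
            pos += 1
        else:
            return pos


def end_of_jpeg(data: bytes, start: int = 0) -> int:
    """Walk the segments of the JPEG at `start`; return the offset just past its EOI."""
    if data[start:start + 2] != SOI:
        raise ValueError("no SOI marker at start")
    pos = start + 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:                       # fill byte before a marker
            pos += 1
            continue
        if marker == 0xD9:                       # EOI: this image ends here
            return pos + 2
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:   # markers without a length field
            pos += 2
            continue
        if pos + 4 > len(data):
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length                        # the length counts its own two bytes
        if marker == 0xDA:                       # SOS: entropy-coded data follows the header
            pos = next_marker(data, pos)
    raise ValueError("no EOI marker found")


def trailing_jpegs(data: bytes) -> list:
    """Return (start, end) offsets of each JPEG stream found after the main image's EOI."""
    found = []
    pos = end_of_jpeg(data, 0)
    while True:
        start = data.find(SOI + b"\xff", pos)
        if start < 0:
            return found
        try:
            end = end_of_jpeg(data, start)
        except ValueError:                       # false positive: keep searching
            pos = start + 2
            continue
        found.append((start, end))
        pos = end


def tiny_jpeg(payload: bytes) -> bytes:
    """Build a synthetic SOI / APP1 / SOS / EOI stream (not a decodable image)."""
    app1 = b"\xff\xe1" + struct.pack(">H", 2 + len(payload)) + payload
    sos = b"\xff\xda" + struct.pack(">H", 2) + b"\x12\xff\x00\x34"
    return SOI + app1 + sos + b"\xff\xd9"


main_image, preview = tiny_jpeg(b"main"), tiny_jpeg(b"preview")
data = main_image + b"\0" * 49 + preview         # padding between EOI and the next SOI
print(trailing_jpegs(data))
```

Segments are skipped by their length field, so a thumbnail embedded inside the Exif APP1 segment is not mistaken for a trailing preview. On a real file, pass the bytes read from disk in place of the synthetic `data`.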

It has the mysterious 0x2001 tag set as follows:

1075 rmills@rmillsmm-local:~/Downloads $ tvisitor -pRU AA200hMULTI1920.JPG | grep 2001
          1098 | 0x2001 Exif.Sony.0x2001                 | UNDEFINED |   527679 |    819191 | ..._P.__8.__________P.__8._______... +++
1076 rmills@rmillsmm-local:~/Downloads $ 

We should ignore both the Count and Offset in this record. Sure, they lead to the trailing preview, however they are an editing nightmare. We should simply treat this (and report it) as type=undefined count=0. JpegBase::writeMetadata() understands the JPEG file and will rewrite the image correctly. There is no reason to allocate anything at all for Exif.Sony1.PreviewImage and no reason to give a warning. The metadata effectively says "by the way, there might be a preview in the trailer of this JPEG".

I'll update my book in the next few days to discuss both this tag and how Nikon (and possibly other) JPEGs use this "trailing preview" which is correctly handled by JpegBase::writeMetadata().

@dhoulder If you'd like an interesting and fun project, how would you like to work on the preview manager and get him to correctly deal with JPEG embedded previews?

The idea of replacing the image in the JPEG SOF0/SOS with an 8x8 image is wonderful. We could use that to significantly shrink the size of the test/data directory. Andreas' .exv format does that just as effectively. However, there are JPEGs in the test suite which are necessary for reasons that can be discussed if anybody wants to investigate how to shrink test/data.

@dhoulder
Contributor

dhoulder commented Dec 3, 2021

> @dhoulder If you'd like an interesting and fun project, how would you like to work on the preview manager and get him to correctly deal with JPEG embedded previews?

Hi Robin. I've managed to get myself a full-time job here in Hobart. I have 2 weeks off over Christmas, so if no-one's looked at it by then I'll have a peek. We're talking about #2001 right?

@clanmills
Collaborator

Sorry to hear you've got a job. I've been retired since 2014. Wild horses couldn't drag me back into the hell of work. It's not the work, you understand. It's the behaviour of the corporate politicians. No more. Never again.

I'm "sort of" thinking about #2001. However the real subject is recovering the preview images in JPEGs.

I think #2001 is now a 1-liner. We should ignore the count in the metadata and can throw away a few lines of code.

Let's discuss it carefully when you're ready to work on it.

(Or, I could behave like a boss and demand why you haven't fixed that yet!).

@kevinbackhouse
Collaborator Author

@clanmills: I have been looking at the image referenced in #2001. It's similar to what you describe here. The main image is 4000x6000 and the preview is 1080x1616. The preview starts at byte number 19091968. So I can get a valid 1080x1616 jpg image by dropping the first 19091968 bytes, like this:

dd if=DSC01235.JPG of=test.jpg ibs=1 skip=19091968

The thing that I don't understand is that the offset in the PreviewImage tag is 19092086, which is 118 bigger than 19091968. Do you have any idea what the offset in the tag is supposed to be? Is it supposed to be a byte offset from the start of the file?
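
For reference, the arithmetic and the `dd` step above can be sketched in Python. The two numbers come from this comment; `extract_tail` is a hypothetical helper and the file names in the usage note are only the ones from the `dd` command.

```python
import shutil

preview_start = 19091968   # where the valid 1080x1616 JPEG starts (the dd skip value)
tag_offset = 19092086      # offset stored in the PreviewImage tag

print(tag_offset - preview_start)   # 118


def extract_tail(src_path: str, dst_path: str, skip: int) -> None:
    """Copy everything after the first `skip` bytes of src_path into dst_path."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(skip)
        shutil.copyfileobj(src, dst)
```

Calling `extract_tail("DSC01235.JPG", "test.jpg", preview_start)` should produce the same file as the `dd` command, without the byte-at-a-time block size.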

@clanmills
Collaborator

clanmills commented Dec 4, 2021

@kevinbackhouse I've also been puzzled by the offset. The offset enables quick access to the preview without having to crawl through the JPEG/SOS segment byte-by-byte looking for 0xffd9 which is the JPEG/EOI marker. Why is it incorrect? I don't know. Maybe we should simply search for 0xffd9 starting at offset!
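
One way to tolerate a stored offset that is off by a small amount is to search a window around it. The sketch below is a variation on the suggestion above, not the same thing: it looks for the SOI marker (0xffd8 followed by 0xff) rather than EOI, and it looks on both sides of the offset, because in the file discussed here the preview starts 118 bytes before the stored value. `find_soi_near` and the 256-byte window are assumptions for illustration only.

```python
SOI_PREFIX = b"\xff\xd8\xff"   # SOI immediately followed by the next marker's 0xff


def find_soi_near(data: bytes, stored_offset: int, window: int = 256) -> int:
    """Return the SOI offset closest to stored_offset within +/- window bytes, or -1."""
    lo = max(0, stored_offset - window)
    hi = min(len(data), stored_offset + window)
    candidates = []
    pos = data.find(SOI_PREFIX, lo, hi)
    while pos != -1:
        candidates.append(pos)
        pos = data.find(SOI_PREFIX, pos + 1, hi)
    if not candidates:
        return -1
    return min(candidates, key=lambda p: abs(p - stored_offset))


# Synthetic data: a fake preview header at offset 1000, stored offset 118 bytes later.
data = b"\0" * 1000 + b"\xff\xd8\xff\xe1" + b"\0" * 500
print(find_soi_near(data, 1000 + 118))
```

A byte pattern match can hit a false positive inside compressed data, so a real implementation would presumably want to validate the candidate by parsing the segments that follow it.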

I think you'll agree that for 0x2001, we should ignore the count field and allocate nothing.

David @dhoulder might look at recovering those previews during "The Holidays". He's in Australia and I'm sure has more fun things like the BBQ and Big Bash Cricket to amuse him when he escapes from the office for a couple of weeks.

If David works on this, I'll discuss the preview manager with him one-to-one. We only need to hunt for those JPEG/previews when options such as -pp are requested. Let's not digress into the implications of this yet.

@kevinbackhouse
Collaborator Author

@clanmills: I just created #2015. Hopefully I have understood your comment above correctly, and that's the stopgap solution that you have in mind.
