Add TIFF Metadata for Offsets #19

Merged on Apr 3, 2020 (16 commits)
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,5 +1,7 @@
**/test-output-actual

settings.json

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
5 changes: 4 additions & 1 deletion Dockerfile
@@ -2,8 +2,11 @@
# We just need to use --file to point at it, instead of assuming it is in context.

# Using Conda because pyarrow did not install easily on python base images.
FROM continuumio/miniconda3
FROM continuumio/miniconda3:4.7.12

# For tiff packages
RUN apt-get update &&\
apt-get install -y gcc python3-dev
Review comment (Contributor):

Originally, I was thinking we could use the same base image for everything, but if we end up with a lot of installs for just one pipeline or another, then that might not be the right choice.

No action for now, but something to keep in mind.

COPY requirements-freeze.txt .
RUN pip install -r ./requirements-freeze.txt

5 changes: 4 additions & 1 deletion README.md
@@ -3,8 +3,9 @@
Docker containers to pre-process data for visualization in the portal.

The subdirectories in this repo all have the same structure:

- `context/`: A Docker context, including at least
`main.py`, `requirements.txt`, and `requirements-freeze.txt`.
`main.py`, `requirements.txt`, and `requirements-freeze.txt`.
- `test-input/`, `test-output-actual/`, `test-output-expected/`: Test fixtures.
- `VERSION`: contains a semantic version number
- and a `README.md`.
@@ -15,6 +16,7 @@ Images are named by the containing directory.
Running `test.sh` will build (and test!) all the images.
You can then define `$INPUT_DIR`, `$OUTPUT_DIR`, and `$IMAGE`
to run an image with your own data:

```
docker run \
--mount type=bind,source=$INPUT_DIR,target=/input \
@@ -23,6 +25,7 @@ docker run \
```

To push the latest versions to dockerhub just run:

```
test.sh push
```
4 changes: 4 additions & 0 deletions containers/ome-tiff-offsets/README.md
@@ -0,0 +1,4 @@
# ome-tiff-offsets

This docker container adds a structured annotation to the OMEXML
which contains the `IFD_Offsets` in bytes
1 change: 1 addition & 0 deletions containers/ome-tiff-offsets/VERSION
@@ -0,0 +1 @@
0.0.1
65 changes: 65 additions & 0 deletions containers/ome-tiff-offsets/context/main.py
@@ -0,0 +1,65 @@
import argparse
from glob import glob
from pathlib import Path
from os import makedirs

from aicsimageio import AICSImage
from aicsimageio.writers import ome_tiff_writer
from tifffile import TiffFile
import xml.dom.minidom

def get_offsets(tiff_filepath):
    with TiffFile(tiff_filepath) as tif:
        offsets = [page.offset for page in tif.pages]
    return offsets



def main(input_dir, output_dir):
    makedirs(output_dir, exist_ok=True)
    for input in glob(input_dir + '/*.ome.tif*') + glob(input_dir + '/*.ome.tiff'):
        # Get image metadata and image data.
        with AICSImage(input) as input_image:
            image_data_from_input = input_image.get_image_data()[0]
            omexml = input_image.metadata
        # Create the output path for the compressed ome tiff.
        input_path = Path(input)
        compressed_dir = Path('/compressed/')
        compressed_ome_tiff = compressed_dir / input_path.name
        makedirs(compressed_dir)
        with ome_tiff_writer.OmeTiffWriter(compressed_ome_tiff) as ome_writer:
            ome_writer.save(
                image_data_from_input,
                ome_xml = omexml
            )
        # Read in the newly compressed file.
        with AICSImage(compressed_ome_tiff) as compressed_image:
            image_data_from_compressed = compressed_image.get_image_data()[0]
            omexml_compressed = compressed_image.metadata
        # Get the offsets of said compressed file and add them to the omexml as structured annotations.
        offsets = get_offsets(compressed_ome_tiff)
        structured_annotations = omexml_compressed.structured_annotations
        structured_annotations.add_original_metadata(key='IFD_Offsets', value=str(offsets))
        # Write the file out to the bound output directory.
        with open(Path(output_dir) / 'ome.xml', 'w') as xml_write:
            xml_write.write(xml.dom.minidom.parseString(str(omexml_compressed)).toprettyxml())
        new_ome_tiff_path = Path(output_dir) / input_path.name
        with ome_tiff_writer.OmeTiffWriter(new_ome_tiff_path, overwrite_file=True) as ome_writer:
            ome_writer.save(
                image_data_from_compressed,
                ome_xml = omexml_compressed
            )



if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Add offsets to the OMEXML Metadata')
    parser.add_argument(
        '--input_dir', required=True,
        help='Directory containing ome-tiff files to read')
    parser.add_argument(
        '--output_dir', required=True,
        help='Directory where ome-tiff files should be written')
    args = parser.parse_args()
    main(args.input_dir, args.output_dir)
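
Editorial sketch, not part of the PR: `get_offsets` in `main.py` relies on `tifffile` to report each page's `offset`. Assuming that value is the byte position of the page's image file directory (IFD), the same list can be recovered for a classic (non-BigTIFF) file by walking the IFD chain by hand. The function name `read_ifd_offsets` is hypothetical, and the sketch does not handle BigTIFF, which uses 8-byte offsets.

```python
import struct


def read_ifd_offsets(data):
    """Return the byte offset of every IFD in a classic TIFF held in `data` (bytes)."""
    # Bytes 0-1 give the byte order: 'II' is little-endian, 'MM' is big-endian.
    byte_order = {b'II': '<', b'MM': '>'}[data[:2]]
    # Bytes 2-3 hold the magic number (42); bytes 4-7 hold the first IFD offset.
    magic, offset = struct.unpack(byte_order + 'HI', data[2:8])
    if magic != 42:
        raise ValueError('This sketch only handles classic TIFF (magic number 42)')
    offsets = []
    while offset != 0:
        offsets.append(offset)
        # An IFD starts with a 2-byte entry count, followed by 12-byte entries.
        (entry_count,) = struct.unpack_from(byte_order + 'H', data, offset)
        # The 4-byte offset of the next IFD follows the entries; 0 ends the chain.
        next_pointer = offset + 2 + 12 * entry_count
        (offset,) = struct.unpack_from(byte_order + 'I', data, next_pointer)
    return offsets
```

A reader that already has these offsets (for example from the `IFD_Offsets` annotation) can seek straight to a given page instead of walking the chain from the start of the file.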
48 changes: 48 additions & 0 deletions containers/ome-tiff-offsets/context/requirements-freeze.txt
@@ -0,0 +1,48 @@
aicsimageio==3.1.4
aicspylibczi==2.5.1
asn1crypto==1.0.1
certifi==2019.9.11
cffi==1.12.3
chardet==3.0.4
click==7.1.1
cloudpickle==1.3.0
conda==4.7.12
conda-package-handling==1.6.0
cryptography==2.7
cycler==0.10.0
dask==2.13.0
decorator==4.4.2
distributed==2.13.0
HeapDict==1.0.1
idna==2.8
imagecodecs==2020.2.18
imageio==2.8.0
kiwisolver==1.1.0
lxml==4.5.0
matplotlib==3.2.1
msgpack==1.0.0
networkx==2.4
numpy==1.18.2
Pillow==7.0.0
psutil==5.7.0
pycosat==0.6.3
pycparser==2.19
pyOpenSSL==19.0.0
pyparsing==2.4.6
PySocks==1.7.1
python-dateutil==2.8.1
PyWavelets==1.1.1
PyYAML==5.3.1
requests==2.22.0
ruamel-yaml==0.15.46
scikit-image==0.16.2
scipy==1.4.1
six==1.14.0
sortedcontainers==2.1.0
tblib==1.6.0
tifffile==2020.2.16
toolz==0.10.0
tornado==6.0.4
tqdm==4.36.1
urllib3==1.24.2
zict==2.0.0
2 changes: 2 additions & 0 deletions containers/ome-tiff-offsets/context/requirements.txt
@@ -0,0 +1,2 @@
aicsimageio==3.1.4
tifffile==2020.2.16
Binary file not shown.
Binary file not shown.
29 changes: 29 additions & 0 deletions containers/ome-tiff-offsets/test-output-expected/ome.xml
@@ -0,0 +1,29 @@
<?xml version="1.0" ?>
<OME Creator="OME Bio-Formats 5.2.2" UUID="urn:uuid:bebd2be7-8253-4b90-be93-4df567ffe1be" xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06" xmlns:ns2="None" xmlns:ns3="openmicroscopy.org/OriginalMetadata" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openmicroscopy.org/Schemas/OME/2016-06 http://www.openmicroscopy.org/Schemas/OME/2016-06/ome.xsd">
<Image ID="Image:0" Name="multi-channel.ome.tif">
<Pixels BigEndian="true" DimensionOrder="XYZCT" ID="Pixels:0" SizeC="3" SizeT="1" SizeX="439" SizeY="167" SizeZ="1" Type="int8">
<Channel ID="Channel:0:0" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:1" SamplesPerPixel="1">
<LightPath/>
</Channel>
<Channel ID="Channel:0:2" SamplesPerPixel="1">
<LightPath/>
</Channel>
<TiffData FirstC="0" FirstT="0" FirstZ="0" IFD="0" PlaneCount="1"/>
<TiffData FirstC="1" FirstT="0" FirstZ="0" IFD="1" PlaneCount="1"/>
<TiffData FirstC="2" FirstT="0" FirstZ="0" IFD="2" PlaneCount="1"/>
</Pixels>
</Image>
<ns2:StructuredAnnotations>
<ns2:XMLAnnotation ID="PLACEHOLDER">
<ns2:Value>
<ns3:OriginalMetadata>
<ns3:Key>IFD_Offsets</ns3:Key>
<ns3:Value>[8, 2712, 4394]</ns3:Value>
</ns3:OriginalMetadata>
</ns2:Value>
</ns2:XMLAnnotation>
</ns2:StructuredAnnotations>
</OME>
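
Editorial sketch, not part of the PR: a consumer of the expected `ome.xml` above could read the offsets back with the standard library alone. The helper name `read_ifd_offsets_from_omexml` is hypothetical. The namespace URI is copied from the fixture, and the value is assumed to be the `str()` of a Python list, as written by `main.py`.

```python
import ast
import xml.etree.ElementTree as ET

# Namespace URI bound to the ns3 prefix in the expected ome.xml fixture.
ORIGINAL_METADATA_NS = 'openmicroscopy.org/OriginalMetadata'


def read_ifd_offsets_from_omexml(xml_text):
    """Return the IFD_Offsets list stored in an OME-XML string, or None if absent."""
    root = ET.fromstring(xml_text)
    namespaces = {'om': ORIGINAL_METADATA_NS}
    for item in root.iter('{%s}OriginalMetadata' % ORIGINAL_METADATA_NS):
        key = item.find('om:Key', namespaces)
        value = item.find('om:Value', namespaces)
        if key is not None and value is not None and key.text == 'IFD_Offsets':
            # The value is a list literal such as "[8, 2712, 4394]".
            return ast.literal_eval(value.text)
    return None
```

`ast.literal_eval` is used rather than `eval` so that only a plain literal is accepted from the file.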
12 changes: 7 additions & 5 deletions test.sh
@@ -11,6 +11,7 @@ die() { set +v; echo "$red$*$reset" 1>&2 ; exit 1; }

build_test() {
TAG=$1
BASENAME=$2
docker build --file ../../Dockerfile --tag $TAG context
PWD_BASE=`basename $PWD`
docker rm -f $PWD_BASE || echo "No container to stop"
@@ -25,11 +26,12 @@ build_test() {

# hexdump -C test-output-expected/2x2.arrow > test-output-expected/2x2.arrow.hex.txt
# hexdump -C test-output-actual/2x2.arrow > test-output-actual/2x2.arrow.hex.txt

if [ "$BASENAME" == "ome-tiff-offsets" ]; then
sed -i.bak 's/XMLAnnotation ID="[^"]*"/XMLAnnotation ID="PLACEHOLDER"/g' test-output-actual/ome.xml
fi
diff -w -r test-output-expected test-output-actual \
--exclude=.DS_Store --exclude=*.arrow \
| head -n100 | cut -c 1-100

--exclude=.DS_Store --exclude=*.arrow --exclude=*.ome.tif* \
--exclude=ome.xml.bak | head -n100 | cut -c 1-100
diff <( docker run $TAG pip freeze ) context/requirements-freeze.txt \
|| die "Update dependencies:
docker run $TAG pip freeze > $TAG/context/requirements-freeze.txt"
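
Editorial sketch, not part of the PR: the `sed` step in `build_test` replaces the `XMLAnnotation` ID with a fixed placeholder before the `diff`. The test would only need this if the ID changes from run to run, which is presumably the case here. The same normalization, written in Python with a hypothetical helper name, looks like this:

```python
import re

# Same pattern as the sed expression in test.sh.
ANNOTATION_ID = re.compile(r'XMLAnnotation ID="[^"]*"')


def normalize_annotation_ids(xml_text):
    """Replace every XMLAnnotation ID value with the literal PLACEHOLDER."""
    return ANNOTATION_ID.sub('XMLAnnotation ID="PLACEHOLDER"', xml_text)
```

Applying it to both the actual and the expected output before comparing them makes the comparison insensitive to the generated ID while still checking the rest of the document.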
@@ -46,7 +48,7 @@ for DIR in containers/*; do
# Neither underscores nor double dash is allowed:
# Don't get too creative!
TAG="hubmap/portal-container-$BASENAME:$VERSION"
build_test $TAG
build_test $TAG $BASENAME
if [ "$1" == 'push' ]; then
COMMAND="docker push $TAG"
echo "$green$COMMAND$reset"