Issue with Reproducing Xenium V1 Data Using SOPA v1.1.12 #105

Closed
bhyu0217 opened this issue Aug 9, 2024 · 5 comments

bhyu0217 commented Aug 9, 2024

Hello,

First, I would like to express my sincere appreciation for your excellent work and dedication.

Earlier this year, I performed cell segmentation on Xenium v1 data using SOPA v1.0.5 with Cellpose + Baysor (snakemake), and I was very satisfied with the results.
However, I encountered an issue while trying to reproduce these results using SOPA v1.1.12.
Despite modifying the "channels: ["DAPI"]" entry in the config.yaml file to "channels: [0]" and making additional code adjustments, the error persists.

Below is the log file.


(sopa) snakemake --config data_path=/raw_data/0007055_sf_01900_b2_c/ --configfile=/sopa/workflow/cellpose_baysor.yaml --cores 1 --use-conda

SpatialData object path set to default: /raw_data/0007055_sf_01900_b2_c.zarr
To change this behavior, provide --config sdata_path=... when running the snakemake pipeline
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job count
aggregate 1
all 1
explorer 1
image_write 1
patchify_baysor 1
patchify_cellpose 1
report 1
resolve_baysor 1
resolve_cellpose 1
to_spatialdata 1
total 10

Select jobs to execute...

[Thu Aug 8 18:36:56 2024]
rule to_spatialdata:
input: /raw_data/0007055_sf_01900_b2_c
output: /raw_data/0007055_sf_01900_b2_c.zarr/.zgroup
jobid: 4
reason: Missing output files: /raw_data/0007055_sf_01900_b2_c.zarr/.zgroup
resources: tmpdir=/scratch/bhyu0217, mem_mb=128000, mem_mib=122071

Activating conda environment: sopa
/c4/home/bhyu0217/anaconda3/envs/sopa/lib/python3.10/functools.py:926: UserWarning: The index of the dataframe is not monotonic increasing. It is recommended to sort the data to adjust the order of the index before calling .parse() to avoid possible problems due to unknown divisions
return method.get(obj, cls)(*args, **kwargs)

[INFO] (sopa.io.standardize) Writing the following spatialdata object to /raw_data/0007055_sf_01900_b2_c.zarr:
SpatialData object
├── Images
│ └── 'morphology_focus': DataTree[cyx] (1, 34152, 34213), (1, 17076, 17106), (1, 8538, 8553), (1, 4269, 4276), (1, 2134, 2138)
└── Points
└── 'transcripts': DataFrame with shape: (, 10) (3D points)
with coordinate systems:
▸ 'global', with elements:
morphology_focus (Images), transcripts (Points)
INFO The Zarr backing store has been changed from None the new file path:
/raw_data/0007055_sf_01900_b2_c.zarr

[Thu Aug 8 18:38:23 2024]
Finished job 4.
1 of 10 steps (10%) done
Select jobs to execute...

[Thu Aug 8 18:38:23 2024]
checkpoint patchify_cellpose:
input: /raw_data/0007055_sf_01900_b2_c.zarr/.zgroup
output: /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches_file_image, /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches
jobid: 6
reason: Missing output files: /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches_file_image; Input files updated by another job: /raw_data/0007055_sf_01900_b2_c.zarr/.zgroup
resources: tmpdir=/scratch/bhyu0217
DAG of jobs will be updated after completion.

Activating conda environment: sopa
[INFO] (sopa.patches.patches) 36 patches were saved in sdata['sopa_patches']
Touching output file /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches.

[Thu Aug 8 18:38:29 2024]
Finished job 6.
2 of 10 steps (20%) done
Select jobs to execute...

[Thu Aug 8 18:38:29 2024]
rule image_write:
input: /raw_data/0007055_sf_01900_b2_c.zarr/.zgroup
output: /raw_data/0007055_sf_01900_b2_c.explorer/morphology.ome.tif
jobid: 8
reason: Missing output files: /raw_data/0007055_sf_01900_b2_c.explorer/morphology.ome.tif; Input files updated by another job: /raw_data/0007055_sf_01900_b2_c.zarr/.zgroup
resources: tmpdir=/scratch/bhyu0217, mem_mb=64000, mem_mib=61036, partition=longq

Activating conda environment: sopa
[WARNING] (sopa._sdata) sdata object has no valid segmentation boundary. Consider running Sopa segmentation first.
[INFO] (sopa.io.explorer.images) Writing multiscale image with procedure=semi-lazy (load in memory when possible)
[INFO] (sopa.io.explorer.images) (Loading image of shape (1, 34152, 34213)) in memory
[INFO] (sopa.io.explorer.images) > Image of shape (1, 34152, 34213)
[INFO] (sopa.io.explorer.images) > Image of shape (1, 17076, 17106)
[INFO] (sopa.io.explorer.images) > Image of shape (1, 8538, 8553)
[INFO] (sopa.io.explorer.images) > Image of shape (1, 4269, 4276)
[INFO] (sopa.io.explorer.images) > Image of shape (1, 2134, 2138)
[INFO] (sopa.io.explorer.images) > Image of shape (1, 1067, 1069)
[INFO] (sopa.io.explorer.converter) Saved files in the following directory: /raw_data/0007055_sf_01900_b2_c.explorer
[INFO] (sopa.io.explorer.converter) You can open the experiment with 'open /raw_data/0007055_sf_01900_b2_c.explorer/experiment.xenium'

[Thu Aug 8 18:39:25 2024]
Finished job 8.
3 of 46 steps (7%) done
Select jobs to execute...

[Thu Aug 8 18:39:25 2024]
rule patch_segmentation_cellpose:
input: /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches_file_image, /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches
output: /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/cellpose_boundaries/10.parquet
jobid: 23
reason: Missing output files: /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/cellpose_boundaries/10.parquet
wildcards: index=10
resources: tmpdir=/scratch/bhyu0217

Activating conda environment: sopa
╭────────────────────────────── Traceback (most recent call last) ──────────────────────────────╮
│ /c4/home/bhyu0217/anaconda3/envs/sopa/lib/python3.10/site-packages/sopa/cli/segmentation.py:7 │
│ 3 in cellpose │
│ │
│ 70 │ │ **method_kwargs, │
│ 71 │ ) │
│ 72 │ │
│ ❱ 73 │ run_staining_segmentation( │
│ 74 │ │ sdata_path, │
│ 75 │ │ SopaKeys.CELLPOSE_BOUNDARIES, │
│ 76 │ │ method, │
│ │
│ ╭───────────────────────────────────────── locals ──────────────────────────────────────────╮ │
│ │ cellpose_patch = <function cellpose_patch at 0x7f1c06271870> │ │
│ │ cellprob_threshold = -6.0 │ │
│ │ channels = ['DAPI'] │ │
│ │ clahe_kernel_size = None │ │
│ │ clip_limit = 0.2 │ │
│ │ diameter = 30.0 │ │
│ │ flow_threshold = 2.0 │ │
│ │ gaussian_sigma = 1.0 │ │
│ │ method = <function cellpose_patch..
at 0x7f1bf8193130> │ │
│ │ method_kwargs = {} │ │
│ │ min_area = 400 │ │
│ │ model_type = 'cyto3' │ │
│ │ patch_dir = '/raw_data/… │ │
│ │ patch_index = 10 │ │
│ │ pretrained_model = None │ │
│ │ sdata_path = '/raw_data/… │ │
│ │ SopaKeys = <class 'sopa._constants.SopaKeys'> │ │
│ ╰───────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /c4/home/bhyu0217/anaconda3/envs/sopa/lib/python3.10/site-packages/sopa/cli/segmentation.py:1 │
│ 75 in _run_staining_segmentation │
│ │
│ 172 │ │
│ 173 │ sdata = read_zarr_standardized(sdata_path) │
│ 174 │ │
│ ❱ 175 │ segmentation = StainingSegmentation( │
│ 176 │ │ sdata, │
│ 177 │ │ method, │
│ 178 │ │ channels, │
│ │
│ ╭───────────────────────────────────────── locals ──────────────────────────────────────────╮ │
│ │ _default_boundary_dir = <function default_boundary_dir at 0x7f1cddfc7c70> │ │
│ │ channels = ['DAPI'] │ │
│ │ clahe_kernel_size = None │ │
│ │ clip_limit = 0.2 │ │
│ │ gaussian_sigma = 1.0 │ │
│ │ method = <function cellpose_patch..
at 0x7f1bf8193130> │ │
│ │ min_area = 400 │ │
│ │ patch_dir = '/raw_d… │ │
│ │ patch_index = 10 │ │
│ │ read_zarr_standardized = <function read_zarr_standardized at 0x7f1c08292050> │ │
│ │ sdata = SpatialData object, with associated Zarr store: │ │
│ │ /raw_da… │ │
│ │ ├── Images │ │
│ │ │ └── 'morphology_focus': DataTree[cyx] (1, 34152, 34213), │ │
│ │ (1, 17076, 17106), (1, 8538, 8553), (1, 4269, 4276), (1, 2134, │ │
│ │ 2138) │ │
│ │ ├── Points │ │
│ │ │ └── 'transcripts': DataFrame with shape: (, 10) │ │
│ │ (3D points) │ │
│ │ └── Shapes │ │
│ │ │ └── 'sopa_patches': GeoDataFrame shape: (36, 3) (2D │ │
│ │ shapes) │ │
│ │ with coordinate systems: │ │
│ │ │ ▸ 'global', with elements: │ │
│ │ │ │ morphology_focus (Images), transcripts (Points), │ │
│ │ sopa_patches (Shapes) │ │
│ │ sdata_path = '/raw_d… │ │
│ │ shapes_key = 'cellpose_boundaries' │ │
│ │ StainingSegmentation = <class 'sopa.segmentation.stainings.StainingSegmentation'> │ │
│ ╰───────────────────────────────────────────────────────────────────────────────────────────╯ │
│ │
│ /c4/home/bhyu0217/anaconda3/envs/sopa/lib/python3.10/site-packages/sopa/segmentation/staining │
│ s.py:92 in __init__ │
│ │
│ 89 │ │ image_channels = self.image.coords["c"].values │
│ 90 │ │ assert np.isin( │
│ 91 │ │ │ channels, image_channels │
│ ❱ 92 │ │ ).all(), f"Channel names must be a subset of: {', '.join(image_channels)}" │
│ 93 │ │
│ 94 │ def run_patch(self, patch: Polygon) -> list[Polygon]: │
│ 95 │ │ """Run segmentation on one patch │
│ │
│ ╭───────────────────────────────────────── locals ──────────────────────────────────────────╮ │
│ │ channels = ['DAPI'] │ │
│ │ clahe_kernel_size = None │ │
│ │ clip_limit = 0.2 │ │
│ │ gaussian_sigma = 1.0 │ │
│ │ image_channels = array([0]) │ │
│ │ image_key = None │ │
│ │ method = <function cellpose_patch..
at 0x7f1bf8193130> │ │
│ │ min_area = 400 │ │
│ │ sdata = SpatialData object, with associated Zarr store: │ │
│ │ /raw_data/00… │ │
│ │ ├── Images │ │
│ │ │ └── 'morphology_focus': DataTree[cyx] (1, 34152, 34213), (1, │ │
│ │ 17076, 17106), (1, 8538, 8553), (1, 4269, 4276), (1, 2134, 2138) │ │
│ │ ├── Points │ │
│ │ │ └── 'transcripts': DataFrame with shape: (, 10) (3D │ │
│ │ points) │ │
│ │ └── Shapes │ │
│ │ │ └── 'sopa_patches': GeoDataFrame shape: (36, 3) (2D shapes) │ │
│ │ with coordinate systems: │ │
│ │ │ ▸ 'global', with elements: │ │
│ │ │ │ morphology_focus (Images), transcripts (Points), sopa_patches │ │
│ │ (Shapes) │ │
│ │ self = <sopa.segmentation.stainings.StainingSegmentation object at │ │
│ │ 0x7f1c154bd810> │ │
│ ╰───────────────────────────────────────────────────────────────────────────────────────────╯ │
╰───────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: sequence item 0: expected str instance, numpy.int64 found

[Thu Aug 8 18:39:33 2024]
Error in rule patch_segmentation_cellpose:
jobid: 23
input: /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches_file_image, /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/patches
output: /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/cellpose_boundaries/10.parquet
conda-env: sopa
shell: sopa segmentation cellpose /raw_data/0007055_sf_01900_b2_c.zarr --patch-dir /raw_data/0007055_sf_01900_b2_c.zarr/.sopa_cache/cellpose_boundaries --patch-index 10 --diameter 30 --channels "DAPI" --flow-threshold 2 --cellprob-threshold -6 --min-area 400

(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)


Do you have any suggestions on how to resolve this issue?

Thank you for your time and assistance.

@quentinblampey
Collaborator

Hello @bhyu0217, thanks for reporting this bug. I think it was introduced when I updated Sopa to support the more recent versions of the Xenium data. The reader works on the new Xenium versions, but is probably broken for the older Xenium versions.

This should be a simple fix, I'll work on this in the next few days!
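
For readers following the traceback, here is a minimal sketch of why the reported error is a `TypeError` rather than the intended assertion message. With the older Xenium data, the image's channel coordinate is `array([0])` (integers), so building the message with `', '.join(image_channels)` fails before the assertion can be shown. The helper `check_channels` below is hypothetical and is not the fix that was applied in Sopa; it only illustrates one way to make such a check robust to non-string channel names.

```python
import numpy as np

# Channel coordinate as shown in the traceback locals: integers, not names.
image_channels = np.array([0], dtype=np.int64)

# Reproduce the failure: str.join only accepts string items.
try:
    ", ".join(image_channels)
except TypeError as err:
    join_error = str(err)
    print(join_error)


def check_channels(channels, image_channels):
    """Hypothetical helper: validate requested channels against the image.

    Both sides are converted to str, so integer channel coordinates
    no longer break the error message or the comparison.
    """
    available = [str(c) for c in image_channels]
    requested = [str(c) for c in channels]
    missing = [c for c in requested if c not in available]
    if missing:
        raise ValueError(
            f"Channel names must be a subset of: {', '.join(available)}"
        )
    return requested


# A channel that exists passes; 'DAPI' now yields a readable error.
print(check_channels([0], image_channels))
try:
    check_channels(["DAPI"], image_channels)
except ValueError as err:
    print(err)
```

Converting both sides to strings means `channels: [0]` and `channels: ["0"]` are treated alike in this sketch; whether that is the right behaviour for Sopa is a design choice this thread does not settle.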

@quentinblampey
Collaborator

Hi @bhyu0217, this should be fixed on the dev branch. Let me know if you want to test it out, else I'll probably make a new version release next week!

@bhyu0217
Author

Hi @quentinblampey,

Thank you so much for your response and for working on the fix.
I really appreciate it!!! I'll test it out and let you know how it goes.

@bhyu0217
Author

Hi @quentinblampey,
I have confirmed that the updated code works well with Xenium v1 and generates results without any issues.
Thank you very much for your help!
I'll go ahead and close this issue.

@quentinblampey
Collaborator

Good to hear, and thanks for noticing!
