Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use reader-supplied plate names by default #4992

Merged
merged 5 commits into from
Feb 16, 2017

Conversation

melissalinkert
Copy link
Member

What this PR does

This changes how Plates are named during import, so that the reader-supplied name is used whenever possible. If neither the reader nor the user supply a Plate name, then the relative file path is used as a fallback. Naming of Image-based datasets should not have changed, but needs to be tested.

Testing this PR

Without this PR, import one plate in each of a few different formats (e.g. InCell, ScanR, Flex). Verify that the plate names in OMERO all match the relative file path of the import file. Also import a non-plate dataset in each of a few different formats (e.g. Imaris HDF, SVS).

With this PR, import the same set of plate and non-plate datasets as in the original test. The Plate names should now match the Name attribute on Plate as shown by showinf -nopix -omexml on the corresponding file. The Image names (for non-plate datasets) should be the same as before.

Finally, with this PR, import the same set of plate datasets only, but specify a plate name of your choice during import. Verify that the Plate names match your chosen names and not what is shown by showinf -nopix -omexml.

Related reading

See https://trello.com/c/XMfgOciw/189-import-plate-image-name and ome/design#57.

The user specified name is typically non-null, even if the user did not
explicitly enter a plate name.  This attempts to distinguish between
"real" user specified names and default names based upon the imported
file name.

The default names appear to come from
ome.formats.importer.ImportFixture, but changes were not made there so
as to not affect non-SPW imports.
This removes the automatic setting of user-specified names in the
importer, so that later calls to retrieve the user-specified name are
guaranteed to return only what the user actually specified (or null by
default).  Plate and Image names supplied by the reader are trusted when
the user has not specified a name.  The name(s) will be set to the
relative path of the imported file only if both the user- and
reader-specified names are null.

The issue of 255 character limits on name fields has not been addressed
here, but a comment has been added so that it's easier to go back to
later.
@jburel jburel added the develop label Dec 13, 2016
@melissalinkert
Copy link
Member Author

Full list of plate-based readers for testing:

BDReader
CellH5Reader
CellomicsReader
CellVoyagerReader
CellWorxReader
FlexReader
InCellReader
MetamorphTiffReader
MIASReader
OMETiffReader
OMEXMLReader
OperettaReader
ScanrReader
ScreenReader

@jburel
Copy link
Member

jburel commented Jan 7, 2017

@pwalczysko now that almost everybody is back
Let's prepare a sheet to review this PR so we can split the work between few people (14 formats to test)
maybe 4/5 people involved so it does not become a burden.

@pwalczysko
Copy link
Member

@jburel : Sure. will do.
Question : ....specify a plate name of your choice during import. ... how would I do that ? Pass a parameter on CLI ? Is there such an option ? Definitely not in graphical importer...

@jburel
Copy link
Member

jburel commented Jan 9, 2017

you can use the CLI option
-n NAME, --name NAME Image or plate name to use (**)

@pwalczysko
Copy link
Member

@jburel Ta. Started to work on the testing template sheet for this. Will have to delete all the non-matching lines as well as add the lines asked for in this PR (e.g. CellH5) to the sheet. Will also try to adjust import testing template. Changing the folder these sheets are pointing to to curated as this is the most authoritative folder now thanks to @sbesson's work.

@melissalinkert
Copy link
Member Author

I managed to find the original dataset for CellVoyagerReader on a backup drive. In the process of creating a configuration and uploading now, will comment with the final location once upload is complete.

@melissalinkert
Copy link
Member Author

Datasets for CellVoyagerReader uploaded to data_repo/inbox/cell-voyager, but the reader will need some work before they can be moved to from inbox to curated. It might be best to not worry about that specific format for this PR, as I expect to have a large CellVoyagerReader-specific PR opened against Bio-Formats in the next week or so. Apologies for the trouble.

@pwalczysko
Copy link
Member

@melissalinkert : Thank you. Deleted CellVoyager from the testing sheet.

@pwalczysko
Copy link
Member

pwalczysko commented Jan 11, 2017

@melissalinkert : ome-tiff is other format which I have problems to find a plate in. Tried curated/ome-tiff/ian (because there is a plate subfolder, but this is all going into Orphaned images, i.e. not being recognized properly.
@rleigh-codelibre (who helped me to find the ian subfolder, thanks) suggested to produce an ome-tiff format which is a plate from a known format (e.g. bd-pathway) by running bf-convert. Is this the way to go here ? Thank you.

@pwalczysko
Copy link
Member

@melissalinkert : ditto for ome-xml

@rleigh-codelibre
Copy link
Contributor

@pwalczysko I think bfconvert the probably the best choice in the absence of any decent OME-TIFF plate examples.

@pwalczysko
Copy link
Member

pwalczysko commented Jan 11, 2017

@melissalinkert : ScreenReader seems to be a format which needs a lot of thinking and prep. According to @simleo this is an IDR specific format, which might/might not have its predecessor (which does not really work) here on the main branch. If the IDR (metadata branch) ScreenReader testing is asked for here, we would have to think how to enable the IDR reader here on this branch, which does not seem trivial. The data example as kindly provided by @simleo would be then /ome/data_repo/from_IDR/screen/tara/TARA_HCS1_H5_G100001472_G100001473--2013_09_28_19_45_25_chamber--U00--V01.screen

@pwalczysko
Copy link
Member

@melissalinkert : So the 3 remaining formats which need to be sorted out are

@mtbc
Copy link
Member

mtbc commented Jan 11, 2017

ome-tiff is other format which I have problems to find a plate in

There are ome/data_repo/curated/ome-tiff/ome-schemas/2010-06/*screen*.ome.tiff; I don't know if those are of any use.

@pwalczysko
Copy link
Member

@mtbc : Thank you, but no, unfortunately after import the ome/data_repo/curated/ome-tiff/ome-schemas/2010-06/*screen*.ome.tiff ended up in orphaned, no plate was created.

@melissalinkert
Copy link
Member Author

For OMEXMLReader, you might try https://github.com/ome/ome-model/blob/master/specification/samples/2016-06/one-screen-one-plate-four-wells.ome.xml. Using bfconvert to create an OME-XML file from a known plate in a different format ought to work in theory, but is likely to be slow and the resulting file will be unwieldy due to how pixel data is stored.

For OMETiffReader, the datasets under data_repo/curated/ome-tiff/ruben/ should all be plates. Test_Ruben_001/W0001/P001 in particular should work and be reasonably small. Using bfconvert to create a plate file from a known plate in a different format should work as an alternative if needed.

For ScreenReader, that's an oversight on my part - I mistakenly thought we had backported some of that from IDR already. As that reader is currently turned off in Bio-Formats' readers.txt, it should be safe to omit from testing.

@pwalczysko
Copy link
Member

@melissalinkert : Great, thank you. So on https://docs.google.com/spreadsheets/d/1CewWxkNMF-mLhM-zeiC7gqPmQfl93KYNb6QjHXh-4L0/edit#gid=1959616916 we should be sorted regarding the necessary data, cc @jburel I will just implement better the workflow into the spreadsheet and we can test next wekk.

@pwalczysko
Copy link
Member

@melissalinkert :

  • Is it fair to claim that the "relative file path" (from your description, paragraph "Without this PR...") is the same thing as the filename as shown by showinf in following example ?:
    screen shot 2017-01-12 at 17 38 07

  • What is the expected behaviour in case "With this PR..." if the Name parameter is empty just like in case in screenshot below ? (Is it going to show an empty name in the OMERO.client or default back to the relative file path ?)

screen shot 2017-01-12 at 17 38 46

@melissalinkert
Copy link
Member Author

Is it fair to claim that the "relative file path" (from your description, paragraph "Without this PR...") is the same thing as the filename as shown by showinf in following example ?

Yes, that's correct.

What is the expected behaviour in case "With this PR..." if the Name parameter is empty just like in case in screenshot below ? (Is it going to show an empty name in the OMERO.client or default back to the relative file path ?)

I think the correct behavior would be to fall back to the relative file path - if an empty name is shown in Insight/web, then an additional fix is needed.

@pwalczysko
Copy link
Member

I think the correct behavior would be to fall back to the relative file path - if an empty name is shown in Insight/web, then an additional fix is needed.

Albeit I agree that this should be the correct behaviour, this will complicate our testing a bit. I suppose many of the files we are going to use for the testing have an empty Name attribute...
Would you recommend to copy the files locally, change the attribute (what is the best way to do this ?) and then import with this PR ?

@pwalczysko
Copy link
Member

Metamorph removed from the testing sheet, as discussed with @melissalinkert .

@mtbc
Copy link
Member

mtbc commented Jan 13, 2017

👍 I don't see why this PR's effects would be format-specific.

@jburel
Copy link
Member

jburel commented Jan 14, 2017

I think this is a good opportunity to review some formats (we have not done that for a long time) and see if we need to improve/add documentation
Formats import is tested with jobs but we do not really check the output of the import.

@pwalczysko
Copy link
Member

Bug: Cellworx: ./showinf -nopix -omexml /Volumes/ome/data_repo/curated/cellworx/pki/SKMEL28_10K_1.HTD shows as a "Name" attribute "pki" on the file located on squig. But after import you get names like 14-13-49.466 which seem to have been derived from the import times, as these names are changing with the time of the day with the same imported file.

Bug was first noticed by @dominikl on [line 6](https://docs.google.com/spreadsheets/d/1CewWxkNMF-mLhM-zeiC7gqPmQfl93KYNb6QjHXh-4L0/edit#gid=19596169160 of the testing sheet. The bug repeats both with Insight with Options checkbox unchecked (which works for the other formats) as well as on CLI. The bug was then fully confirmed by me.

Interestingly, the "Name" parameter on plate in the showinf output changes with changing the name of the containing folder (Dom copied the file into his locla folder "Cellworx" and since then the showinf "Name" was reporting "Cellworx".

@pwalczysko
Copy link
Member

Bug: Line 11 of the testing sheet - in case of the empty string in an OME-XML format omero is supposed to fall back to behaviour before this PR - but for OME-XML plate format this is not the case. Note that this behaviour was not tested on other formats at all, because of the lack of appropriate data.

@pwalczysko
Copy link
Member

pwalczysko commented Jan 16, 2017

Note: (maybe treat as a separate bug): Further to #4992 (comment), this behaviour is analogical for the screen, i.e. the one-screen....xml file ends up in no-name screens in omero.

screen shot 2017-01-16 at 15 54 33

This should result in plate names defaulting to the relative file name
if a reader sets the plate name to an empty string.
@melissalinkert
Copy link
Member Author

7508291 should fix the empty plate name bug noted in #4992 (comment). #4992 (comment) is specific to the CellWorx reader and will need to be fixed in a separate Bio-Formats PR.

@pwalczysko
Copy link
Member

Note: In Insight, the majority of imports (except the bugs reported above) are passing too, but only if
the "Override...." box in the "Options" tab of the Importer is unchecked, as shown on the screenshot below. This is probably worth documenting as it might help users using the feature here in graphical client.
screen shot 2017-01-16 at 16 49 37

@pwalczysko
Copy link
Member

Failed checks, snoopy says the PR was excluded.

@melissalinkert
Copy link
Member Author

Travis fixed now, so should hopefully be in tomorrow's build.

@pwalczysko
Copy link
Member

Yes, the PR is in.
The retest of the bug #4992 (comment) with empty string in Name attribute is fixed. Tested on CLI, also with -n option again, with and without a string in the Name.

@pwalczysko
Copy link
Member

So to summarize, three separate PRs were advertised as a follow-up

Do we need to document the first and third bullet point here ?
Other than that, ready to merge fmpov.

@melissalinkert
Copy link
Member Author

The CellWorX follow-up PR is now open: ome/bioformats#2731

I have the CellVoyager work in progress and expect to have a PR open by the end of this week to target Bio-Formats 5.3.3. If the work is delayed for some reason, I will open a card instead.

@jburel
Copy link
Member

jburel commented Feb 2, 2017

PR ome/bioformats#2731 has been merged and will be available in 5.3.3
We can retest when OMERO consumes 5.3.3

@bramalingam
Copy link
Member

Tested Metamorph-Tiffs and CellWorX formats. The PR in its current status looks good to merge, if the CellVoyager PR can be regarded as a Bio-Formats reader specific issue. @jburel @melissalinkert

@jburel jburel merged commit c6c9989 into ome:develop Feb 16, 2017
@bramalingam
Copy link
Member

bramalingam commented Mar 14, 2017

The CellVoyager PR was tested as well, and it works as expected. (Bioformats version: 5.4.0-SNAPSHOT revision: 03de64c8b4762583d44ddad180167bab6fb4ba70 date: 14 March 2017)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants