Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Galaxy tool wrapper for Visinity #86

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from
Draft

Galaxy tool wrapper for Visinity #86

wants to merge 10 commits into from

Conversation

CameronFRWatson
Copy link
Contributor

Galaxy tool wrapper for visinity

There are two tools. One pre-processes inputs and creates an archive. The second is an interactive tool for viewing the Visinity dashboard


This PR is related to

  • Adding a new tool
  • Updating an existing tool to a newer version
  • Fixing a bug or updating just the Galaxy wrapper of an existing tool
  • Making a change to the tools-mti repo, CI, or other misc. change

If updating an existing tool to a newer major version

  • I have updated the TOOL_VERSION token in the tool's macros file
  • I have reset the VERSION_SUFFIX to 0 in the tool's macros file

Provide details here

  • Adding as a tool suite since there are two tools, although unsure if this is the right move

<macros>
<xml name="requirements">
<requirements>
<container type="docker">quay.io/sargentl/vis_test:0.0.3</container>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this still be named "test"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope! this PR isn't quite ready for merger, i still have cleanup to do. this is a fork of mine as visinity had a pretty serious show-stopping bug when i started developing this IT; they've since updated it, so my plan is to ensure the new update works then use that container

Comment on lines 9 to 11
<stdio>
<exit_code range="1:" level="fatal" description="Error occurred. Please check Tool Standard Error" />
</stdio>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed by detect_errors on the command tag.

cp inputs.json /visinity/ &&
cd /visinity/ &&
cp "$__tool_directory__/archive_generate.py" "./archive_generate.py" &&
CONDAFY="micromamba run /bin/bash -c" &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is mamba here doing?
Please note that mamba is deprecated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i used a micromamba base image for easy installation while developing, as conda was visinity's recommended installation method. re: deprecation, do you have links for that? i would love to read more and wasn't able to find much with a cursory googling. the main mamba repo seems active; the closest i could find is this conversation:

...where they debate the validity/utility of mamba, but seem to support continuing with micromamba development

what is your preference here? i don't super care what we use as long as it works.

<inputs name='inputs' filename='inputs.json' data_style='paths'/>
</configfiles>
<inputs>
<param label='Dataset Name CSV' name='dataset_names' type='data' format='csv' help="Single column CSV file with IDs for each image in the cohort under column titled names"/>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"name" could be misleading, maybe Dataset with image IDs?

</output>
<!-- FIXME: replace with chunk below once CI is upgraded to Galaxy v24 -->
<!-- <output name="out_archive">
<has_archive_member path="./config.json"/>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has_archive_member should work in older Galaxy releases I think

requests = app.test_client()

args = sys.argv[1:]
print(f"args: {args}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more debug cruft ;(

will remove

@@ -0,0 +1,69 @@
<tool id="visinity_postprocess" name="Visualize Visinity Archive" tool_type="interactive" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="22.01">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interactive tools should not go into the TS (yet). Please submit them to Galaxy directly.

Maybe it would be also nice to create a visinity.zip datatype?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good; i selected zip on the assumption that it should be a datatype already in galaxy so admins wouldn't have to add a custom datatype, but if the IT can be added to galaxy directly, so too can the datatype definition.

two concerns:

  • anyone who wants to use the tool before PR acceptance would have to pull that branch or modify galaxy manually
  • would it be backported to previous versions of galaxy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, does:

Please submit them to Galaxy directly
... mean this should NOT go in this repo? or simply that it should be omitted from TS inclusion, and still go in this repo?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a new datatype is not complicated. In your case it only changing the datatypes_conf.xml file and adding one line that subclasses from zip.

anyone who wants to use the tool before PR acceptance would have to pull that branch or modify galaxy manually

You would need to copy this file into your Galaxy, yes. Most deployments run from a fork on the main repo eitherway, I think.

would it be backported to previous versions of galaxy?

You could try. I don't think we have a rule for this yet.
ITs should go to the TS at some point we just never got time to finish that off.

... mean this should NOT go in this repo? or simply that it should be omitted from TS inclusion, and still go in this repo?

You can host it here, if you like. But it should not hit the TS. I would maintain it upstream in the Galaxy main repo.


Running Visinity in Galaxy requires execution of two separate tools in succession:

1. Create Visinity Archive
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of interest how long does the first tool take? Is it computational intensive?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a couple reasons for the two step process:

  1. as you suspected, time/compute is a factor; as an example, a medium sized invocation on an 8vCPU/64GB RAM took about 15 minutes. with the two step process, you only have to create the archive once, then the IT can start visualizations immediately, and it can be re-visualized at a later time without repeating this processing.

  2. the pre processing is much more resource intensive than visualization, and a visualization session can last hours as a user pokes around the UI; the cost savings on our AWS infrastructure is pretty big if we use a more modest instance

happy to hear thoughts/alternatives! we also wanted the visualizer to (eventually) be able to output things the user specified in the UI; we could unify the tools i suppose; something like:

  • if user is preprocessing, output an archive in addition to other output
    • user waits a while for the IT to start
  • if the user provides a preprocessed archive, skip preprocessing
    • the user does not wait a while for the IT to start

imo it comes down to opinion/convention as to which route to take. i am happy to do whatever everyone feels is best, this is merely my minimum viable product pass :D

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes total sense! We should not waste a lot of memory and CPUs for the Visualisation. Thanks for the details!

<import>macros.xml</import>
</macros>
<expand macro="requirements"/>
<expand macro="stdio"/>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<expand macro="stdio"/>

@CameronFRWatson CameronFRWatson marked this pull request as draft August 16, 2024 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants