Extract Genomic DNA versions 3.x+ do not link with built-in genome indexes #286

Open
jennaj opened this issue Feb 19, 2020 · 11 comments
Labels
functionality usegalaxy.org tool/dependency/function fix usegalaxy.org

Comments

@jennaj
Member

jennaj commented Feb 19, 2020

Workaround for end-users:

Please use bedtools GetFastaBed instead. It performs the same basic function and offers more options.

Tool: Extract Genomic DNA using coordinates from assembled/unassembled genomes (Galaxy Version 3.0.3)

History:

  • Version 2.x (devteam original) was removed due to Python 3 incompatibilities
  • Version 3.x (iuc update) does not link up with the proper data tables as structured at usegalaxy.org. We adjusted this for version 2.x a few years ago. Maybe the same can be done again?
  • Or, perhaps all versions of the tool should be removed from the server (hidden in the tool panel), since there is a very good alternative.

cc @natefoo @mvdbeek

Related tickets:

@jennaj jennaj added the functionality usegalaxy.org tool/dependency/function fix usegalaxy.org label Feb 19, 2020
@mvdbeek
Member

mvdbeek commented Feb 19, 2020

  • Version 3.x (iuc update) does not link up with the proper data tables as structured at usegalaxy.org. We adjusted this for version 2.x a few years ago. Maybe the same can be done again?

It just needs the twobit loc entries. Are we producing them? AFAIK we also need them for Trackster.

@jennaj
Member Author

jennaj commented Mar 2, 2020

Three fixed problems, thank you!

  1. Extract Genomic DNA finds indexes at the versions available: 3.0.3 and 3.0.3+galaxy2.
  2. Versions 2.x are no longer listed in the tool panel.
  3. If a prior run using 2.x is "rerun", there is a warning and the newest version of the tool is loaded.

[Screenshot, 2020-03-02 8:13:52 AM]

However, there are two (minor) configuration problems:

  1. The older version 3.0.3 is what is loaded when clicking on the tool from the tool panel.
  2. There is no warning that the newer version 3.0.3+galaxy2 is available on top of the 3.0.3 tool form.

[Screenshot, 2020-03-02 8:05:43 AM]

Update: Correct/most current tool version now loads from the tool panel

@jennaj
Member Author

jennaj commented Dec 7, 2020

Update: Extract Genomic DNA version 3.0.3+galaxy2 is also now unlinked from indexes.

ping @davebx @natefoo

@jennaj
Member Author

jennaj commented Dec 16, 2020

Retest history: https://usegalaxy.org/u/jen-galaxyproject/h/test-extract-genomic-dna

Status:

  • hg19 with bed or gtf input, fasta output = pass

  • hg19 with bed or gtf input, interval output = pending

  • mm9 with bed or gtf input = tool does not find the mm9 built-in index (?)

[Screenshot: 20201216_test1-extractgenomicdna-mm9-not-found]

@mvdbeek
Member

mvdbeek commented Dec 16, 2020

There is no mm9 twobit. You can go to the admin panel -> Local Data -> twobit (https://usegalaxy.org/admin/data_manager/table/twobit) to check the available indexes.
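
For anyone without admin access to that page, the same check can be run against a copy of the loc file. This is a minimal sketch, not Galaxy's own loader. It assumes the twobit loc file is tab-separated with the dbkey in the first column and the path to the .2bit file in the second, and that lines starting with `#` are comments. The function names `read_twobit_loc` and `missing_dbkeys` are hypothetical.

```python
from pathlib import Path


def read_twobit_loc(loc_path):
    """Return {dbkey: path} parsed from a twobit .loc file.

    Assumed layout (not confirmed in this thread): one entry per line,
    <dbkey><TAB><path to .2bit>; blank lines and '#' comments are skipped.
    """
    entries = {}
    for line in Path(loc_path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # malformed line: skip it rather than guess
        entries[fields[0]] = fields[1]
    return entries


def missing_dbkeys(loc_path, wanted):
    """Return the dbkeys from `wanted` that have no entry in the .loc file."""
    present = read_twobit_loc(loc_path)
    return [dbkey for dbkey in wanted if dbkey not in present]
```

Calling `missing_dbkeys("twobit.loc", ["hg19", "mm9", "hg38"])` on a local copy would list whichever of those builds has no twobit entry.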

@jennaj
Member Author

jennaj commented Dec 16, 2020

@mvdbeek thank you, and agreed. The larger problem is that existing twoBit data were not migrated out of the byhand dbkeys. Most, if not all, have the twoBit in the seq directory.

This is probably also the root cause of the Trackster issue (#276). Punch line: Trackster cannot recognize a dataset to add into an existing session, even when the dbkey/database metadata match. A single-dataset Trackster session is the only way it works for many of the current major genomes, since they were indexed before DMs existed.

@jennaj
Member Author

jennaj commented Dec 18, 2020

Update: More genomes with existing twoBit data in CVMFS byhand (../dbkey/seq/dbkey.2bit) are being populated into the proper loc by @davebx.

hg19 and mm9 were both already specifically done. hg19 issues were more complex -- but mm9 is a "how-to-fix-it" model for the others.
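
The migration described above could be sketched roughly as follows. This is an illustration only, not the procedure @davebx actually used. It assumes the byhand tree follows the `<dbkey>/seq/<dbkey>.2bit` layout mentioned in this comment and that the loc file takes tab-separated `<dbkey>`, `<path>` lines. The function names and the `root` argument are hypothetical.

```python
from pathlib import Path


def find_byhand_twobits(root):
    """Yield (dbkey, path) for each <root>/<dbkey>/seq/<dbkey>.2bit that exists."""
    for genome_dir in sorted(Path(root).iterdir()):
        if not genome_dir.is_dir():
            continue
        dbkey = genome_dir.name
        candidate = genome_dir / "seq" / f"{dbkey}.2bit"
        if candidate.is_file():
            yield dbkey, str(candidate)


def loc_lines(root):
    """Build candidate loc lines (assumed format: dbkey, TAB, path)."""
    return [f"{dbkey}\t{path}" for dbkey, path in find_byhand_twobits(root)]
```

Printing `loc_lines(root)` for the byhand directory would give a list of candidate entries to review before anything is written to the real loc file.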

@jennaj
Member Author

jennaj commented Jan 7, 2021

Fixed, thanks all

@jennaj jennaj closed this as completed Jan 7, 2021
@jennaj jennaj added test/retest-pass passed retest and removed test/retest-do active tests labels Jan 7, 2021
@jennaj jennaj reopened this Jan 4, 2022
@jennaj
Member Author

jennaj commented Jan 4, 2022

Update 2022-01-04

Some indexes are missing again at .org.

Example: hg38 missing, mm9 found

Test histories

ping @mvdbeek @davebx @natefoo

@jennaj
Member Author

jennaj commented Jan 6, 2022

@davebx -- I also checked the test server, and that particular 2bit is missing (weird). I created that file when first indexing hg38, and we still have it in CVMFS. I thought it had already been migrated from the old to the new loc, but maybe it was missed. I don't see any others that are obviously missing.

So, for now, just this line needs to be added to the byhand twobit.loc. I should still be able to do that, and will send an SOS if I get stuck: /cvmfs/data.galaxyproject.org/managed/seq/hg38.2bit
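
Adding that single entry without creating a duplicate might look like the sketch below. It is a rough illustration under stated assumptions: the loc file is taken to be tab-separated with the dbkey in the first column, and `ensure_loc_entry` is a hypothetical helper, not part of Galaxy.

```python
from pathlib import Path


def ensure_loc_entry(loc_path, dbkey, twobit_path):
    """Append '<dbkey><TAB><twobit_path>' unless dbkey already has an entry.

    Returns True if a line was added, False if the dbkey was already present.
    The tab-separated layout is an assumption, not confirmed in this thread.
    """
    loc = Path(loc_path)
    existing = loc.read_text() if loc.exists() else ""
    for line in existing.splitlines():
        if line.startswith("#"):
            continue
        if line.split("\t")[0] == dbkey:
            return False
    # Start on a fresh line if the file does not already end with a newline.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with loc.open("a") as handle:
        handle.write(f"{prefix}{dbkey}\t{twobit_path}\n")
    return True
```

For the case above, the call would be `ensure_loc_entry("twobit.loc", "hg38", "/cvmfs/data.galaxyproject.org/managed/seq/hg38.2bit")`. Running it a second time leaves the file unchanged.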

(Reminder for Jen: refgenie isn't configured to create 2bit files -- only lastz and extract still use it -- maybe we deprecate to save space (ask Dan) http://datacache.galaxyproject.org/refgenomes-databio/)

@jennaj jennaj removed the test/retest-pass passed retest label Jan 11, 2022
@jennaj
Member Author

jennaj commented Jun 16, 2022

The tool indexes are missing again. Not sure if this is related to recent server updates.

Should we deprecate this tool? Do any others besides lastz use .2bit indexes?
