Logging tools run with defaults vs customised settings #2798

Open
peterjc opened this issue Aug 16, 2016 · 10 comments

Comments

@peterjc
Contributor

peterjc commented Aug 16, 2016

Suggested by @nickschurch during the Q&A session at a NextGenBUG meeting in Scotland,
http://nextgenbug.org/next-meeting-glasgow-16th-august-2016 https://twitter.com/pjacock/status/765565553692082177

Many Galaxy tools have an "advanced options" setup as well, and it would be interesting to record how often this is used too.

(I'm envisioning this being exposed in the existing Galaxy reports system somehow)

@nsoranzo
Member

I think this is in the scope of the Galactic Radio Telescope, ping @erasche.

@peterjc
Contributor Author

peterjc commented Aug 16, 2016

@martenson wrote:

This can be extracted from existing jobinfo/database. Not sure how this would be used as a feature. In Reports?

https://twitter.com/martenson/status/765567740736405504

Galaxy does record the settings used, but I suspect it is not explicit whether they were defaults or not. Defaults can also change as the tool XML is updated, so it might not be possible to get this from the existing database logging.

@martenson
Member

martenson commented Aug 16, 2016

I think Galaxy does record these because it has the re-run ability that fills in the used settings.

edit: I misread your comment, please disregard mine @peterjc

@martenson
Member

It is probably not explicit whether the settings were defaults but you could cross-check it with the xml of the tool itself which has this information.

@nickschurch

It records the command parameters, but does it explicitly record whether
any of the 'advanced', non-required default parameters were changed by the
user or not?

On Tue, 16 Aug 2016 at 16:21 Martin Cech notifications@github.com wrote:

I think Galaxy does record these because it has the re-run ability that
fills in the used settings.



@peterjc
Contributor Author

peterjc commented Aug 16, 2016

https://github.com/peterjc/pico_galaxy/blob/master/tools/protein_analysis/signalp3.xml is a simple example with three input parameters:

<param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/>

It probably doesn't make sense to talk about a default for files (although you could interpret this as the most recent file of that type, as per the UI behaviour).

<param name="organism" type="select" display="radio" label="Organism">
    <option value="euk">Eukaryote</option>
    <option value="gram+">Gram positive</option>
    <option value="gram-">Gram negative</option>
</param>

Here the default is euk, versus two non-default choices gram+ and gram-.

<param name="truncate" type="integer" label="Truncate sequences to this many amino acids" value="70" help="Use zero for no truncation, maximum value 6000">
    <validator type="in_range" min="0" max="6000" message="Truncation value should be at most 6000. Use zero for no truncation."/>
</param>

Here the default is 70 and anything else would be non-default.

Galaxy could record the settings in the database (as used for re-running the tool, or building a workflow), but I don't think it records if they were default values or not. Currently you'd have to refer to the tool XML file to know the defaults were euk and 70.
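
As a rough illustration of what "refer to the tool XML" could look like, here is a minimal sketch that reads the defaults out of the three parameters quoted above. It is not Galaxy's own parser: the `tool_defaults` function name is hypothetical, the `<inputs>` wrapper is added only so the snippet parses as one document, and the rule "first option wins when none is marked `selected`" is an assumption that happens to match the `euk` default described here. Real tools with conditionals, repeats, or dynamic options would need more care.

```python
import xml.etree.ElementTree as ET

# The three parameters from signalp3.xml as quoted in this comment,
# wrapped in <inputs> so they form a single XML document.
TOOL_INPUTS = """
<inputs>
  <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/>
  <param name="organism" type="select" display="radio" label="Organism">
    <option value="euk">Eukaryote</option>
    <option value="gram+">Gram positive</option>
    <option value="gram-">Gram negative</option>
  </param>
  <param name="truncate" type="integer" label="Truncate sequences to this many amino acids" value="70"/>
</inputs>
"""


def tool_defaults(inputs_xml):
    """Return {param name: default as a string, or None if no default applies}."""
    defaults = {}
    for param in ET.fromstring(inputs_xml).iter("param"):
        name = param.get("name")
        ptype = param.get("type")
        if ptype == "data":
            # Input datasets have no meaningful default.
            defaults[name] = None
        elif ptype == "select":
            options = param.findall("option")
            selected = [o for o in options if o.get("selected") == "true"]
            # Assumption: with no option marked selected, the first one is the default.
            chosen = selected[0] if selected else (options[0] if options else None)
            defaults[name] = chosen.get("value") if chosen is not None else None
        else:
            defaults[name] = param.get("value")
    return defaults


print(tool_defaults(TOOL_INPUTS))
# {'fasta_file': None, 'organism': 'euk', 'truncate': '70'}
```

Note that this only tells you the defaults of the XML version you parse, so it would need to be run against the tool version that the job actually used.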

@hexylena
Member

@peterjc definitely in the scope of GRT. Here's the dev instance https://oc.hx42.org/grt/, which does not currently expose this information, but that is on the roadmap. I would like to be able to see and analyse all of the parameters that people are running my tool with and break them down.

The hope is that people will contribute their job run data (maybe removing private tools / other parameters they deem private), and we will be able to share the rest of the information with the world for analysis (input file sizes, runtime, allocations, parameters). Then you'd be able to take a copy of the data, and analyse it however you want, e.g. tables of % of people using which parameters.
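
To make the "% of people using which parameters" idea concrete, here is a small sketch of that kind of breakdown. The record layout (one dict of parameter name to value per job) and the `usage_percentages` function are assumptions for illustration only; they do not reflect the actual GRT export format.

```python
from collections import Counter, defaultdict

# Hypothetical contributed job records: one dict of parameter values per job.
jobs = [
    {"organism": "euk", "truncate": "70"},
    {"organism": "euk", "truncate": "70"},
    {"organism": "gram-", "truncate": "70"},
    {"organism": "euk", "truncate": "0"},
]


def usage_percentages(jobs):
    """Return {param: {value: percentage of jobs that used that value}}."""
    counts = defaultdict(Counter)
    for job in jobs:
        for name, value in job.items():
            counts[name][value] += 1
    return {
        name: {value: 100.0 * n / sum(counter.values()) for value, n in counter.items()}
        for name, counter in counts.items()
    }


for name, table in usage_percentages(jobs).items():
    print(name, table)
```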

@peterjc
Contributor Author

peterjc commented Aug 16, 2016

Nick and I suspect you could infer the default settings from that kind of large dataset, simply because we expect them to be the most commonly used (although perhaps that might have to exclude tools run via workflows which could set different values).
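
A minimal sketch of that inference, assuming each run record carries a flag saying whether it came from a workflow (the `via_workflow` field, the record layout, and the `infer_defaults` name are all hypothetical). It takes the most common value per parameter among interactive runs only, and it is only a heuristic: a popular non-default value would fool it.

```python
from collections import Counter

# Hypothetical run records; workflow runs are flagged so they can be excluded.
runs = [
    {"via_workflow": False, "params": {"organism": "euk", "truncate": "70"}},
    {"via_workflow": False, "params": {"organism": "euk", "truncate": "70"}},
    {"via_workflow": False, "params": {"organism": "gram-", "truncate": "70"}},
    {"via_workflow": True, "params": {"organism": "gram+", "truncate": "0"}},
    {"via_workflow": True, "params": {"organism": "gram+", "truncate": "0"}},
    {"via_workflow": True, "params": {"organism": "gram+", "truncate": "0"}},
]


def infer_defaults(runs):
    """Guess each parameter's default as its most common value in non-workflow runs."""
    counters = {}
    for run in runs:
        if run["via_workflow"]:
            continue  # workflows may pin their own values, so skip them
        for name, value in run["params"].items():
            counters.setdefault(name, Counter())[value] += 1
    return {name: counter.most_common(1)[0][0] for name, counter in counters.items()}


print(infer_defaults(runs))
# {'organism': 'euk', 'truncate': '70'}
```

In this made-up data the workflow runs would otherwise make `gram+` look like the most common organism, which is why they are filtered out first.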

@hexylena
Member

I'll see if we can start tagging workflow-based tools.

@mvdbeek
Member

mvdbeek commented Oct 9, 2018

It is probably not explicit whether the settings were defaults but you could cross-check it with the xml of the tool itself which has this information.

That's correct, but as long as the tool is still installed it's trivial to get the default values. In fact I think it would be good if, by default, we had a subtle highlight of the non-default settings.
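
A sketch of the comparison such a highlight would rest on, assuming the defaults are already known. The `defaults` dict below is written out by hand from the signalp3.xml parameters quoted earlier in this thread, and `non_default_settings` and the `used` record are hypothetical names, not Galaxy API. Values are compared as strings because XML attributes are strings while recorded settings may be typed.

```python
def non_default_settings(defaults, used):
    """Return the subset of used settings that differ from a known default."""
    return {
        name: value
        for name, value in used.items()
        # Parameters without a default (e.g. input datasets) are never flagged.
        if defaults.get(name) is not None and str(value) != str(defaults[name])
    }


# Defaults taken from the signalp3.xml example: organism=euk, truncate=70.
defaults = {"fasta_file": None, "organism": "euk", "truncate": "70"}

# Hypothetical recorded settings for one job.
used = {"fasta_file": "dataset 12", "organism": "gram-", "truncate": 70}

print(non_default_settings(defaults, used))
# {'organism': 'gram-'}
```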
