Merge pull request galaxyproject#10175 from mvdbeek/ngram_by_default
Use tool_enable_ngram_search by default, whitespace search fix
martenson authored Sep 2, 2020
2 parents 2dc39a5 + bb4e524 commit 9d0e25b
Showing 5 changed files with 114 additions and 6 deletions.
58 changes: 57 additions & 1 deletion doc/source/admin/galaxy_options.rst
@@ -1556,6 +1556,21 @@
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~~~~~
``trs_servers_config_file``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
Allow import of workflows from the TRS servers configured in the
specified YAML or JSON file. The file should be a list with 'id',
'label', and 'api_url' for each entry. Optionally, 'link_url' and
'doc' may be specified as well for each entry.
If this is null (the default), a simple configuration containing
just Dockstore will be used.
:Default: ``trs_servers_conf.yml``
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``user_preferences_extra_conf_path``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2699,7 +2714,7 @@
Enable/disable ngram search for tools. It makes tool search
results tolerant of spelling mistakes in the query by dividing
the query into multiple ngrams and searching for each ngram.
-:Default: ``false``
+:Default: ``true``
:Type: bool


@@ -3296,6 +3311,47 @@
:Type: bool


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``simplified_workflow_run_ui``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
If set to 'off', always use the traditional workflow
form that renders all steps in the GUI and serializes the tool
state of all steps during invocation. Set to 'prefer' to default
to a simplified workflow UI that only renders the inputs if
possible (the workflow must have no disconnected runtime inputs
and no replacement parameters within tool steps). In the future
'force' may be added as an option for Galaksio-style servers that
should only render simplified workflows.
:Default: ``prefer``
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``simplified_workflow_run_ui_target_history``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
When the simplified workflow run form is rendered, should the
invocation outputs be sent to the 'current' history or a 'new'
history.
:Default: ``current``
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``simplified_workflow_run_ui_job_cache``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
When the simplified workflow run form is rendered, should the
invocation use job caching. This isn't a boolean so an option for
'show-selection' can be added later.
:Default: ``off``
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~~~~~
``myexperiment_target_url``
~~~~~~~~~~~~~~~~~~~~~~~~~~~
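
Note on ``trs_servers_config_file``: the description requires 'id', 'label', and 'api_url' for each entry and allows 'link_url' and 'doc'. A minimal sketch of checking a parsed file against that rule follows. The function name and the placeholder server values are hypothetical and are not Galaxy's own loader.

```python
# Hypothetical validator for a parsed TRS server list; not Galaxy's loader.
REQUIRED_KEYS = {"id", "label", "api_url"}


def validate_trs_servers(entries):
    """Return a list of problems found in a parsed TRS server list."""
    if not isinstance(entries, list):
        return ["top-level value must be a list"]
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected a mapping")
            continue
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {index}: missing {sorted(missing)}")
    return problems


# Placeholder values for illustration only, not a real server definition.
example = [
    {"id": "example", "label": "Example TRS", "api_url": "https://trs.example.org/api"},
    {"id": "broken", "label": "No URL"},
]
print(validate_trs_servers(example))
```

The second entry is reported because it lacks 'api_url'; the optional keys are simply not checked.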
30 changes: 29 additions & 1 deletion lib/galaxy/config/sample/galaxy.yml.sample
@@ -834,6 +834,14 @@ galaxy:
# format string as specified by ISO 8601 international standard).
#pretty_datetime_format: $locale (UTC)

# Allow import of workflows from the TRS servers configured in the
# specified YAML or JSON file. The file should be a list with 'id',
# 'label', and 'api_url' for each entry. Optionally, 'link_url' and
# 'doc' may be specified as well for each entry.
# If this is null (the default), a simple configuration containing
# just Dockstore will be used.
#trs_servers_config_file: trs_servers_conf.yml

# Location of the configuration file containing extra user
# preferences.
# The value of this option will be resolved with respect to
@@ -1346,7 +1354,7 @@ galaxy:
# Enable/disable ngram search for tools. It makes tool search results
# tolerant of spelling mistakes in the query by dividing the query
# into multiple ngrams and searching for each ngram.
-#tool_enable_ngram_search: false
+#tool_enable_ngram_search: true

# Set minimum size of ngrams
#tool_ngram_minsize: 3
@@ -1628,6 +1636,26 @@ galaxy:
# workflow.
#enable_unique_workflow_defaults: false

# If set to 'off', always use the traditional workflow form
# that renders all steps in the GUI and serializes the tool state of
# all steps during invocation. Set to 'prefer' to default to a
# simplified workflow UI that only renders the inputs if possible (the
# workflow must have no disconnected runtime inputs and no
# replacement parameters within tool steps). In the future 'force' may
# be added as an option for Galaksio-style servers that should only
# render simplified workflows.
#simplified_workflow_run_ui: prefer

# When the simplified workflow run form is rendered, should the
# invocation outputs be sent to the 'current' history or a 'new'
# history.
#simplified_workflow_run_ui_target_history: current

# When the simplified workflow run form is rendered, should the
# invocation use job caching. This isn't a boolean so an option for
# 'show-selection' can be added later.
#simplified_workflow_run_ui_job_cache: 'off'

# The URL to the myExperiment instance being used (omit scheme but
# include port).
#myexperiment_target_url: www.myexperiment.org:80
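
Note on ``tool_enable_ngram_search``: the option text says the query is divided into ngrams and each ngram is searched. The sketch below illustrates why that tolerates misspellings. It is plain Python rather than the Whoosh analyzer Galaxy uses; the minimum size of 3 matches ``tool_ngram_minsize`` above, while the maximum of 4 is an assumption made for the example.

```python
def ngrams(text, minsize=3, maxsize=4):
    """Return every substring of `text` with a length from minsize to maxsize."""
    grams = []
    for size in range(minsize, maxsize + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


correct = set(ngrams("bowtie"))
misspelled = set(ngrams("bowtei"))
# Ngrams present in both spellings still produce a match.
shared = sorted(correct & misspelled)
print(shared)  # ['bow', 'bowt', 'owt']
```

Because the transposed spelling still shares several ngrams with the correct one, a search on the misspelled query can still reach the intended tool.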
7 changes: 7 additions & 0 deletions lib/galaxy/tools/__init__.py
@@ -603,6 +603,13 @@ def is_latest_version(self):
         tool_versions = self.tool_versions
         return not tool_versions or self.version == self.tool_versions[-1]

+    @property
+    def latest_version(self):
+        if self.is_latest_version:
+            return self
+        else:
+            return self.app.tool_cache.get_tool_by_id(self.lineage.get_versions()[-1].id)
+
     @property
     def is_datatype_converter(self):
         return self in self.app.datatypes_registry.converter_tools
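
Note on ``latest_version``: the new property returns the tool itself when it is already the newest version, and otherwise looks up the last entry of its lineage in the tool cache. A minimal stand-alone sketch of that behaviour follows. ``FakeTool`` and the plain dict that stands in for the tool cache are hypothetical, and the latest-version check compares ids rather than version strings to keep the example short.

```python
class FakeTool:
    """Stand-in for a tool; only the attributes used here are modelled."""

    def __init__(self, tool_id, lineage_ids, registry):
        self.id = tool_id
        self.lineage_ids = lineage_ids  # ordered oldest first, newest last
        self.registry = registry        # plays the role of the tool cache
        registry[tool_id] = self

    @property
    def is_latest_version(self):
        return self.id == self.lineage_ids[-1]

    @property
    def latest_version(self):
        if self.is_latest_version:
            return self
        # Look up the newest lineage entry; None if it is not loaded.
        return self.registry.get(self.lineage_ids[-1])


registry = {}
ids = ["cat/1.0", "cat/2.0"]
old = FakeTool(ids[0], ids, registry)
new = FakeTool(ids[1], ids, registry)
print(old.latest_version.id, new.latest_version.id)
```

Both lookups resolve to the same object, which is what lets the indexing code ask an outdated tool whether its newest version is hidden.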
23 changes: 20 additions & 3 deletions lib/galaxy/tools/search/__init__.py
@@ -82,12 +82,31 @@ def build_index(self, tool_cache, index_help=True):
             # Index occasionally contains empty stored fields
             indexed_tool_ids = {f['id'] for f in reader.all_stored_fields() if f}
         tool_ids_to_remove = (indexed_tool_ids - set(tool_cache._tool_paths_by_id.keys())).union(tool_cache._removed_tool_ids)
+        for indexed_tool_id in indexed_tool_ids:
+            indexed_tool = tool_cache.get_tool_by_id(indexed_tool_id)
+            if not indexed_tool:
+                tool_ids_to_remove.add(indexed_tool_id)
+                continue
+            if not indexed_tool.is_latest_version and not indexed_tool.latest_version.hidden:
+                tool_ids_to_remove.add(indexed_tool_id)
         with AsyncWriter(self.index) as writer:
             for tool_id in tool_ids_to_remove:
                 writer.delete_by_term('id', tool_id)
             for tool_id in tool_cache._new_tool_ids - indexed_tool_ids:
                 tool = tool_cache.get_tool_by_id(tool_id)
                 if tool and tool.is_latest_version:
+                    if tool.hidden:
+                        # we check if there is an older tool we can return
+                        if tool.lineage:
+                            for tool_version in reversed(tool.lineage.get_versions()):
+                                tool = tool_cache.get_tool_by_id(tool_version.id)
+                                if tool and not tool.hidden:
+                                    tool_id = tool.id
+                                    break
+                            else:
+                                continue
+                        else:
+                            continue
                     add_doc_kwds = self._create_doc(tool_id=tool_id, tool=tool, index_help=index_help)
                     writer.update_document(**add_doc_kwds)
         log.debug("Toolbox index finished %s", execution_timer)
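
Note on the hidden-tool fallback: when the latest version is hidden, the added loop walks the lineage from newest to oldest and indexes the first visible version, skipping the tool entirely if none exists. The sketch below isolates that ``for``/``else`` pattern. The function name and the dict-based tool records are hypothetical simplifications.

```python
def newest_visible(versions):
    """Return the newest entry that is not hidden, or None if all are hidden."""
    for tool in reversed(versions):  # versions are ordered oldest to newest
        if not tool["hidden"]:
            break
    else:
        # Reached only when the loop finished without hitting `break`.
        return None
    return tool


versions = [
    {"id": "tool/1.0", "hidden": False},
    {"id": "tool/2.0", "hidden": True},
]
print(newest_visible(versions)["id"])
```

The ``else`` clause belongs to the ``for`` statement, so it covers both an empty lineage and a lineage in which every version is hidden.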
@@ -154,13 +173,11 @@ def search(self, q, tool_name_boost, tool_section_boost,
         og = OrGroup.factory(0.9)
         self.parser = MultifieldParser(['name', 'description', 'section', 'help', 'labels', 'stub'], schema=self.schema, group=og)
         cleaned_query = q.lower()
-        # Replace hyphens, since they are wildcards in Whoosh causing false positives
-        if cleaned_query.find('-') != -1:
-            cleaned_query = (' ').join(token.text for token in self.rex(to_unicode(cleaned_query)))
         if tool_enable_ngram_search is True:
             rval = self._search_ngrams(cleaned_query, tool_ngram_minsize, tool_ngram_maxsize, tool_search_limit)
             return rval
         else:
+            cleaned_query = ' '.join(token.text for token in self.rex(cleaned_query))
             # Use asterisk Whoosh wildcard so e.g. 'bow' easily matches 'bowtie'
             parsed_query = self.parser.parse(cleaned_query + '*')
             hits = self.searcher.search(parsed_query, limit=float(tool_search_limit), sortedby='')
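
Note on the whitespace search fix: in the non-ngram branch the query is now always tokenized and re-joined with single spaces before the trailing ``*`` wildcard is appended, instead of only when it contains a hyphen. A rough sketch of the effect follows. The ``\w+`` pattern is an assumption that stands in for ``self.rex``, whose definition is not part of this diff.

```python
import re

# Assumption: a plain word pattern stands in for the tokenizer behind self.rex.
TOKEN = re.compile(r"\w+")


def clean_query(q):
    """Lowercase the query and collapse whitespace and hyphens to single spaces."""
    return " ".join(TOKEN.findall(q.lower()))


query = clean_query("  Map   with-BWA ")
print(query + "*")  # map with bwa*
```

With stray whitespace removed, the wildcard attaches to the last real term rather than to an empty trailing token.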
2 changes: 1 addition & 1 deletion lib/galaxy/webapps/galaxy/config_schema.yml
@@ -1999,7 +1999,7 @@ mapping:
       tool_enable_ngram_search:
         type: bool
-        default: false
+        default: true
         required: false
         reloadable: true
         desc: |