Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FeatureData[Taxonomy] discrepancies between Python and CLI #136

Open
ElDeveloper opened this issue Sep 2, 2017 · 6 comments
Open

FeatureData[Taxonomy] discrepancies between Python and CLI #136

ElDeveloper opened this issue Sep 2, 2017 · 6 comments

Comments

@ElDeveloper
Copy link
Member

When I filter features based on taxonomy information using the Python API, I get the following error:

ft.methods.filter_features(q2.Artifact.load('feature-table.even.24973.qza'),
                           metadata=q2.Artifact.load('taxonomy.qza'),
                           where='Taxon LIKE "%Acinetobacter%"')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-91-c141b21f0f78> in <module>()
      1 ft.methods.filter_features(q2.Artifact.load('feature-table.even.24973.qza'),
      2                            metadata=q2.Artifact.load('taxonomy.qza'),
----> 3                            where='Taxon LIKE "%Acinetobacter%"')

<decorator-gen-236> in filter_features(table, min_frequency, max_frequency, min_samples, max_samples, metadata, where, exclude_ids)

~/miniconda/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/sdk/action.py in callable_wrapper(*args, **kwargs)
    169             user_input.update(kwargs)
    170 
--> 171             self.signature.check_types(**user_input)
    172             output_types = self.signature.solve_output(**user_input)
    173 

~/miniconda/envs/qiime2-2017.8/lib/python3.5/site-packages/qiime2/core/type/signature.py in check_types(self, **kwargs)
    283                         kwargs[name] is None):
    284                     raise TypeError("Argument to parameter %r is not a "
--> 285                                     "subtype of %r." % (name, spec.qiime_type))
    286 
    287     def solve_output(self, **input_types):

TypeError: Argument to parameter 'metadata' is not a subtype of Metadata.

Which makes sense because the artifact is not a subtype of metadata. Turns out I have to do view the artifact as Metadata in order for it to work as expected:

ft.methods.filter_features(q2.Artifact.load('feature-table.even.24973.qza'),
                           metadata=q2.Artifact.load('taxonomy.qza').view(q2.Metadata),
                           where='Taxon LIKE "%Acinetobacter%"')
Results (name = value)
-----------------------------------------------------------------------------------------------
filtered_table = <artifact: FeatureTable[Frequency] uuid: 639248d3-1160-4ebe-8062-fae85d24dbdf>

The main discrepancy comes from using this same action via the CLI, which works fine without any conversions etc:

qiime feature-table filter-features \
--i-table feature-table.even.24973.qza \
--m-metadata-file 'taxonomy.qza' \
 --p-where 'Taxon LIKE "%Acinetobacter%"' --o-filtered-table 'taxonomy.acinetobacter.qza'
Saved FeatureTable[Frequency] to: taxonomy.acinetobacter.qza
@ebolyen
Copy link
Member

ebolyen commented Sep 5, 2017

One of the reasons we didn't implement this initially is that metadata merging is also supported, and so we would either need to accept a tuple of metadata-like objects, or just stick with metadata. It is the current opinion of the framework that metadata is a primitive and so it is on the interface to get that from the user in some way.

@ElDeveloper
Copy link
Member Author

Fair enough, it was just confusing to see that this happened. Also, there were no examples in the documentation about this, so it took some fiddling to figure this out.

@jairideout
Copy link
Member

Turns out I have to do view the artifact as Metadata in order for it to work as expected:

ft.methods.filter_features(q2.Artifact.load('feature-table.even.24973.qza'),
                           metadata=q2.Artifact.load('taxonomy.qza').view(q2.Metadata),
                           where='Taxon LIKE "%Acinetobacter%"')
Results (name = value)
-----------------------------------------------------------------------------------------------
filtered_table = <artifact: FeatureTable[Frequency] uuid: 639248d3-1160-4ebe-8062-fae85d24dbdf>

Your code works but unfortunately calling Aritfact.view(Metadata) doesn't preserve provenance of the artifact. To do that you'll need to use Metadata.from_artifact, which is an alternate constructor.

I've messed this up before previously, and also ran into the case @ElDeveloper was originally reporting where I've expected the Artifact API to transparently turn an Artifact into Metadata if the operation is supported.

Perhaps Artifact.view(Metadata) should also preserve provenance (i.e. become equivalent to Metadata.from_artifact)? We could also support "auto-viewing" an Artifact as Metadata to address @ElDeveloper's issue. By implementing these two strategies, we'd continue to support the more advanced merging cases @ebolyen noted, while having sane default behavior for simpler cases.

@ElDeveloper
Copy link
Member Author

We could also support "auto-viewing" an Artifact as Metadata to address @ElDeveloper's issue.

This is really what I was expecting. I hadn't even considered that provenance would not be preserved, this probably warrants a warning(?) though it sounds a bit tough to figure out when to raise and when not to raise a warning.

@jairideout
Copy link
Member

I hadn't even considered that provenance would not be preserved, this probably warrants a warning(?) though it sounds a bit tough to figure out when to raise and when not to raise a warning.

A warning would be better than the current behavior, but ideally you'd be able to call my_artifact.view(Metadata) or Metadata.from_artifact(my_artifact) and have the provenance preserved in either case (I'd like to see those two operations become equivalent if possible).

jairideout added a commit to jairideout/qiime2 that referenced this issue Jan 25, 2018
Removed `Metadata.from_artifact` in favor of `Artifact.view(Metadata)`, which now tracks the source artifact (important for provenance).

Discussed with @ebolyen; there's no need to have two APIs that do the same thing, and using the view API is preferred since `Metadata` doesn't need to be special-cased.

Related to discussion in qiime2/q2-feature-table#136 regarding `Artifact.view(Metadata)` not tracking the source artifact, which is now fixed.
thermokarst pushed a commit to qiime2/qiime2 that referenced this issue Jan 26, 2018
#366)

Removed `Metadata.from_artifact` in favor of `Artifact.view(Metadata)`, which now tracks the source artifact (important for provenance).

Discussed with @ebolyen; there's no need to have two APIs that do the same thing, and using the view API is preferred since `Metadata` doesn't need to be special-cased.

Related to discussion in qiime2/q2-feature-table#136 regarding `Artifact.view(Metadata)` not tracking the source artifact, which is now fixed.
@jairideout
Copy link
Member

Regarding Artifact.view(Metadata), that method now tracks the source artifact for provenance. Metadata.from_artifact has been removed in favor of that API since they are now equivalent and we don't need both. See qiime2/qiime2#366 for details.

I'm leaving this issue open because the original topic was about having Artifact objects being automatically converted to Metadata objects when using the Artifact API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants