gorule-0000057 filter lines by provided_by #1553
Comments
This looks correct. An alternative strategy would be to have MGI filter the file for only the annotations that will be imported, but it seems that having this general ability on the GOC end of things would be useful.
@ukemi Yeah, upstream filtering would be a sure way of handling this. I just automatically started porting over the pre-existing filtering functionality from
I agree with the upstream filtering; otherwise we need to restrict how Rule57 is applied, which seems to just move the problem elsewhere.
Hi @dustine32 To make it easier (or even just possible) to find any references to a rule, we've been rigorous about the format in tickets; please use gorule-nnnnnnn. Thanks, Pascale
OK. @dustine32, we will create a GPAD2.0 file that only contains the annotations made by MGI curators using the MGI editorial interface. We will put it out on the test site for you.
Hi @dustine32 What is the status of this? I suppose some version of this is done in the pipeline, but it is not documented here: https://github.com/geneontology/go-site/blob/master/metadata/rules/gorule-0000057.md
Noting that this rule is being reported in the reports: http://snapshot.geneontology.org/reports/assigned-by-gorule-report.html. Based on Dustin's answer below: should we suppress this, or is this relevant for the production code, in which case we should include tests? Other AI: clarify the formulation of the rule, mention 'filter_out' in the datasets.yaml files, and change status to implemented.
Hi @dustine32 Is this specifically applying to imports, and how is this triggered? Thanks, Pascale
@pgaudet Yep, this was proposed for the imports project but not needed. I'll close but feel free to reopen.
Thanks, we'll just make sure to remove it from the reports (not sure why it's even coming up).
There is a reports filter list (variable in the pipeline), if something needs to be disappeared.
For the MOD imports project, one requirement is that we filter the MOD GPAD to keep only lines where the Provided_by column (aka Assigned_by) equals the MOD. So only Provided_by=MGI in `mgi.gpad` or Provided_by=WB in `wb.gpad`. Provided_by=UniProt lines would be filtered out.

We can handle this by expanding on the `filter_out` pattern currently existing in the `mgi.yaml` and `wb.yaml` dataset files by adding a separate `filter_for` or `filter_in` (maybe `required_attributes`?) section.

Accepting a list of `provided_by` values will allow flexibility if we later want to start importing some other non-MOD-source lines like UniProt. I'll update the datasets.schema.yaml, mgi.yaml and wb.yaml files in a test branch.

Tagging @dougli1sqrd @kltm