-
Notifications
You must be signed in to change notification settings - Fork 80
Redbiom initial code #2282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redbiom initial code #2282
Conversation
Codecov Report
@@ Coverage Diff @@
## dev #2282 +/- ##
==========================================
+ Coverage 93.78% 93.81% +0.03%
==========================================
Files 161 163 +2
Lines 18249 18427 +178
==========================================
+ Hits 17115 17288 +173
- Misses 1134 1139 +5
Continue to review full report at Codecov.
|
qiita_pet/handlers/qiita_redbiom.py
Outdated
|
||
if message == '': | ||
config = redbiom.get_config() | ||
get = redbiom._requests.make_get(config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a private module, it should not be necessary to call this method
qiita_pet/handlers/qiita_redbiom.py
Outdated
'Not a valid search: "%s", your query is too small ' | ||
'(too few letters), try a longer query' % query) | ||
|
||
if message == '': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you describe in english what the goal of this block is? it looks like the intention is to establish a dict mapping a study, to the artifacts contained in that study that overlap with requested samples? if yes, wouldn't it be easier to just query the qiita API for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm also a bit confused as to why the artifact IDs are necessary? I'm concerned with the subsequent code as it depends on the structure of redbiom IDs which we do not make guarantees about. If this is necessary (which I think can be handled by cross referencing with the Qiita API), can you please open an issue so that we can construct an API endpoint to retrieve the tags from a set of redbiom IDs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right! got it now, this was my first interaction with redbiom and you can tell. updating and simplifying, should also address the next comments.
qiita_pet/handlers/qiita_redbiom.py
Outdated
for idx in redbiom.util.ids_from(query, True, 'feature', | ||
ctx): | ||
aid, sid = idx.split('_', 1) | ||
sid = sid.split('.', 1)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sample IDs in qiita are not prefixed with the qiita study id?
qiita_pet/handlers/qiita_redbiom.py
Outdated
ctx): | ||
aid, sid = idx.split('_', 1) | ||
sid = sid.split('.', 1)[0] | ||
study_artifacts[sid].append(aid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, understanding this more. An API method should be exposed in redbiom which yields a the set of observed tags for a given iterable of redbiom IDs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, but for the time being, this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wasade, thanks!
qiita_pet/handlers/qiita_redbiom.py
Outdated
'Not a valid search: "%s", your query is too small ' | ||
'(too few letters), try a longer query' % query) | ||
|
||
if message == '': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right! got it now, this was my first interaction with redbiom and you can tell. updating and simplifying, should also address the next comments.
qiita_pet/handlers/qiita_redbiom.py
Outdated
ctx): | ||
aid, sid = idx.split('_', 1) | ||
sid = sid.split('.', 1)[0] | ||
study_artifacts[sid].append(aid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, but for the time being, this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few comments
qiita_db/util.py
Outdated
del info["pi_email"] | ||
del info["pi_name"] | ||
|
||
infolist.append({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you do this:
info['status'] = qdb.study.Study(info['study_id']).status
infolist.append(info)
you get the same result right? It looks like you're creating a dictionary by using the same key values that info
has. If the problem is that info
has more keys than you need, I think the solution is to delete those keys (as you are doing a couple of lines above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
qiita_pet/handlers/qiita_redbiom.py
Outdated
message = ( | ||
'Not a valid search: "%s", are you sure this is a ' | ||
'valid metadata %s?' % ( | ||
query, 'value' if search_on == 'metadata' else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if on this line is redundant (you are sure that search_on == 'metadata'
-> see line 43
qiita_pet/handlers/qiita_redbiom.py
Outdated
message = ( | ||
'Not a valid search: "%s", your query is too small ' | ||
'(too few letters), try a longer query' % query) | ||
if message == '': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use if not message
like in line 40, for consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, you can use try/except/else
:
In [1]: def f(s):
...: try:
...: if s == 'a':
...: raise ValueError('Error!')
...: except ValueError:
...: print "Error happened"
...: else:
...: print "No error"
...:
In [2]: f('a')
Error happened
In [3]: f('b')
No error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
qiita_pet/handlers/qiita_redbiom.py
Outdated
self.render('redbiom.html') | ||
|
||
@execute_as_transaction | ||
def _redbiom_search(self, query, search_on, callback): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some trouble following this function, basically because this function is performing 3 different operations, depending on the value of search_on
. I would create smaller functions that perform the single operation, so the code is easier to read. Something along this lines:
def _redbiom_metadata_search(self, query, contexts):
...
return message, study_artifacts
def _redbiom_feature_search(self, query, contexts):
...
return message, study_artifacts
def _redbiom_taxon_search(self, query, contexts):
...
return message, study_artifacts
Then, this function can be simplified as shown below. I know the function is incomplete here but I think this exemplifies the structure that I'm proposing:
def _redbiom_search(self, query, search_on, callback):
search_f = {'metadata': self._redbiom_metadata_search,
'feature': _redbiom_feature_search,
'taxon': _redbiom_taxon_search}
try:
df = readbiom.summarize.contexts()
except ConnectionError:
message = '...'
else:
contexts = df.ContextName.values
message, study_artifacts = search_f(query, contexts)
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
qiita_pet/handlers/qiita_redbiom.py
Outdated
message = ('Incorrect search by: you can use metadata, ' | ||
'features or taxon and you passed: %s' % search_on) | ||
|
||
if message == '': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use if not message
qiita_pet/handlers/qiita_redbiom.py
Outdated
@coroutine | ||
@execute_as_transaction | ||
def post(self, search): | ||
search = self.get_argument('search', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like search
and search_on
are not really optional parameters.
I think the correct thing to do here is:
search = self.get_argument('search')
search_on = self.get_argument('search_on')
and let tornado handle if the arguments are not passed. It will return a 400 error message which is the correct code given that the request is incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
@@ -115,6 +140,36 @@ function show_hide_process_list() { | |||
} | |||
} | |||
|
|||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation doesn't fit function signature. Can you update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
$(button).addClass("btn-info"); | ||
bootstrapAlert('We are adding ' + aids.length + ' artifact(s) to the analysis. This ' + | ||
'might take some time based on the number of samples on each artifact.', "warning", 10000); | ||
$.get('/artifact/samples/', {ids:aids}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Samples are added using a get
? That is confusing... get
should be use for read-only operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not actually adding the samples, it's requesting the sample names from the artifacts to add. The addition is actually done in this line: qiita_websocket.send('sel', data['data']);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
qiita_pet/templates/redbiom.html
Outdated
{% if current_user is not None %} | ||
return '<div class="container" style="max-width: 5em;">'+ | ||
'<div class="row justify-content-md-center">' + | ||
'<div class="col-md-1 text-center details-control"> </div>' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like these two blocks are the same except for this line? You can then do:
return '<div class="container" style="max-width: 5em;">'+
'<div class="row justify-content-md-center">' +
{% if current_user is not None %}
'<div class="col-md-1 text-center details-control"> </div>' +
{% end %}
'<div class="col-md-1 text-center">' + data.length + '</div>' +
'</div>' +
'</div>';
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
|
||
{%block content%} | ||
<small> | ||
Redbiom only searches on public data and the data is updated nightly. Note that you will only be able to expand and add artifacts to analyses if you are signed into Qiita. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this accurate? @wasade will it be possible to update the redis db every night?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have not touched load mechanisms since June, and there is work to do there, but it should be feasible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's a discussion for next week's meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @josenavas
qiita_db/util.py
Outdated
del info["pi_email"] | ||
del info["pi_name"] | ||
|
||
infolist.append({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
qiita_pet/handlers/qiita_redbiom.py
Outdated
self.render('redbiom.html') | ||
|
||
@execute_as_transaction | ||
def _redbiom_search(self, query, search_on, callback): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
qiita_pet/handlers/qiita_redbiom.py
Outdated
message = ( | ||
'Not a valid search: "%s", your query is too small ' | ||
'(too few letters), try a longer query' % query) | ||
if message == '': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
qiita_pet/handlers/qiita_redbiom.py
Outdated
@coroutine | ||
@execute_as_transaction | ||
def post(self, search): | ||
search = self.get_argument('search', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
@@ -115,6 +140,36 @@ function show_hide_process_list() { | |||
} | |||
} | |||
|
|||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
$(button).addClass("btn-info"); | ||
bootstrapAlert('We are adding ' + aids.length + ' artifact(s) to the analysis. This ' + | ||
'might take some time based on the number of samples on each artifact.', "warning", 10000); | ||
$.get('/artifact/samples/', {ids:aids}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not actually adding the samples, it's requesting the sample names from the artifacts to add. The addition is actually done in this line: qiita_websocket.send('sel', data['data']);
qiita_pet/templates/redbiom.html
Outdated
{% if current_user is not None %} | ||
return '<div class="container" style="max-width: 5em;">'+ | ||
'<div class="row justify-content-md-center">' + | ||
'<div class="col-md-1 text-center details-control"> </div>' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
|
||
{%block content%} | ||
<small> | ||
Redbiom only searches on public data and the data is updated nightly. Note that you will only be able to expand and add artifacts to analyses if you are signed into Qiita. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's a discussion for next week's meeting.
$(button).addClass("btn-info"); | ||
bootstrapAlert('We are adding ' + aids.length + ' artifact(s) to the analysis. This ' + | ||
'might take some time based on the number of samples on each artifact.', "warning", 10000); | ||
$.get('/artifact/samples/', {ids:aids}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation!
#2278 needs to be merged first.
Extra from adding redbiom: