Analysis refactor gui Part 2 #2068

Closed
91 commits
3601c29
fix #1505
antgonza Jan 2, 2017
0d6788e
improving some GUI stuff
antgonza Jan 3, 2017
12406cc
improving some GUI stuff - missing lines
antgonza Jan 3, 2017
958fcbe
pull upstream master
antgonza Jan 4, 2017
a57ef23
addressing all comments
antgonza Jan 5, 2017
2ead7a6
ready for review
antgonza Jan 5, 2017
931e566
New DB structure
josenavas Jan 9, 2017
a2b883e
Adding python patch
josenavas Jan 9, 2017
e737f64
Adding a biom so we can actually execute the patch
josenavas Jan 9, 2017
6bc0303
Fixing the patch to correctly transfer the information from the old s…
josenavas Jan 12, 2017
331012d
Fixing patch
josenavas Jan 12, 2017
eabe25d
Fixing patch and a few other bits to make the patch run successfully
josenavas Jan 12, 2017
0c40bc0
These files are no longer needed
josenavas Jan 12, 2017
602acc3
Removing unused code
josenavas Jan 12, 2017
ad4deea
Dropping analysis status table
josenavas Jan 12, 2017
aa3e96f
Linking the analysis with all the artifacts
josenavas Jan 12, 2017
d222e08
Fixing typo
josenavas Jan 12, 2017
14be3b0
Merge branch 'analysis-refactor-db' into analysis-refactor-fix-analysis
josenavas Jan 12, 2017
39f6beb
Fixing HTML and dbschema files
josenavas Jan 12, 2017
8c2be56
Merge branch 'analysis-refactor-db' into analysis-refactor-fix-analysis
josenavas Jan 12, 2017
4dd357c
Adding analysis jobs
josenavas Jan 13, 2017
209e838
Merge branch 'analysis-refactor-db' into analysis-refactor-fix-analysis
josenavas Jan 13, 2017
b24833d
Extending the artifact to work with the analysis
josenavas Jan 13, 2017
8b7f222
Allowing multiomics datatype
josenavas Jan 13, 2017
b381993
Adding private_job_submitter and modifying proc job handler to use it
josenavas Jan 13, 2017
ab65fa2
Adding logging column to the analysis
josenavas Jan 13, 2017
40656f2
Merge branch 'analysis-refactor-db' into analysis-refactor-fix-analysis
josenavas Jan 13, 2017
c4c8420
Adding datatype to the analysis-processing job table
josenavas Jan 15, 2017
0cd9f27
Adding REST endpoint to access the analysis metadata
josenavas Jan 15, 2017
8207513
Adding private jobs to plugin
josenavas Jan 15, 2017
f307cd4
Fixing typo
josenavas Jan 15, 2017
78c8095
Fixing analysis
josenavas Jan 16, 2017
b3936c8
Fixing the processing jobs complete
josenavas Jan 16, 2017
19c3f7d
Removing the old job code
josenavas Jan 16, 2017
055b53e
Oops removed the wrong file
josenavas Jan 16, 2017
46c322d
Removing QiitaStatusObject because it is not used
josenavas Jan 16, 2017
4d2e14d
fixing metautil
josenavas Jan 16, 2017
befb4e2
Fixing portal, setup and sql tests
josenavas Jan 16, 2017
09f2722
Fixing user and util
josenavas Jan 16, 2017
d1cdc8e
Fixing qiita_db
josenavas Jan 16, 2017
73a78e7
fix #1987
antgonza Jan 16, 2017
e64a22a
Merge pull request #2036 from antgonza/fix-1505
josenavas Jan 16, 2017
099b575
Merge branch 'master' of https://github.com/biocore/qiita into analys…
josenavas Jan 16, 2017
681e1cf
Updating to master
josenavas Jan 16, 2017
43eb1ed
Addressing @antgonza's comments
josenavas Jan 16, 2017
0807bf0
Removing unused code from qiita_ware
josenavas Jan 16, 2017
05cf8f2
Modify analysis list
josenavas Jan 16, 2017
8ddaa89
xfer analysis list and deletion
josenavas Jan 16, 2017
b879bc8
Adding Analysis summary ajax handler
josenavas Jan 17, 2017
736c2b7
Adding selected samples handler
josenavas Jan 17, 2017
0dcae8b
Merge pull request #2047 from antgonza/fix-1987
josenavas Jan 17, 2017
4a5bbbc
initial commit
antgonza Jan 18, 2017
f99975c
requested changes
antgonza Jan 18, 2017
ed899a8
Merge pull request #2049 from antgonza/add-processing-suggestions
josenavas Jan 18, 2017
d508320
fix filter job list
antgonza Jan 18, 2017
025cc1e
Merge pull request #2050 from antgonza/fix-filter-job-list
josenavas Jan 18, 2017
599bcde
Fixing server cert (#2051)
josenavas Jan 19, 2017
d12ccfe
fix get_studies
antgonza Jan 20, 2017
b33983b
flake8
antgonza Jan 20, 2017
b4f1b1f
fix #503
antgonza Jan 20, 2017
62a1b93
fix #2010
antgonza Jan 20, 2017
2e36141
fix #1913
antgonza Jan 21, 2017
e006e20
fix errors
antgonza Jan 21, 2017
c174693
Merge pull request #2052 from antgonza/fix-get_studies
josenavas Jan 23, 2017
131dd6a
Merge pull request #2053 from antgonza/fix-by-blinking
josenavas Jan 23, 2017
cbdd602
Adding util
josenavas Jan 23, 2017
ccb55bd
addressing @josenavas comment
antgonza Jan 24, 2017
dfe2e83
flake8
antgonza Jan 24, 2017
15fcceb
Merge pull request #2056 from antgonza/fix-1913
josenavas Jan 24, 2017
7529c3d
Adding analysis listing and analysis description handlers
josenavas Jan 24, 2017
9eb9dbb
fix #1066 (#2058)
antgonza Jan 26, 2017
ecacdbc
Removing debug prints
josenavas Jan 26, 2017
1f1e826
fix #1961
antgonza Jan 27, 2017
19889f9
Automatic jobs & new stats (#2057)
antgonza Jan 27, 2017
4e380e0
Merge pull request #2060 from antgonza/fix-1961
wasade Jan 28, 2017
6246eea
Fixing creation url and job html creation
josenavas Jan 29, 2017
d016833
Taking into account non-phylogenetic metrics in beta diversity
josenavas Jan 30, 2017
8d1f0c2
Merge branch 'analysis-refactor-fix-analysis' of github.com:josenavas…
josenavas Jan 30, 2017
894379c
Fixing merge conflicts
josenavas Jan 30, 2017
a80c0ac
Addressing @antgonza's comments
josenavas Feb 1, 2017
229d5e2
Fix typo
josenavas Feb 1, 2017
be0292c
Make the interface reactive for when the initial artifacts are created
josenavas Feb 1, 2017
5d63afa
Merge branch 'analysis-refactor-fix-analysis' into analysis-refactor-…
josenavas Feb 1, 2017
c571bee
Merge branch 'analysis-refactor-remove-old' into analysis-refactor-qi…
josenavas Feb 1, 2017
005fc44
Merge branch 'analysis-refactor-qiita-db' into analysis-refactor-master
josenavas Feb 1, 2017
a4fceb7
Fixing merge conflicts
josenavas Feb 1, 2017
cf4e4b2
Merge branch 'analysis-refactor-master' into analysis-refactor-gui
josenavas Feb 1, 2017
191b7ce
Merge branch 'analysis-refactor-gui' into analysis-refactor-gui-2
josenavas Feb 1, 2017
64419c8
Using vue.js to update the interface automatically
josenavas Feb 1, 2017
2d1e1bb
Improving documentation
josenavas Feb 1, 2017
05e27a9
Not solving merge conflicts cause this is going to die
josenavas Feb 9, 2017
14 changes: 14 additions & 0 deletions qiita_db/analysis.py
@@ -212,8 +212,12 @@ def delete(cls, _id):
qdb.sql_connection.TRN.add(sql, [_id])
if qdb.sql_connection.TRN.execute_fetchlast():
raise qdb.exceptions.QiitaDBOperationNotPermittedError(
<<<<<<< HEAD
"Can't delete analysis %d, has artifacts attached")
=======
"Can't delete analysis %d, has artifacts attached"
% _id)
>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89

sql = "DELETE FROM qiita.analysis_filepath WHERE {0} = %s".format(
cls._analysis_id_column)
@@ -514,6 +518,10 @@ def pmid(self, pmid):
qdb.sql_connection.TRN.add(sql, [pmid, self._id])
qdb.sql_connection.TRN.execute()

<<<<<<< HEAD
# ---- Functions ----
=======
>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89
def add_artifact(self, artifact):
"""Adds an artifact to the analysis

@@ -776,9 +784,15 @@ def build_files(self, merge_duplicated_sample_ids):
self._build_mapping_file(samples, rename_dup_samples)
biom_files = self._build_biom_tables(
grouped_samples, rename_dup_samples)
<<<<<<< HEAD

return biom_files

=======

return biom_files

>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89
def _build_biom_tables(self, grouped_samples, rename_dup_samples=False):
"""Build tables and add them to the analysis"""
with qdb.sql_connection.TRN:
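
The conflict above in delete() is purely about message formatting: only the side below the ======= marker interpolates the analysis id, while the HEAD side would raise the message with a literal, unfilled %d. A minimal standalone sketch of the resolved behavior; the exception name comes from the diff, while delete_analysis and its has_artifacts flag (standing in for the SQL check via qdb.sql_connection.TRN.execute_fetchlast()) are hypothetical:

class QiitaDBOperationNotPermittedError(Exception):
    """Raised when an operation is not permitted on a DB object."""

def delete_analysis(analysis_id, has_artifacts):
    # has_artifacts stands in for the SQL check in the real method.
    if has_artifacts:
        # HEAD omits "% analysis_id", leaving a literal %d in the
        # message; the incoming side fills it in, as done here.
        raise QiitaDBOperationNotPermittedError(
            "Can't delete analysis %d, has artifacts attached" % analysis_id)

delete_analysis(42, has_artifacts=False)  # no artifacts: nothing raised
try:
    delete_analysis(42, has_artifacts=True)
except QiitaDBOperationNotPermittedError as e:
    print(e)  # Can't delete analysis 42, has artifacts attached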
23 changes: 23 additions & 0 deletions qiita_db/artifact.py
@@ -11,6 +11,7 @@
from itertools import chain
from datetime import datetime
from os import remove
from collections import Counter

import networkx as nx

@@ -284,10 +285,17 @@ def create(cls, filepaths, artifact_type, name=None, prep_template=None,
"at least one filepath is required.")

# Check that the combination of parameters is correct
<<<<<<< HEAD
counts = Counter([bool(parents or processing_parameters),
prep_template is not None,
bool(analysis or data_type)])
if counts[True] != 1:
=======
counts = (int(bool(parents or processing_parameters)) +
int(prep_template is not None) +
int(bool(analysis or data_type)))
if counts != 1:
>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89
# More than one parameter has been provided
raise qdb.exceptions.QiitaDBArtifactCreationError(
"One and only one of parents, prep template or analysis must "
@@ -363,12 +371,24 @@ def _associate_with_analysis(instance, analysis_id):
# If an artifact has parents, it can be either from the
# processing pipeline or the analysis pipeline. Decide which
# one here
<<<<<<< HEAD
studies = set()
analyses = set()
for p in parents:
s = p.study
a = p.analysis
if s is not None:
studies.add(s.id)
if a is not None:
analyses.add(a.id)
=======
studies = {p.study for p in parents}
analyses = {p.analysis for p in parents}
studies.discard(None)
analyses.discard(None)
studies = {s.id for s in studies}
analyses = {a.id for a in analyses}
>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89

# The first 2 cases should never happen, but it doesn't hurt
# to check them
@@ -429,7 +449,10 @@ def _associate_with_analysis(instance, analysis_id):
instance.visibility = 'private'
else:
instance.visibility = 'public'
<<<<<<< HEAD
=======

>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89
elif prep_template:
# This artifact is uploaded by the user in the
# processing pipeline
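
Both sides of the first conflict in create() enforce the same invariant: exactly one of (parents or processing_parameters), prep_template, and (analysis or data_type) may be supplied. A self-contained sketch showing that the Counter formulation (HEAD) and the integer-sum formulation (incoming) agree; the function name and arguments are illustrative, not qiita's API:

from collections import Counter

def exactly_one(parents=None, processing_parameters=None,
                prep_template=None, analysis=None, data_type=None):
    flags = [bool(parents or processing_parameters),
             prep_template is not None,
             bool(analysis or data_type)]
    counter_ok = Counter(flags)[True] == 1    # HEAD formulation
    sum_ok = sum(int(f) for f in flags) == 1  # incoming formulation
    assert counter_ok == sum_ok               # equivalent by construction
    return counter_ok

print(exactly_one(prep_template='PT1'))            # True: one source given
print(exactly_one(parents=['a1'], analysis='A2'))  # False: two sources given
print(exactly_one())                               # False: nothing given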
165 changes: 165 additions & 0 deletions qiita_db/meta_util.py
@@ -145,6 +145,171 @@ def validate_filepath_access_by_user(user, filepath_id):
return False


def update_redis_stats():
"""Generate the system stats and save them in redis

<<<<<<< HEAD
# Then add the filepaths of the sample template
study = artifact.study
if study:
filepath_ids.update(
{fid
for fid, _ in study.sample_template.get_filepaths()})

# Next, analyses
# Same as before, there are public, private, and shared
analyses = qdb.analysis.Analysis.get_by_status('public') | \
user.private_analyses | user.shared_analyses

if analyses:
sql = """SELECT filepath_id
FROM qiita.analysis_filepath
WHERE analysis_id IN %s"""
sql_args = tuple([a.id for a in analyses])
qdb.sql_connection.TRN.add(sql, [sql_args])
filepath_ids.update(qdb.sql_connection.TRN.execute_fetchflatten())

return filepath_ids
=======
Returns
-------
list of str
artifact filepaths that are not present in the file system
"""
STUDY = qdb.study.Study
studies = {'public': STUDY.get_by_status('public'),
'private': STUDY.get_by_status('private'),
'sandbox': STUDY.get_by_status('sandbox')}
number_studies = {k: len(v) for k, v in viewitems(studies)}

number_of_samples = {}
ebi_samples_prep = {}
num_samples_ebi = 0
for k, sts in viewitems(studies):
number_of_samples[k] = 0
for s in sts:
st = s.sample_template
if st is not None:
number_of_samples[k] += len(list(st.keys()))

ebi_samples_prep_count = 0
for pt in s.prep_templates():
ebi_samples_prep_count += len([
1 for _, v in viewitems(pt.ebi_experiment_accessions)
if v is not None and v != ''])
ebi_samples_prep[s.id] = ebi_samples_prep_count

if s.sample_template is not None:
num_samples_ebi += len([
1 for _, v in viewitems(
s.sample_template.ebi_sample_accessions)
if v is not None and v != ''])

num_users = qdb.util.get_count('qiita.qiita_user')

lat_longs = get_lat_longs()

num_studies_ebi = len(ebi_samples_prep)
number_samples_ebi_prep = sum([v for _, v in viewitems(ebi_samples_prep)])

# generating file size stats
stats = []
missing_files = []
for k, sts in viewitems(studies):
for s in sts:
for a in s.artifacts():
for _, fp, dt in a.filepaths:
try:
fp_stat = stat(fp)
stats.append((dt, fp_stat.st_size, strftime('%Y-%m',
localtime(fp_stat.st_ctime))))
except OSError:
missing_files.append(fp)

summary = {}
all_dates = []
for ft, size, ym in stats:
if ft not in summary:
summary[ft] = {}
if ym not in summary[ft]:
summary[ft][ym] = 0
all_dates.append(ym)
summary[ft][ym] += size
all_dates = sorted(set(all_dates))

# sorting summaries
rm_from_data = ['html_summary', 'tgz', 'directory', 'raw_fasta', 'log',
'biom', 'raw_sff', 'raw_qual']
ordered_summary = {}
for dt in summary:
if dt in rm_from_data:
continue
new_list = []
current_value = 0
for ad in all_dates:
if ad in summary[dt]:
current_value += summary[dt][ad]
new_list.append(current_value)
ordered_summary[dt] = new_list

plot_order = sorted([(k, ordered_summary[k][-1]) for k in ordered_summary],
key=lambda x: x[1])

# helper function to generate y axis, modified from:
# http://stackoverflow.com/a/1094933
def sizeof_fmt(value, position):
number = None
for unit in ['', 'K', 'M', 'G', 'T', 'P', 'E', 'Z']:
if abs(value) < 1024.0:
number = "%3.1f%s" % (value, unit)
break
value /= 1024.0
if number is None:
number = "%.1f%s" % (value, 'Yi')
return number

all_dates_axis = range(len(all_dates))
plt.locator_params(axis='y', nbins=10)
plt.figure(figsize=(20, 10))
for k, v in plot_order:
plt.plot(all_dates_axis, ordered_summary[k], linewidth=2, label=k)

plt.xticks(all_dates_axis, all_dates)
plt.legend()
plt.grid()
ax = plt.gca()
ax.yaxis.set_major_formatter(mpl.ticker.FuncFormatter(sizeof_fmt))
plt.xlabel('Date')
plt.ylabel('Storage space per data type')

plot = StringIO()
plt.savefig(plot, format='png')
plot.seek(0)
img = 'data:image/png;base64,' + quote(b64encode(plot.buf))

time = datetime.now().strftime('%m-%d-%y %H:%M:%S')

portal = qiita_config.portal
vals = [
('number_studies', number_studies, r_client.hmset),
('number_of_samples', number_of_samples, r_client.hmset),
('num_users', num_users, r_client.set),
('lat_longs', lat_longs, r_client.set),
('num_studies_ebi', num_studies_ebi, r_client.set),
('num_samples_ebi', num_samples_ebi, r_client.set),
('number_samples_ebi_prep', number_samples_ebi_prep, r_client.set),
('img', img, r_client.set),
('time', time, r_client.set)]
for k, v, f in vals:
redis_key = '%s:stats:%s' % (portal, k)
# important to "flush" variables to avoid errors
r_client.delete(redis_key)
f(redis_key, v)

return missing_files
>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89


def update_redis_stats():
"""Generate the system stats and save them in redis

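
The sizeof_fmt helper in the incoming branch (adapted, per its comment, from http://stackoverflow.com/a/1094933) walks the binary unit prefixes so matplotlib's FuncFormatter can label the storage plot's y axis in human-readable sizes. A matplotlib-free check of that logic, with the body copied from the diff:

def sizeof_fmt(value, position):
    # position is unused here; mpl.ticker.FuncFormatter passes it in.
    number = None
    for unit in ['', 'K', 'M', 'G', 'T', 'P', 'E', 'Z']:
        if abs(value) < 1024.0:
            number = "%3.1f%s" % (value, unit)
            break
        value /= 1024.0
    if number is None:
        number = "%.1f%s" % (value, 'Yi')
    return number

print(sizeof_fmt(512, None))            # 512.0
print(sizeof_fmt(2048, None))           # 2.0K
print(sizeof_fmt(3 * 1024 ** 3, None))  # 3.0G

The per-portal redis keys written at the end of update_redis_stats follow the pattern <portal>:stats:<name>, and each key is deleted before being re-set so stale values are flushed.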
118 changes: 118 additions & 0 deletions qiita_db/support_files/patches/50.sql
@@ -1,3 +1,120 @@
<<<<<<< HEAD
-- Jan 5, 2017
-- Move the analysis to the plugin system. This is a major rewrite of the
-- database backend that supports the analysis pipeline.
-- After exploring the data in the database, we realized that
-- there are a lot of inconsistencies in the data. Unfortunately, this
-- makes the process of transferring the data from the old structure
-- to the new one a bit more challenging, as we will need to handle
-- different special cases. Furthermore, not all the information needed is
-- present in the database, since some of it requires checking BIOM files.
-- For these reasons, the vast majority of the data transfer is done in the
-- python patch 47.py

-- In this file we are just creating the new data structures. The old
-- data structure will be dropped in the python patch once all data has been
-- transferred.

-- Create the new data structures

-- Table that links the analysis with the initial set of artifacts
CREATE TABLE qiita.analysis_artifact (
analysis_id bigint NOT NULL,
artifact_id bigint NOT NULL,
CONSTRAINT idx_analysis_artifact_0 PRIMARY KEY (analysis_id, artifact_id)
);
CREATE INDEX idx_analysis_artifact_analysis ON qiita.analysis_artifact (analysis_id);
CREATE INDEX idx_analysis_artifact_artifact ON qiita.analysis_artifact (artifact_id);
ALTER TABLE qiita.analysis_artifact ADD CONSTRAINT fk_analysis_artifact_analysis FOREIGN KEY ( analysis_id ) REFERENCES qiita.analysis( analysis_id );
ALTER TABLE qiita.analysis_artifact ADD CONSTRAINT fk_analysis_artifact_artifact FOREIGN KEY ( artifact_id ) REFERENCES qiita.artifact( artifact_id );

-- Dropping the analysis status column because now it depends on the
-- artifacts' status, like the study does.
ALTER TABLE qiita.analysis DROP COLUMN analysis_status_id;

-- Create a table to link the analysis with the jobs that create the initial
-- artifacts
CREATE TABLE qiita.analysis_processing_job (
analysis_id bigint NOT NULL,
processing_job_id uuid NOT NULL,
CONSTRAINT idx_analysis_processing_job PRIMARY KEY ( analysis_id, processing_job_id )
) ;

CREATE INDEX idx_analysis_processing_job_analysis ON qiita.analysis_processing_job ( analysis_id ) ;
CREATE INDEX idx_analysis_processing_job_pj ON qiita.analysis_processing_job ( processing_job_id ) ;
ALTER TABLE qiita.analysis_processing_job ADD CONSTRAINT fk_analysis_processing_job FOREIGN KEY ( analysis_id ) REFERENCES qiita.analysis( analysis_id ) ;
ALTER TABLE qiita.analysis_processing_job ADD CONSTRAINT fk_analysis_processing_job_pj FOREIGN KEY ( processing_job_id ) REFERENCES qiita.processing_job( processing_job_id ) ;

-- Add a logging column in the analysis
ALTER TABLE qiita.analysis ADD logging_id bigint ;
CREATE INDEX idx_analysis_0 ON qiita.analysis ( logging_id ) ;
ALTER TABLE qiita.analysis ADD CONSTRAINT fk_analysis_logging FOREIGN KEY ( logging_id ) REFERENCES qiita.logging( logging_id ) ;

-- We can handle some of the special cases here, so we simplify the work in the
-- python patch

-- Special case 1: there are jobs in the database that do not contain
-- any information about the options used to process those parameters.
-- However, these jobs do not have any results and all are marked either
-- as queued or error, although no error log has been saved. Since these
-- jobs are mainly useless, we are going to remove them from the system
DELETE FROM qiita.analysis_job
WHERE job_id IN (SELECT job_id FROM qiita.job WHERE options = '{}');
DELETE FROM qiita.job WHERE options = '{}';

-- Special case 2: there are a fair number of jobs (719 last time I
-- checked) that are not attached to any analysis. Not sure how this
-- can happen, but these orphan jobs can't be accessed from anywhere
-- in the interface. Remove them from the system. Note that we are
-- unlinking the files but we are not removing them from the filepath
-- table. We will do that on the patch 47.py using the
-- purge_filepaths function, as it will make sure that those files are
-- not used anywhere else
DELETE FROM qiita.job_results_filepath WHERE job_id IN (
SELECT job_id FROM qiita.job J WHERE NOT EXISTS (
SELECT * FROM qiita.analysis_job AJ WHERE J.job_id = AJ.job_id));
DELETE FROM qiita.job J WHERE NOT EXISTS (
SELECT * FROM qiita.analysis_job AJ WHERE J.job_id = AJ.job_id);

-- In the analysis pipeline, an artifact can have multiple datatypes
-- (e.g. procrustes). Allow this by creating a new data_type, "Multiomic"
INSERT INTO qiita.data_type (data_type) VALUES ('Multiomic');


-- The validate command from BIOM will have an extra parameter, analysis
-- Magic number -> 4 BIOM command_id -> known for sure since it was added in
-- patch 36.sql
INSERT INTO qiita.command_parameter (command_id, parameter_name, parameter_type, required)
VALUES (4, 'analysis', 'analysis', FALSE);
-- The template command now becomes optional, since it can be added either to
-- an analysis or to a prep template. command_parameter_id known from patch
-- 36.sql
UPDATE qiita.command_parameter SET required = FALSE WHERE command_parameter_id = 34;

-- We are going to add a new special software type, and a new software.
-- This is going to be used internally by Qiita to submit the private jobs.
-- This is needed for the analysis.
INSERT INTO qiita.software_type (software_type, description)
VALUES ('private', 'Internal Qiita jobs');

DO $do$
DECLARE
qiita_sw_id bigint;
baf_cmd_id bigint;
BEGIN
INSERT INTO qiita.software (name, version, description, environment_script, start_script, software_type_id, active)
VALUES ('Qiita', 'alpha', 'Internal Qiita jobs', 'source activate qiita', 'qiita-private-2', 3, True)
RETURNING software_id INTO qiita_sw_id;

INSERT INTO qiita.software_command (software_id, name, description)
VALUES (qiita_sw_id, 'build_analysis_files', 'Builds the files needed for the analysis')
RETURNING command_id INTO baf_cmd_id;

INSERT INTO qiita.command_parameter (command_id, parameter_name, parameter_type, required, default_value)
VALUES (baf_cmd_id, 'analysis', 'analysis', True, NULL),
(baf_cmd_id, 'merge_dup_sample_ids', 'bool', False, 'False');
END $do$
=======
-- Feb 3, 2017
-- adding study tagging system

@@ -17,3 +134,4 @@ CREATE TABLE qiita.per_study_tags (
study_id bigint NOT NULL,
CONSTRAINT pk_per_study_tags PRIMARY KEY ( study_tag_id, study_id )
) ;
>>>>>>> ee170a08ec44fceb6c20b278279b8ce4b3d10a89
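
Once 50.sql creates qiita.analysis_artifact, the companion python patch (47.py, per the comments above) still has to backfill it by linking each existing analysis to its initial artifacts. A hedged psycopg2 sketch of such a backfill; the DSN and the (analysis_id, artifact_id) pairs are placeholders, since the real patch derives them from the old analysis tables and the BIOM files on disk:

import psycopg2

conn = psycopg2.connect(dbname='qiita', user='qiita')  # placeholder DSN
with conn:
    with conn.cursor() as cur:
        # Hypothetical pairs; the real patch computes these from the
        # old qiita.analysis_job / qiita.job tables.
        pairs = [(1, 10), (1, 11), (2, 12)]
        cur.executemany(
            "INSERT INTO qiita.analysis_artifact (analysis_id, artifact_id) "
            "VALUES (%s, %s)", pairs)
conn.close()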