-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NEW: Adds core beta diversity measures #6
Changes from all commits
06d4d61
ef17a8b
4da3de4
b294120
5ac73ad
7205ed2
fb4e59a
9d04464
d9b9b91
a390a5f
7db2de7
1b75089
aef7749
4e81604
cdcda1c
ee5eaf2
bec18de
2b1ad18
1ebe57b
96e4388
e244509
27df8f0
fd7c695
2bd531a
9744ccd
e8c74f2
d59aab3
e06eb57
e85651e
c4c91e5
acc4daa
c33b977
0e3ec5e
ab22b7f
a96c04d
229e79f
ad5967b
ae674f3
fd0e24e
47daa08
750836d
6fcd5b7
f3e9056
687401d
d5fdd8e
b9f5168
b662efd
226be5b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
name: lint-build-test | ||
# build on every PR and commit to master | ||
on: | ||
pull_request: | ||
push: | ||
branches: | ||
- master | ||
|
||
jobs: | ||
lint: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: set up python 3.6 | ||
uses: actions/setup-python@v1 | ||
with: | ||
python-version: 3.6 | ||
- name: install dependencies | ||
run: python -m pip install --upgrade pip | ||
- name: lint | ||
run: | | ||
pip install -q https://github.com/qiime2/q2lint/archive/master.zip | ||
q2lint | ||
pip install -q flake8 | ||
flake8 | ||
|
||
build-and-test: | ||
needs: lint | ||
strategy: | ||
matrix: | ||
os: [ubuntu-latest, macos-latest] | ||
runs-on: ${{ matrix.os }} | ||
steps: | ||
- uses: actions/checkout@v2 | ||
with: | ||
fetch-depth: 0 | ||
# for versioneer | ||
- run: git fetch --depth=1 origin +refs/tags/*:refs/tags/* | ||
- uses: qiime2/action-library-packaging@alpha1 | ||
with: | ||
plugin-name: q2-diversity-lib | ||
additional-tests: pytest --pyargs q2_diversity_lib |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -73,3 +73,7 @@ node_modules | |
|
||
# VSCode dotfiles | ||
.vscode/* | ||
*.code-workspace | ||
|
||
# project notes | ||
notes/ |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# q2-diversity-lib | ||
|
||
[![Build Status](https://travis-ci.org/qiime2/q2-diversity-lib.svg?branch=master)](https://travis-ci.org/qiime2/q2-diversity-lib) | ||
![](https://github.com/qiime2/q2-diversity-lib/workflows/lint-build-test/badge.svg) | ||
[![Coverage Status](https://coveralls.io/repos/github/qiime2/q2-diversity-lib/badge.svg?branch=master)](https://coveralls.io/github/qiime2/q2-diversity-lib?branch=master) | ||
|
||
This is a QIIME 2 plugin. For details on QIIME 2, see https://qiime2.org. |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -6,31 +6,22 @@ | |||||
# The full license is in the file LICENSE, distributed with this software. | ||||||
# ---------------------------------------------------------------------------- | ||||||
|
||||||
import biom | ||||||
import pandas as pd | ||||||
import skbio.diversity | ||||||
import biom | ||||||
from unifrac import faith_pd as f_pd | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
from ._util import _drop_undefined_samples, _disallow_empty_tables | ||||||
from q2_types.feature_table import BIOMV210Format | ||||||
from q2_types.tree import NewickFormat | ||||||
from ._util import (_drop_undefined_samples, | ||||||
_disallow_empty_tables) | ||||||
|
||||||
|
||||||
@_disallow_empty_tables | ||||||
def faith_pd(table: biom.Table, phylogeny: skbio.TreeNode) -> pd.Series: | ||||||
presence_absence_table = table.pa() | ||||||
counts = presence_absence_table.matrix_data.toarray().astype(int).T | ||||||
sample_ids = presence_absence_table.ids(axis='sample') | ||||||
feature_ids = presence_absence_table.ids(axis='observation') | ||||||
|
||||||
try: | ||||||
result = skbio.diversity.alpha_diversity(metric='faith_pd', | ||||||
counts=counts, | ||||||
ids=sample_ids, | ||||||
otu_ids=feature_ids, | ||||||
tree=phylogeny) | ||||||
except skbio.tree.MissingNodeError as e: | ||||||
message = str(e).replace('otu_ids', 'feature_ids') | ||||||
message = message.replace('tree', 'phylogeny') | ||||||
raise skbio.tree.MissingNodeError(message) | ||||||
|
||||||
def faith_pd(table: BIOMV210Format, phylogeny: NewickFormat) -> pd.Series: | ||||||
table_str = str(table) | ||||||
tree_str = str(phylogeny) | ||||||
Comment on lines
+22
to
+23
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Stylistically, this doesn't match what you have done in the unifrac methods, below. I like what you've done in those unifrac methods, by in-lining the |
||||||
result = f_pd(table_str, tree_str) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
result.name = 'faith_pd' | ||||||
return result | ||||||
|
||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
# ---------------------------------------------------------------------------- | ||
# Copyright (c) 2018-2020, QIIME 2 development team. | ||
# | ||
# Distributed under the terms of the Modified BSD License. | ||
# | ||
# The full license is in the file LICENSE, distributed with this software. | ||
# ---------------------------------------------------------------------------- | ||
|
||
import biom | ||
import skbio.diversity | ||
import sklearn.metrics | ||
import unifrac | ||
|
||
from q2_types.feature_table import BIOMV210Format | ||
from q2_types.tree import NewickFormat | ||
from ._util import (_disallow_empty_tables, | ||
_validate_requested_cpus) | ||
|
||
|
||
# --------------------Non-Phylogenetic----------------------- | ||
@_disallow_empty_tables | ||
@_validate_requested_cpus | ||
def bray_curtis(table: biom.Table, n_jobs: int = 1) -> skbio.DistanceMatrix: | ||
thermokarst marked this conversation as resolved.
Show resolved
Hide resolved
|
||
counts = table.matrix_data.toarray().T | ||
sample_ids = table.ids(axis='sample') | ||
return skbio.diversity.beta_diversity( | ||
metric='braycurtis', | ||
counts=counts, | ||
ids=sample_ids, | ||
validate=True, | ||
pairwise_func=sklearn.metrics.pairwise_distances, | ||
n_jobs=n_jobs | ||
) | ||
|
||
|
||
@_disallow_empty_tables | ||
@_validate_requested_cpus | ||
def jaccard(table: biom.Table, n_jobs: int = 1) -> skbio.DistanceMatrix: | ||
counts = table.matrix_data.toarray().T | ||
sample_ids = table.ids(axis='sample') | ||
return skbio.diversity.beta_diversity( | ||
metric='jaccard', | ||
counts=counts, | ||
ids=sample_ids, | ||
validate=True, | ||
pairwise_func=sklearn.metrics.pairwise_distances, | ||
n_jobs=n_jobs | ||
) | ||
|
||
|
||
# ------------------------Phylogenetic----------------------- | ||
@_disallow_empty_tables | ||
@_validate_requested_cpus | ||
def unweighted_unifrac(table: BIOMV210Format, | ||
phylogeny: NewickFormat, | ||
threads: int = 1, | ||
bypass_tips: bool = False) -> skbio.DistanceMatrix: | ||
return unifrac.unweighted(str(table), str(phylogeny), threads=threads, | ||
variance_adjusted=False, bypass_tips=bypass_tips) | ||
|
||
|
||
@_disallow_empty_tables | ||
@_validate_requested_cpus | ||
def weighted_unifrac(table: BIOMV210Format, | ||
phylogeny: NewickFormat, | ||
threads: int = 1, | ||
bypass_tips: bool = False) -> skbio.DistanceMatrix: | ||
return unifrac.weighted_unnormalized(str(table), str(phylogeny), | ||
threads=threads, | ||
variance_adjusted=False, | ||
bypass_tips=bypass_tips) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personal opinion: in-lining function calls in f-strings decreases readability. No need to change, but just thought I would provide my unsolicited advice. Options:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion here, but am curious about why you feel this way. Though I agree that a clearly-named variable (e.g.
table_type
) is almost always more readable than a function call, I feel like the ability to inline clear/simple calls in fstrings reduces clutter. There is the slippery slope toward illegible inline function calls to think about, but I'm not sure that's a reason to avoid it entirely.As for example 2, I think f strings are simply more readable on a fundamental level. Put the variables where you want them. Experienced developers might be used to c-style string formatting, but that syntax is inherently more complex and harder to read regardless of whether you're using variables or function calls.