Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moving karyotype to anoph #702

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

Moving karyotype to anoph #702

wants to merge 4 commits into from

Conversation

jonbrenas
Copy link
Collaborator

Addresses #698.

The move is done and all tests pass locally but more tests need to be added (for funestus, for example).

@jonbrenas jonbrenas marked this pull request as draft December 11, 2024 14:52
@jonbrenas
Copy link
Collaborator Author

I added tests to test_ag3.py and test_af1.py to check that errors were indeed raised when expected. I am not really satisfied yet, though.

@jonbrenas jonbrenas marked this pull request as ready for review December 11, 2024 16:51
Copy link
Member

@alimanfoo alimanfoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jonbrenas, thanks for this, a few suggestions...

import allel # type: ignore

from numpydoc_decorator import doc
from ..util import check_types, _karyotype_tags_n_alt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If _karyotype_tags_n_alt is only used in this module, perhaps it should be moved here.

from .karyotype_params import inversion_param


class AnophelesKaryotypeData(AnophelesSnpData):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class AnophelesKaryotypeData(AnophelesSnpData):
class AnophelesKaryotypeAnalysis(AnophelesSnpData):

Nit, suggest slight naming change, then will hopefully also make sense if you later on add in here some functions for calculating inversion allele frequencies.

Comment on lines +26 to +27
# If provided, this analysis version will override the
# default value provided in the release configuration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# If provided, this analysis version will override the
# default value provided in the release configuration.

Cruft.


@check_types
@doc(
summary="Load tag SNPs for a given inversion in Ag.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
summary="Load tag SNPs for a given inversion in Ag.",
summary="Load tag SNPs for a given inversion.",

Comment on lines +39 to +42
if not self._inversion_tag_path:
raise FileNotFoundError(
"The file containing the inversion tags is missing."
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileNotFoundError isn't quite the right exception class here.

Suggested change
if not self._inversion_tag_path:
raise FileNotFoundError(
"The file containing the inversion tags is missing."
)
if self._inversion_tag_path is None:
raise NotImplementedError(
"No inversion tags are available for this data resource."
)


@check_types
@doc(
summary="Infer karyotype from tag SNPs for a given inversion in Ag.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
summary="Infer karyotype from tag SNPs for a given inversion in Ag.",
summary="Infer karyotype from tag SNPs for a given inversion.",

Comment on lines +7 to +10
inversion_param: TypeAlias = Annotated[
Literal["2La", "2Rb", "2Rc_gam", "2Rc_col", "2Rd", "2Rj"],
"Name of inversion to infer karyotype for.",
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values will only be valid for Ag. Af would have different possible values. Suggest broadening here to just a string.

Suggested change
inversion_param: TypeAlias = Annotated[
Literal["2La", "2Rb", "2Rc_gam", "2Rc_col", "2Rd", "2Rj"],
"Name of inversion to infer karyotype for.",
]
inversion_param: TypeAlias = Annotated[
str,
"Name of inversion to infer karyotype for.",
]

Comment on lines +87 to +98
if inversion == "X_x":
with pytest.raises(TypeError):
af1.karyotype(
inversion=inversion, sample_sets="AG1000G-GH", sample_query=None
)
else:
with pytest.raises(FileNotFoundError):
af1.karyotype(
inversion=inversion,
sample_sets="1229-VO-GH-DADZIE-VMF00095",
sample_query=None,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All inversion parameter values should fail with NotImplementedError I think.

assert all(df["karyotype_2La"].isin([0, 1, 2]))
assert all(df["karyotype_2La_mean"].between(0, 2))
if inversion == "X_x":
with pytest.raises(TypeError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a ValueError I think.

@alimanfoo
Copy link
Member

The other question here is if/how to get unit tests using the simulated data. It could be tricky because the simulated data is not guaranteed to generate data at the tag SNP positions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants