Pynteny: a Python package to perform synteny-aware, profile HMM-based searches in sequence databases

Submitting Author: Semidán Robaina (@Robaina)  
Package Name: __Pynteny: a Python package to perform synteny-aware, profile HMM-based searches in sequence databases__
One-Line Description of Package: Query sequence databases with HMMs arranged in a predefined synteny structure
Repository Link (if existing):  https://github.com/Robaina/Pynteny

---

## Description

- Include a brief paragraph describing what your package does:

Pynteny is a Python tool to search for [synteny](https://en.wikipedia.org/wiki/Synteny) blocks in (prokaryotic) sequence data using [HMMs](https://www.bioinformatics.org/wiki/Hidden_Markov_Model) of the ORFs of interest and [HMMER](http://hmmer.janelia.org/). By leveraging genomic context information, Pynteny can reduce the uncertainty that paralogs introduce into the functional annotation of unlabelled sequence data. Pynteny can be accessed (i) through the command line, (ii) as a Python module, or (iii) as a (locally served) web application.
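
To illustrate the idea of a synteny-aware search, here is a minimal conceptual sketch in plain Python. It is not Pynteny's implementation or API: the `Hit` class, the `find_synteny_blocks` function, the `max_gap` parameter, and the HMM names (`hmmA`, `hmmB`) are all hypothetical and chosen only for illustration. It assumes each ORF has already been labelled with the profile HMM that matched it (for example, from HMMER output), and it ignores details such as reversing the pattern on the opposite strand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hit:
    """One ORF on a contig, labelled with the HMM that matched it (hypothetical type)."""
    contig: str
    gene_index: int        # ordinal position of the ORF along the contig
    strand: str            # "+" or "-"
    hmm: Optional[str]     # name of the matching profile HMM, or None


def find_synteny_blocks(hits, pattern, max_gap=0):
    """Return groups of hits that follow `pattern` in order on a single contig.

    `pattern` is a list of (hmm_name, strand) tuples. `max_gap` is the largest
    number of intervening ORFs allowed between consecutive pattern members.
    """
    wanted = {name for name, _ in pattern}

    # Keep only ORFs matched by one of the HMMs in the pattern, per contig.
    by_contig = {}
    for hit in hits:
        if hit.hmm in wanted:
            by_contig.setdefault(hit.contig, []).append(hit)

    blocks = []
    n = len(pattern)
    for contig_hits in by_contig.values():
        contig_hits.sort(key=lambda h: h.gene_index)
        # Slide a window of pattern length over the ordered hits.
        for i in range(len(contig_hits) - n + 1):
            window = contig_hits[i:i + n]
            labels_ok = all((h.hmm, h.strand) == p for h, p in zip(window, pattern))
            gaps_ok = all(
                b.gene_index - a.gene_index - 1 <= max_gap
                for a, b in zip(window, window[1:])
            )
            if labels_ok and gaps_ok:
                blocks.append(window)
    return blocks


# Toy data: only contig "c1" has hmmA followed by hmmB within one intervening ORF.
example_hits = [
    Hit("c1", 0, "+", "hmmA"),
    Hit("c1", 1, "+", None),
    Hit("c1", 2, "+", "hmmB"),
    Hit("c2", 0, "+", "hmmA"),
    Hit("c2", 5, "+", "hmmB"),
    Hit("c3", 0, "+", "hmmB"),
]
example_pattern = [("hmmA", "+"), ("hmmB", "+")]
example_blocks = find_synteny_blocks(example_hits, example_pattern, max_gap=1)
```

In this toy example, a single-gene search for `hmmB` would return hits on all three contigs, whereas the synteny-constrained search keeps only the one on `c1`, which is the kind of filtering that helps separate a target gene from its paralogs.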

## Scope 

- Please indicate which [category or categories][PackageCategories] this package falls under:
	- [ ] Data retrieval
	- [x] Data extraction
	- [ ] Data munging
	- [ ] Data deposition
	- [ ] Data visualization
	- [ ] Reproducibility
	- [ ] Geospatial
	- [x] Education
	- [ ] Unsure/Other (explain below)
        
- Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:

Pynteny's main objective is to provide a means to query unannotated NGS sequence databases, such as metagenomic or metatranscriptomic datasets, using syntenic blocks (i.e., spatial arrangements of genes) rather than single target genes or protein domains. In this sense, I would classify Pynteny under Data Extraction.

Pynteny can also be used in microbiology and genetics courses. To this end, it provides a graphical web interface (a Streamlit app) to facilitate interaction. We have successfully used Pynteny in some of our microbiology courses at the University of La Laguna. Hence, I think tagging Pynteny under "Education" may be appropriate.

- Who is the target audience and what are the scientific applications of this package?  

Pynteny was designed to be used by researchers working with large, unannotated sequence databases, such as those typically encountered in metagenomic analyses. It can be accessed through a command line interface or easily integrated into pipelines as a Python package. Pynteny can also be used through a graphical interface running locally in the browser, which is more suitable for educational purposes.

- Are there other Python packages that accomplish similar things? If so, how does yours differ?

To the best of my knowledge, no other Python package provides the functionality that Pynteny offers.

- Any other questions or issues we should be aware of:

I submitted this package for publication at JOSS a few days ago. The submission is currently under consideration for scope.

**P.S.** *Have feedback/comments about our review process? Leave a comment [here][Comments]*


[PackageCategories]: https://www.pyopensci.org/contributing-guide/open-source-software-peer-review/aims-and-scope.html?highlight=data#package-categories

[Comments]: https://github.com/pyOpenSci/governance/issues/8

