tl;dr:
Click any launch binder badge on this page to run command line-based PatMatch inside your browser.
Alternatively, use CyVerse with the launch vice badge under Running PatMatch on CyVerse/VICE below.
PatMatch: a program for finding patterns in peptide and nucleotide sequences.
This repository is for running Perl-based PatMatch in the Jupyter environment provided by MyBinder.org.
Additionally, having PatMatch working inside the Jupyter environment with interactive Python adds some convenient features, which the included notebooks illustrate. A utility script for moving command line-based PatMatch results into Python is also demonstrated.
The PatMatch software is already installed in each active session launched from this repo. The software is also available directly from the authors at ftp://ftp.arabidopsis.org/home/tair/Software/Patmatch/.
The PatMatch software is described in this scientific article.
Users should cite:
PatMatch: a program for finding patterns in peptide and nucleotide sequences.
Yan T, Yoo D, Berardini TZ, Mueller LA, Weems DC, Weng S, Cherry JM, Rhee SY.
Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W262-6. PMID: 15980466
The copyright information for the software is available in the patmatch_1.2 directory of actively running binders.
Clarifying Software Attribution: I, Wayne, am not involved in the PatMatch software at all. Those listed above are the developers and distributors of PatMatch. See their materials. I simply set up this repository to make the software usable on the command line without installation headaches.
I, Wayne, did code a Python-based utility for use with the results from command line PatMatch; it is available here and is used in the notebooks in this repository to process the results and convert them easily to other forms.
This repository is set up to allow running the command line version of this software after pressing the launch binder button above or below. Alternatively, use the launch vice badge under Running PatMatch on CyVerse/VICE below.
The target use case is when you want to run a pattern search on sequences not available via the web-based PatMatch offerings (see below) or you want to integrate the results of PatMatch with Python code easily. Refer to this page for PatMatch's 'Supported Pattern Syntax/Examples/Limits' information.
In the notebooks that can be launched, I have added examples illustrating how to use the program, process the results easily with Python, and convert them to other forms.
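To make concrete what a degenerate nucleotide pattern search involves, here is a minimal Python sketch. It is not PatMatch, does not use PatMatch's pattern syntax or output format, and is not the utility script mentioned above. It only assumes the standard IUPAC nucleotide ambiguity codes and translates them into a regular expression. The function names `iupac_to_regex` and `find_matches` are hypothetical, chosen for illustration. It searches the forward strand only and reports non-overlapping hits.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Translate a degenerate nucleotide pattern into a compiled regex.

    Hypothetical helper for illustration; raises KeyError on unknown symbols.
    """
    return re.compile("".join(IUPAC[base] for base in pattern.upper()))


def find_matches(pattern, sequence):
    """Return (start, end, matched_text) for each non-overlapping hit.

    Coordinates are 1-based and inclusive; only the given strand is searched.
    """
    regex = iupac_to_regex(pattern)
    return [
        (m.start() + 1, m.end(), m.group())
        for m in regex.finditer(sequence.upper())
    ]


# Toy sequence made up for this example.
hits = find_matches("GATNNC", "ttGATACCggGATTTCaa")
for start, end, text in hits:
    print(start, end, text)
```

For real analyses, PatMatch itself handles much more than this sketch (for example, peptide patterns and the limits described on the pattern syntax page), so treat this only as an illustration of the underlying idea.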
There are a number of places where web-served versions of the software are already set up to work with specific genomes:
Pattern Matching with PatMatch web-based tools
- S. cerevisiae genomes
- Human genome
- Arabidopsis genome
- S. lycopersicum
- Aspergillus genomes
- Candida genome
- E. coli genome
- B. subtilis genome
- Pseudomonas genome
Related web-based offerings:
- fuzzpro/fuzznuc EMBOSS PROSITE-style pattern matches in complete Genomes DataBase
- RSAT - genome-scale dna-pattern
Related command line-based software offerings:
- seqkit allows for degeneracy in matches, as illustrated here.
The VICE (Visual Interactive Computing Environment) in the CyVerse Discovery Environment enables users to run Jupyter Lab in a more powerful and more persistent environment. To use VICE, sign up for a free NSF-supported CyVerse account.
The VICE environment supports large-scale analyses. Users can upload and download files, and save and share results of their analyses in their user accounts.
Follow these steps to run Jupyter Lab on VICE.
If you are interested in the accompanying tutorial-like notebooks that you see in the sessions launched from here, the code for them is available outside of the active Jupyter session in this repository here. In an effort to streamline development, they have been moved there. Importantly, that allows the version of the PatMatch software to remain 'locked' in the CyVerse app, yet allows me to improve the accompanying material. This also means that the MyBinder-served core image doesn't need rebuilding often.
This repository is set up to make use of the binder service offered by MyBinder.org. See their site for more information about Binder.
Unlike with circos, which was downloadable via MyBinder.org, I cannot simply retrieve the software from the FTP server via curl while building the image, because MyBinder.org doesn't allow FTP connections. Instead, I added it to the repo. It then gets unpacked and compiled by running postBuild as the last part of building the image from the repository; the built image is stored by MyBinder, and a session launched from that image will already have PatMatch available. The accompanying materials are retrieved when the session starts, as directed in the start script for the Binder system or in the entry script for the CyVerse VICE app.
Click the button below to begin using PatMatch (or see the CyVerse section if you prefer to use it there):