diff --git a/README.md b/README.md
index 49c7a95..08e7761 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,20 @@
-rfasta: Rust-powered protein FASTA parser library and CLI.
+# rfasta: Rust-powered protein FASTA parser library and CLI.
 
-rfasta is designed for bioinformaticians and protein scientists who need a fast,
+**rfasta** is designed for bioinformaticians and protein scientists who need a fast,
 reliable tool for parsing, cleaning, and manipulating protein sequence FASTA files.
 rfasta is a direct port of the python package [protfasta](https://github.com/holehouse-lab/protfasta) into rust.
 This greatly improves the performance of parsing, cleaning, and manipulation of LARGE protein sequence fasta files such as those from uniref.
+
+
+## Changelog
+
+# v0.1.0-beta (Initial Release)
+# Initial beta release of rfasta.
+- Core functionality for:
+  - Parsing: Read and interpret protein FASTA files efficiently.
+  - Cleaning: Remove invalid entries and ensure sequences conform to biological standards.
+  - Manipulation: Efficient fasta sharding operations on large protein sequence fasta files.
+- Rust CLI integration for command-line use cases.
+- Python bindings via PyO3 for seamless Python library integration.
+- High performance with optimized parsing for large-scale FASTA files (e.g., UniRef datasets).
+- Early-stage development—additional features, documentation, and pypi deployment to follow in subsequent releases.