AV Tools

A collection of CLI tools for audio and video processing.

Features

Audio transcription and diarization
Transcript formatting
Video to audio conversion
YouTube video downloader

Prerequisites

Python 3.11
FFmpeg v7 (full build), with its bin folder added to the PATH environment variable.
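The two prerequisites above can be checked before installing. The following is a minimal sketch, not part of avtools: the function name check_prerequisites is hypothetical, it assumes the requirement means exactly Python 3.11, and it only confirms that an ffmpeg executable is reachable through PATH without verifying that it is version 7 or a full build.

```python
import shutil
import sys


def check_prerequisites(expected_python=(3, 11)):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Assumption: the README's "Python 3.11" means exactly that minor version.
    if tuple(sys.version_info[:2]) != tuple(expected_python):
        problems.append(
            f"Python {expected_python[0]}.{expected_python[1]} expected, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    # shutil.which searches the PATH environment variable for the executable.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

Running the script prints one warning per missing prerequisite and nothing when both checks pass.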

Installation

To use the diarization feature, you must have a Hugging Face account and complete these steps:

Accept the pyannote/segmentation-3.0 user conditions.

Accept the pyannote/speaker-diarization-3.1 user conditions.

Create an access token at hf.co/settings/tokens.

avtools can be installed using pipx. If you don't have pipx installed, you can install it using pip (pip install pipx and python -m pipx ensurepath) or brew (brew install pipx and pipx ensurepath).

To install the package using pipx, run the following command:

pipx install git+https://github.com/jorgeandrespadilla/avtools.git

To upgrade the package, run the following command:

pipx upgrade avtools

Usage

avtools <command> [options]

For more information on the available commands, use the --help argument:

avtools --help

Transcribe

avtools transcribe -i <path_to_audio_file>.mp3 -o <path_to_output_file>.json

To use the diarization feature, add the --hf-token argument with the access token. We do not recommend using this feature for large audio files.
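When transcription is driven from a script, the token can be kept out of the source by passing it in at call time. The sketch below is illustrative only: build_transcribe_command and transcribe are hypothetical helper names, and it assumes the token follows --hf-token as a separate argument, which should be confirmed with avtools transcribe --help.

```python
import subprocess


def build_transcribe_command(audio_path, output_path, hf_token=None):
    """Assemble the argument list for `avtools transcribe`."""
    command = ["avtools", "transcribe", "-i", audio_path, "-o", output_path]
    if hf_token:
        # Diarization is only requested when a token is supplied.
        # Assumption: the token is passed as the value after --hf-token.
        command += ["--hf-token", hf_token]
    return command


def transcribe(audio_path, output_path, hf_token=None):
    """Run the command; raises CalledProcessError if avtools exits with an error."""
    subprocess.run(
        build_transcribe_command(audio_path, output_path, hf_token),
        check=True,
    )
```

For example, transcribe("interview.mp3", "interview.json", hf_token=os.environ.get("HF_TOKEN")) would read the token from an environment variable; HF_TOKEN is a name chosen for this example, not one that avtools is known to read.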

Convert Video to Audio

avtools video-audio -i <path_to_input_video_file>.mp4 -o <path_to_output_audio_file>.mp3
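The command above converts a single file. To convert a whole folder, one possible approach is to generate one command per video. This is a sketch under assumptions: plan_conversions is a hypothetical helper, and it only considers files ending in .mp4, as in the example above.

```python
from pathlib import Path


def plan_conversions(folder):
    """Return one `avtools video-audio` argument list per .mp4 file in folder."""
    commands = []
    for video in sorted(Path(folder).glob("*.mp4")):
        # Keep the file name and swap the extension for the audio output.
        audio = video.with_suffix(".mp3")
        commands.append(
            ["avtools", "video-audio", "-i", str(video), "-o", str(audio)]
        )
    return commands
```

Each returned list can then be passed to subprocess.run(command, check=True) to perform the conversion.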

Convert Transcripts to Different Formats

This command is only available for transcripts generated in JSON format.

To convert a JSON transcript to a subtitle file or plain text file, use the following command:

avtools format -i <path_to_input_json_file>.json -o <path_to_output_file>.srt

Supported output formats:

srt
txt
vtt
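A script that calls the format command can reject unsupported targets before invoking it. The following is a rough sketch: build_format_command is a hypothetical helper, and it assumes the output format is chosen from the output file's extension, as the .srt example above suggests.

```python
from pathlib import Path

# Output formats listed in this README.
SUPPORTED_FORMATS = {"srt", "txt", "vtt"}


def build_format_command(transcript_path, output_path):
    """Assemble `avtools format` arguments after checking file extensions."""
    if Path(transcript_path).suffix.lower() != ".json":
        raise ValueError("input transcript must be a .json file")
    # Assumption: the output format is determined by the output extension.
    extension = Path(output_path).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {extension!r}")
    return ["avtools", "format", "-i", str(transcript_path), "-o", str(output_path)]
```

The function raises ValueError for anything other than a .json input and an srt, txt, or vtt output, so mistakes surface before the CLI is started.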

Download YouTube Video

avtools youtube-download -u <youtube_video_url> -o <path_to_output_file>.mp4

To download the video transcript, add the --transcript argument with the language code (e.g. en for English).

avtools youtube-download -u <youtube_video_url> -o <path_to_output_file>.mp4 --transcript=<language_code>

Experimental Features

How to use Flash-Attention with the avtools transcribe command?

Install it via pipx runpip avtools install flash-attn --no-build-isolation.

We only recommend using Flash-Attention if your GPU supports it.
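One way to check GPU support is to read the CUDA compute capability through PyTorch. The sketch below rests on an assumption that is not stated in this README: that flash-attn targets NVIDIA GPUs of the Ampere generation or newer (compute capability 8.0 and above). Consult the flash-attn project for its current requirements. The function names are hypothetical, and the script should be run with the same Python environment that avtools is installed in so that torch can be imported.

```python
def supports_flash_attention(capability):
    """True if a (major, minor) CUDA compute capability is 8.0 or higher."""
    # Assumption: Ampere (8.0) or newer is required; verify with flash-attn docs.
    return tuple(capability) >= (8, 0)


def detect_capability():
    """Return the first GPU's compute capability, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    capability = detect_capability()
    if capability is None:
        print("No CUDA GPU detected; skipping flash-attn is the safer choice.")
    elif supports_flash_attention(capability):
        print(f"Compute capability {capability}: flash-attn may be worth trying.")
    else:
        print(f"Compute capability {capability}: flash-attn is not recommended.")
```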

Contributing

Development

See the CONTRIBUTING.md file for more information on how to contribute to this project.

Additional Information

The following resources may be helpful when solving issues related to PyTorch package installation with Poetry:

Because of the way PyTorch is built, the source URLs have to be hard-coded in the pyproject.toml file to avoid installation issues. To support new Python versions, more URLs need to be added for the torch packages. This is a workaround to avoid issues when working with private repositories.

License

This project is licensed under the MIT License - see the LICENSE file for details.
