
ThiagoMaria-SecurityIT/AI-Syllable-and-Text-to-Speech-Tool

AI Syllable and Text-to-Speech Tool


Welcome! This project is an educational tool designed to show how modern AI and classic programming techniques can be combined to create a useful application. It started as a simple desktop app and evolved into a powerful, web-based AI tool hosted on Hugging Face.


The application has two main functions:

  1. A Syllable Splitter: It takes an English word, splits it into syllables, and lets you hear them spoken.
  2. A Direct Text-to-Speech (TTS) Tool: It takes any English text and converts it into high-quality speech using an AI model.

This repository is perfect for learners interested in Python, GUI development, and how to use AI models in a practical project.


Index

  1. Live Demo
  2. Features
  3. How It Works: The Technology
  4. How to Run This Project
  5. Project Evolution: From Desktop App to AI Web App
  6. AI Transparency: A Note on Collaboration

Live Demo

You can try out the final, web-based version of this application live on Hugging Face Spaces:

AI Syllable and Text-to-Speech Tool Live


Features

  • Two Tools in One: A dedicated tool for syllable analysis and a general-purpose Text-to-Speech engine.
  • AI-Powered Speech: Uses a pre-trained Coqui TTS model for natural and clear audio generation.
  • Interactive Interface: Built with Gradio, the interface is user-friendly and allows for editing the syllabified text before generating audio.
  • Web-Based and Accessible: As a Hugging Face Space, the tool requires no installation and can be used by anyone with a web browser.

How It Works: The Technology

This project combines several key libraries to achieve its functionality:

  • Gradio: Used to build and host the interactive web interface. It's a fantastic Python library for creating demos for machine learning models.
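  • Pyphen: A library for splitting words at hyphenation points, using Hunspell hyphenation dictionaries. These points usually line up with syllable boundaries, although they are not guaranteed to match the phonetic syllables of every word.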
  • Coqui TTS (🐸 TTS): A powerful, open-source library for Text-to-Speech. We use one of its pre-trained English models to convert text into spoken audio.
  • PyTorch: The underlying machine learning framework that runs the Coqui TTS model.
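As a rough illustration of how the Pyphen and Coqui TTS pieces could fit together, here is a minimal sketch. It is not the project's actual code: the function names `split_syllables` and `synthesize`, the optional `hyphenator` parameter, the output path, and the model name `tts_models/en/ljspeech/tacotron2-DDC` are all assumptions chosen for the example, since this README does not say which pre-trained model is used.

```python
def split_syllables(word, hyphenator=None):
    """Split a word at Pyphen's hyphenation points (an approximation of syllables).

    `hyphenator` is a hypothetical injection point: any object with an
    `inserted(word, hyphen=...)` method works. By default a Pyphen
    dictionary for US English is created.
    """
    if hyphenator is None:
        import pyphen  # third-party: pip install pyphen
        hyphenator = pyphen.Pyphen(lang="en_US")
    # inserted() returns the word with the hyphen string placed at each break.
    marked = hyphenator.inserted(word, hyphen="-")
    return marked.split("-")


def synthesize(text, out_path="speech.wav",
               model_name="tts_models/en/ljspeech/tacotron2-DDC"):
    """Render `text` to a WAV file with a pre-trained Coqui TTS model.

    The model name is an assumption; the project may use a different one.
    The model is loaded on every call here for simplicity, so a real app
    would probably load it once and reuse it.
    """
    from TTS.api import TTS  # third-party: Coqui TTS (runs on PyTorch)
    tts = TTS(model_name=model_name)
    tts.tts_to_file(text=text, file_path=out_path)
    return out_path
```

With both libraries installed, `split_syllables("syllable")` returns a list of word pieces, and `synthesize(", ".join(pieces))` writes the spoken result to `speech.wav`. Joining the pieces with commas is just one way to make the model pause between them.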

The application is structured with a clear, two-step workflow for the syllable tool, making it easy to see the intermediate result before hearing the final audio.
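The two-step workflow described above could be wired up in Gradio roughly as follows. This is a sketch under stated assumptions, not the layout of the hosted app: `show_syllables`, `build_demo`, the labels, and the `split_fn` and `speak_fn` callables are hypothetical names introduced for illustration.

```python
def show_syllables(word, splitter):
    """Step 1 helper: format the splitter's output for an editable text box.

    `splitter` is any callable that maps a word to a list of syllables.
    """
    return " - ".join(splitter(word.strip()))


def build_demo(split_fn, speak_fn):
    """Build a two-step interface: split first, then speak the edited text.

    `split_fn(word) -> str` fills the syllable box.
    `speak_fn(text) -> str` returns the path of a generated audio file.
    """
    import gradio as gr  # third-party: pip install gradio

    with gr.Blocks(title="Syllable tool (sketch)") as demo:
        word = gr.Textbox(label="English word")
        split_btn = gr.Button("1. Split into syllables")
        # The intermediate result is editable before audio is generated.
        syllables = gr.Textbox(label="Syllables (editable)", interactive=True)
        speak_btn = gr.Button("2. Generate audio")
        audio = gr.Audio(label="Speech", type="filepath")

        split_btn.click(fn=split_fn, inputs=word, outputs=syllables)
        speak_btn.click(fn=speak_fn, inputs=syllables, outputs=audio)
    return demo
```

Calling `build_demo(split_fn, speak_fn).launch()` starts the app locally. Keeping the syllable text box between the two buttons is what lets the user correct a split before the TTS model runs.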


How to Run This Project

Since the final version is a web app, the easiest way to use it is via the Live Demo link.

However, if you wish to run the project on your own computer to experiment with the code, Hugging Face makes this very simple:

  1. Go to the project's Hugging Face Space: AI Syllable and Text-to-Speech Tool Live.
  2. Click on the three dots ( • • • ) menu icon at the top-right of the page.
  3. Select "Run locally".
  4. Follow the instructions shown there to run it on your machine. They cover the dependencies and setup for you.

Project Evolution: From Desktop App to AI Web App

This project didn't start as a web app. Its journey is a great lesson in software development:

  1. Initial Goal: Create a simple Python script to split syllables.
  2. Desktop App: We first built a desktop application using Tkinter and a basic, offline TTS engine (pyttsx3).
  3. The Limitation: We discovered that the basic Windows TTS voices struggled to pronounce isolated syllables correctly (e.g., reading "cor" as "C-O-R").
  4. The Pivot to AI: To solve this, we decided to use a modern AI-powered TTS model. We chose to host it on Hugging Face to avoid requiring users to have powerful hardware.
  5. Final Version: The project evolved into a full-fledged Gradio web application, which is more powerful, accessible, and provides much higher-quality results than the original desktop app.

This evolution shows how encountering limitations can lead to better, more modern solutions.


AI Transparency: A Note on Collaboration

This project was developed collaboratively between a human developer and Manus, an AI agent from the Manus team.
