AllTalk v2 Download Details & Discussion #245
Replies: 105 comments 314 replies
-
Installing and testing now. Amazing job! Will post results/comments soon.
-
All good here (Windows 10). Thank you for your hard work. It looks great! Love being able to easily switch between finetuned XTTS models.
-
Thanks for your work. Testing it on RunPod Ubuntu. Installation worked fine, but running it produces a "Running in Docker" message, which is strange as I don't have Docker installed. After manually editing script.py I got it to work. Unfortunately I can't get DeepSpeed to work, even though it says it is installed. I also can't get the Gradio UI to work, since RunPod creates a Cloudflare Tunnel and, as far as I know, there is no way to specify a custom API/Gradio domain during the AllTalk setup. I still hope a future version of AllTalk can integrate nicely with running the application in the cloud (RunPod, Colab, etc.).
-
Hey, thanks for the beta! One issue I noticed is not being able to access the Gradio page from other devices on the network. The API page and the TTS generator page are accessible using 192.168... but Gradio is not; not sure if this is an issue on my end. Everything is accessible from the host computer on 127.0 etc.
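A quick way to narrow this kind of problem down is to test, from the other device, whether each port accepts a TCP connection at all. The snippet below is only a sketch: the port numbers 7851 and 7852 are the ones quoted elsewhere in this thread, the mapping of names to ports is an assumption, and the LAN address in the usage note is a made-up placeholder.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host: str, ports: dict) -> dict:
    """Map each label in ports to whether its port is reachable on host."""
    return {name: port_is_reachable(host, port) for name, port in ports.items()}
```

Calling `check_ports("192.168.1.50", {"api": 7851, "gradio": 7852})` from a second machine (with the address replaced by the host's real LAN address) shows whether only one of the two is blocked. If a port is reachable on 127.0.0.1 but not on the LAN address, the service is probably bound to localhost only or a firewall is in the way; this check cannot tell those two apart.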
-
Hi, thank you for the great work. I was trying to create a dataset for the Arabic language; however, I'm getting an error. If I switch the language to English (but still use Arabic audio files), it works fine, generates the wavs correctly, and actually translates the sentences in metadata_train & metadata_eval to English. So it understands the language fine, but there seems to be an issue with file/sentence generation in native Arabic.
-
I ran into a couple of issues running the beta as an Oobabooga plugin. The first is that on first startup, it looked for a file in the wrong location for my setup. Editing the script_path variable with the correct path fixed the issue on the next startup. The second issue is trickier and I haven't been able to figure it out. Despite seemingly having all the requirements installed correctly, the app complains that there is a missing Gradio "system" module.
Note that I've tried installing the requirements first through atsetup.bat, and subsequently by another route. I also ran the diagnostics.
(Another side note: the diagnostics message says "you should see 'text-generation-webui' listed in the path of the above folders." That is no longer correct, because dashes in the folder name now cause AllTalk to throw an error on startup. Could be confusing to some users.) Let me know if you'd like me to try anything specific to troubleshoot.
-
From my experience with v1, and looking at the screenshot of v2, this is going to be phenomenal. So glad you're doing all of this. Thank you! |
-
Hello, the second version is really great. By the way, is it possible for you to make it serve multiple clients simultaneously rather than sequentially as the requests come in? In other words, can it be asynchronous? That assumes there are enough resources, of course, but could it even be done through Docker? Thanks.
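To make the sequential-versus-simultaneous distinction concrete, here is a minimal asyncio sketch. It does not touch AllTalk at all: `fake_synthesize` is a hypothetical stand-in that only sleeps, and `max_parallel` is an invented knob showing how a server could cap concurrency to match available resources.

```python
import asyncio


async def fake_synthesize(text: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a TTS call; it sleeps and returns a label."""
    await asyncio.sleep(delay)
    return f"audio-for:{text}"


async def serve_concurrently(texts: list, max_parallel: int = 2) -> list:
    """Process all requests, running at most max_parallel at the same time."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def worker(text: str) -> str:
        # Each request waits for a free slot, then runs alongside the others.
        async with semaphore:
            return await fake_synthesize(text)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(worker(t) for t in texts))
```

Running `asyncio.run(serve_concurrently(["a", "b", "c"]))` handles two requests at once and the third as soon as a slot frees up. Whether a real TTS engine benefits from this depends on GPU memory and on whether the model itself can run requests in parallel, which this sketch does not model.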
-
Hey, V2 is great. Here are some suggestions to further streamline the user interface. The contents of the AllTalk v2 Beta, Generate Help, API Endpoints & Dev, and About This Project tabs should all be moved under the Documentation and Help section. TTS-generation settings can also be shifted under Global Settings. Please consider. |
-
Hey, great work on the project. I've been using v1 for a few days now and have started moving to v2 with my project totally-real-news-bot so I can use RVC models. I'm using AllTalk v2 in TGWUI mode, via the API with Piper TTS. That works great; generation is multiple times faster than Coqui (my PC build is Chinese e-waste). But when I use RVC, it seems to start the conversion process and finds a .pth model, VRAM usage goes up, and then my CPU shoots to 100% like it's processing something. Is it possible RVC conversion is in CPU mode, or could my setup be incorrectly configured?
-
I tested the project and it's amazing! It works perfectly with English.
-
I've been using v2 for a while now and it's fantastic. I use the standalone version, usually with SillyTavern. I did have some difficulty getting the SillyTavern settings to work. Something about having so many voice/narrator dropdowns and having to match them with selections in the webui, it's confusing to me. It would be great to have some way to save some presets - for example, when selecting preset A, it populates alltalk character, narrator, and rvc character, narrator. Can't wait for the large generator to be added to the main webui. That is my main wish, together with being able to import .txt files into it or, better yet, process an entire folder of .txt files. Wow!! Having RVC applied at the same time is so much less hassle than exporting .wav, then running it through RVC webui manually... Last thing: are you aware of any XTTS2 finetunes for accents or gender (in English)? I've googled a lot and been to websites that claim to host tons of models but have found only a handful of XTTS2 finetunes, and not a single interesting one. Thx! |
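Until something like that exists in the webui, batch-processing a folder of .txt files can be scripted from outside. The following is a rough sketch under stated assumptions: `generate` is a hypothetical callable you would supply yourself (for example a function that posts one chunk to the TTS API), and the 500-character chunk limit is an arbitrary illustrative default, not an AllTalk setting.

```python
from pathlib import Path
from typing import Callable, Iterator


def split_into_chunks(text: str, max_chars: int = 500) -> Iterator[str]:
    """Yield whitespace-delimited pieces of text.

    Each piece stays within max_chars unless a single word is longer,
    in which case that word is emitted on its own.
    """
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            yield current
            current = word
        else:
            current = candidate
    if current:
        yield current


def process_folder(folder: Path, generate: Callable[[str, str], None],
                   max_chars: int = 500) -> int:
    """Call generate(chunk, output_name) per chunk of every .txt file.

    Returns the total number of chunks handed to generate.
    """
    count = 0
    for txt_file in sorted(folder.glob("*.txt")):
        text = txt_file.read_text(encoding="utf-8")
        for index, chunk in enumerate(split_into_chunks(text, max_chars)):
            # Output names look like "chapter1_0000", "chapter1_0001", ...
            generate(chunk, f"{txt_file.stem}_{index:04d}")
            count += 1
    return count
```

Passing `generate=lambda chunk, name: print(name, len(chunk))` is a harmless dry run that shows how the files would be split before any audio is generated.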
-
Would you be looking to add MeloTTS to v2 at some point? Seems like one of the better (and faster) TTS models that you can also train locally. |
-
Can't get it to work :(
-
@erew123 - Do you have any ideas on that issue I might have mentioned, where AllTalk continues generating the output file indefinitely? The line of text I fed it will typically be the last thing in the console, with no total time given afterward and no auto-playback of the newly generated file(s), but it will keep making them back to back, with the GPU hanging at 98% or so (which is normal when it's making one). It will even start making them again if I change models or refresh, and I don't believe it will accept new text. The only way out is to CTRL-C and restart AllTalk. It seems to occur when using the 7852 interface; I used only 7851 last night and that kept it from happening. BTW, this is all XTTS engine stuff. I'd file it with the issues, but if you or anyone else hasn't hit it, I'll just figure it's one of the many joys of running on a 2GB GT 1030 card. ;-)
-
I'd like to know how to change the currently set TTS engine outside of the webui, because I'm getting a persistent crash caused by Parler. Once Parler is set in the webui, AllTalk beta crashes, and upon restarting, AllTalk still thinks it's supposed to run Parler due to some persistent data. Where is that config file, so I can find a way to edit the currently set engine? It seems silly to have to reinstall just because of a config file, unless it's written in assembly language and not plain text. I looked in the obvious file called config in the main directory, but there doesn't seem to be any line in it that saves that setting. I asked this same question before but can't find it, so I'm unable to respond if there was a thread, sorry. I followed the force reinstall instructions and that was a start, but I'd rather have the config file name so I can dig it up. I get a traceback when I try to start AllTalk beta on Linux after trying to change to Parler in the webui, which crashed it and now prevents AllTalk beta from launching.
So, long story short, changing engines to Parler breaks AllTalk beta, unless there's some human-readable config file to change the currently set engine.
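One generic way to hunt for such a file is to scan the install folder for JSON files whose values mention the engine name. This is a sketch, not AllTalk's own tooling: it assumes the setting is stored as plain-text JSON somewhere under the install directory, which this thread does not confirm, and the function names are invented for the example.

```python
import json
from pathlib import Path


def walk(node, prefix=""):
    """Yield (key_path, value) for every leaf value in parsed JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, f"{prefix}[{index}]")
    else:
        yield prefix, node


def find_json_mentions(root: Path, needle: str) -> list:
    """Return (file, key_path) pairs whose string value contains needle.

    The match is case-insensitive; unreadable or invalid files are skipped.
    """
    hits = []
    for path in sorted(root.rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue
        for key_path, value in walk(data):
            if isinstance(value, str) and needle.lower() in value.lower():
                hits.append((path, key_path))
    return hits
```

Running `find_json_mentions(Path("alltalk_tts"), "parler")` (with the folder name adjusted to your install) lists candidate files and keys. Back up any file before editing it by hand, since the search only finds mentions and cannot tell which one the application actually reads at startup.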
-
I just tried installing this clean (removed the old alltalk folder) and got: "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts." I updated pip and re-ran the requirements update (a third time) but got the same message. Any hints on what to do would be appreciated. Thanks!
-
At one point I was presented with the option to install TTS engines, but I skipped it thinking I already had the model. I didn't. So I re-ran AllTalk setup hoping to be presented with the option again, but it didn't happen. I then ran AllTalk to see if I would get the option there, but I only get a startup message and it doesn't continue; it just "keeps trying" for 240 seconds, then times out and exits.
So now I'm in a similar situation: I selected XTTS in the Gradio webui and now it thinks it is supposed to load that, but it doesn't exist, so nothing happens. After this user error, neither Gradio nor the webui works anymore, because AllTalk times out before any webui comes online. How do I get the screen that asks which engines I want to install? I think it only shows on first run and not again. Is that true?
Perhaps, like the Automatic1111 Stable Diffusion webui, there ought to be an editable script file that coordinates critical options, because if AllTalk has an error the webui goes into an unresponsive zombie state. After that first run, the webui is no longer available to undo changes, making some configuration changes potentially fatal to the entire installation. There really should be no tweak a user can make in the webui that breaks the program permanently. Maybe, as a temporary fix, when there is an engine error a toggle could put AllTalk into a first-run state again? Or at least, when AllTalk setup is run, it could delete the persistent config data and place AllTalk back into a first-run state, because at present running setup does not affect user configurations made earlier.
More wit's-end stuff: I had a copy of the XTTS model in AllTalk version 1. "Score", I thought. So I copied that into the models directory, after creating a folder called xtts of course. Boom! Except anticlimax, because the XTTS version in old AllTalk is xtts2_2.0.2 and AllTalk v2 won't load because it wants xtts2_2.0.3 (Hugging Face doesn't seem to have 2.0.3). But I'm so clever: I renamed the 2.0.2 folder to 2.0.3 and Gradio loaded! It loads into CUDA but, womp womp, generating "hello" never happens; it just runs the timer forever.
OK, got XTTS working this way: I renamed the XTTS model's folder to 2.0.3, and then I could open Gradio again. I then changed engines to Piper, which of course was not found, but this freed up XTTS long enough for me to go back into the alltalk/model/xtts folder and rename the folder back to its proper name, 2.0.2. Then I loaded the XTTS engine, got the "xtts_2.0.3 not found" error, and at some point I was able to find 2.0.2 listed in the models and loaded it. Now AllTalk v2 generates "hello". I will resist every urge to try another engine, lest AllTalk get stuck in a non-working state until reinstallation is needed.
-
Quick question to people who have standalone AllTalk v2 working: did you downgrade transformers to version 4.40.2 for XTTS streaming support? The instructions for standalone don't specify what to do at this point in setup, or which option is most common or most recommended. Thanks
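For anyone comparing notes, the version actually installed in the active Python environment can be read with the standard library. This sketch only reports and compares; it does not install or downgrade anything, the helper names are made up for illustration, and 4.40.2 is simply the version named in the question above.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed: Optional[str], pinned: str) -> bool:
    """True if installed equals pinned, ignoring a local suffix such as '+cu121'."""
    return installed is not None and installed.split("+")[0] == pinned
```

`matches_pin(installed_version("transformers"), "4.40.2")` then answers the question for the environment it is run in. It has to be run inside the same Python environment that AllTalk uses, otherwise it reports on a different install.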
-
To anyone here, Maybe a dumb question, but is there definitive info anywhere on which of the available XTTS models is good (or bad) at what? It seems like every time I try to clone a voice, I forget which one I got good results with and end up having to do a bunch of model switching and listening tests. -Thanks! |
-
Hi @erew123, this is a really fun and good new TTS engine. Can it be added to AllTalk? https://github.com/SWivid/F5-TTS Let's goo! F5-TTS 🔊 uses a diffusion-based architecture.
-
Hello, I have a question: sometimes with the XTTS model the voiceover says something completely different at the end of the text, some random words that are not there at all. Is there an option I need to adjust to eliminate this completely, and should I decrease it or increase it?
-
I'm having an issue with finetuning, and I really could use some help! I do the entire process, but after I compress and move the files, the "wavs" folder is completely empty. I never get any errors or anything up to that point, but I simply can't get any audio files to show up in that folder. I have tried different voices and files, yet this always happens. |
-
@erew123 Great job incorporating f5tts! I think it would be nice if we could have whisper extract the text from the reference wave file and populate the "reference text" box. Then just make adjustments to the text as necessary. |
-
Dear all (whomever this ends up getting sent to; I have no idea with GitHub),
AllTalk v2 is up (still in the BETA area, so same/normal instructions). Install: https://github.com/erew123/alltalk_tts/wiki/Install-%E2%80%90-Standalone-Installation
REINSTALL THE REQUIREMENTS with ATSETUP (it will tell you off if you don't anyway). It has not been fully validated on every setup.
There may be a bug here or there, but there's a lot of extra error catching and highly detailed debugging options. There is still work to do tidying up some code in the TTS engines, though XTTS now processes and stores latents. Its integration code was the only one I rewrote, as a template for the others; this is part of the push to make it easier for other TTS engines to be added here, but I still have to finish writing a WIKI document to explain the other bits... no damn time in the day! There is a fallback if someone ever needs to downgrade to the last beta version but still be able to upgrade in future. There are some new features, and there is EXTENSIVE newly written help on every page, so please read the help or WIKI before asking me questions. The interface is cleaned up. This has been ??? hundreds of hours of work, so please be gentle when telling me you have found a bug or that I haven't included X feature yet. I'm quite tired from this (and from dealing with my unwell family situation) and will do a few more bits yet, but also need a little break here and there. I'll leave this here in case anyone feels generous: 💖 Sponsor this Project on Ko-fi
Otherwise, enjoy! Thanks
-
Hi, |
-
@erew123 Hello, can I somehow use .pth and .index files to create text to audio? |
-
I know that there's a known issues section, and the issue I'm having is in there: it's the crypt_E_no_revocation_check issue. However, after trying to update my PC, checking my firewall settings and internet connectivity, and updating my root certificates as a last-ditch attempt, I have had no luck getting this issue to resolve. I know my system clock is correct for my timezone because everything matches my phone. I wouldn't even bother trying to install DeepSpeed, since I plan to use F5-TTS, but it seems AllTalk won't create the start batch file, so I can't open the webui without getting past this step. I can't even check whether installing DeepSpeed manually would help, as the only versions on the manual download page for DeepSpeed don't match my CUDA version, 12.3. I did notice this line towards the beginning of the command terminal output after attempting to reinstall the requirements: The following packages will be SUPERSEDED by a higher-priority channel: certifi conda-forge/noarch::certifi-2024.12.1~ --> pkgs/main/win-64::certifi-2024.12.14-py311haa95532_0
-
AllTalk v2 is out of BETA (November 24th 2024)
(as far as I am concerned)
Announcement here: #245 (comment)
Please be aware that I am STILL dealing with what I will call "an urgent family matter". Unfortunately I have very unwell family members and I am supporting/caring for them. I will let people's imaginations go wherever to figure out what I mean by that. As such, I will intermittently be 100% unavailable to do anything here on GitHub/code/provide support. This situation may be ongoing for me for months to come.
Please also be clear, I am 1x person, not a company, business etc. There isn't a team here to support/deal with everything.
So, please read the built-in documentation that's now inside AllTalk; I'm sure it will cover 95% of questions. Also please read the extensive WIKI here for more in-depth answers and for topics not covered in the built-in documentation.
Please have fun with AllTalk.
Thanks
Erew123
💖 Sponsor this Project on Ko-fi