Batching inference commit should be reverted and applied part-by-part for community adaptation !!!! #937
Hi, I'm one of the coauthors of the batching PR and was one of its top contributors.
Torch is indeed used for feature extraction, but it provides more than an 8x speedup on CPU, and even more on GPU, compared to the previous NumPy implementation, and more speedup can be squeezed out with further optimization. This is the only use of torch after the PR, so I don't understand where the claim that we moved ALL the code to torch comes from. As for an alternative optimized implementation, feel free to give it a try and open a PR; we would be more than pleased to replace torch with a better alternative.
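For context, here is a minimal sketch (my illustration, not the actual faster-whisper code) of why a torch-based extractor can beat a NumPy loop: `torch.stft` processes all frames in one vectorized call, accepts batched input, and can run on GPU. The function name and parameters below are assumptions for illustration only.

```python
import numpy as np
import torch

def log_spectrogram(audio: np.ndarray, n_fft: int = 400, hop: int = 160) -> torch.Tensor:
    """Illustrative log-spectrogram extraction; a real Whisper-style extractor
    would also apply a mel filterbank. Accepts (time,) or batched (batch, time) audio."""
    wav = torch.from_numpy(audio.astype(np.float32))
    window = torch.hann_window(n_fft)
    # One vectorized STFT over all frames (runs on GPU if wav/window are moved there)
    stft = torch.stft(wav, n_fft, hop_length=hop, window=window, return_complex=True)
    power = stft.abs() ** 2
    return torch.log10(power.clamp(min=1e-10))
```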
The slowdown is observed in low-resource environments. We are still working to pinpoint the issue and solve it, and while this is a valid use case, it does not represent the majority of the user base, and those who face issues have the option to stick to v1.0.3 for now until it's resolved.
Whisperx has been inactive for a while indeed, but several other faster-whisper contributors and I are also contributors to whisperx, are familiar with both codebases, and have the ability to maintain it, so rest assured. As for realtime usage, while it's not a direct use case of whisper, I don't see how the PR affects it, as it's totally backwards compatible (see the sketch below).
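To illustrate the backwards-compatibility point, here is a rough sketch of the usage pattern as I understand it from the batching PR; the class and argument names are my best recollection, so treat them as assumptions rather than the definitive API:

```python
from faster_whisper import WhisperModel, BatchedInferencePipeline

model = WhisperModel("large-v3")

# The pre-existing sequential API keeps working unchanged:
segments, info = model.transcribe("audio.wav")

# Batched inference is opt-in via a separate wrapper around the same model:
batched_model = BatchedInferencePipeline(model=model)
segments, info = batched_model.transcribe("audio.wav", batch_size=16)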
The main PR went through a rigorous review process over more than a month, with over 200 public comments and reviews from multiple users. The code is being refined and cleaned in several follow-up PRs such as #921 and #936. Regardless of that, you are totally free to leave a review comment with constructive feedback on any PR, or even better, open your own PR implementing any changes you see useful, or a simpler batching implementation. If the batching PR isn't worth it for your use case, as I mentioned before, you have the option to stick to an older version; faster-whisper is pretty mature, so you won't be missing out on anything 😉
Feel free to add more tests to the test suite; any contribution is welcome. In the end, this PR is the result of tens of hours of work volunteered by several people, and while this is not the first issue opened about the batching PR, it is the first one to be so disrespectful and entitled.
Let me respond to your answers and describe further how this PR ruins faster-whisper. Before that, let me say that by trying to add batching to faster-whisper, you have made it an ineffective, bloated, and harder-to-maintain project:
You have added torch as a pip dependency, and all the interfaces and other structures in the project now depend on the torch tensor structure. This bloats the code with torch. You changed the core transcription code and put torch all over it.
Even if you fix this particular issue, there will be many other cases that make faster-whisper ineffective under the current design and updates. For example, one may not need batch processing for streaming transcription.
Who are you to define the usage target of faster-whisper, realtime or not? You are not the community or an authority. Version 1.0.3 provides a decent way to use faster-whisper in this context, and you should not have any right to degrade that.
I am looking at the follow-up PRs. Your contributor team's review does not reflect a community review. Some of the changes in the PR are so big that even GitHub does not display them by default. I have left my constructive feedback on the PR and in this issue: REVERT THIS COMMIT. Then, if your code includes meaningful parts, you can commit them independently, reaching your desired state part by part (if it is worth it).
You should not be the one to redirect people to older versions based on your bad design decisions. You cannot condemn faster-whisper to a batched-inference solution, or condemn other usages to an older version. You do not have the right to do that. If you want that, do it in your whisper/whisper-x repo and leave this one clean and in quality. For any PRs and alternative solutions, this commit should be reverted first.
It is not just tests that ensure code quality and performance; it is also the design and implementation. Even if your suggestion seems viable, it is not an answer to my argument: clean code and design are needed. In the end, spending time and effort on complex, bloated stuff does not make it holy or beneficial; on the contrary, it creates a mess that is hard to maintain. I respect the high-quality, lean, and flexible design that the original faster-whisper provides. With something similar to that, you can earn the respect you deserve. Beyond that, your team's effort on this repository does not make you an authority or decision maker.
@MahmoudAshraf97 thanks for your precious time and effort adding this batching support to this project 🙏 I think that low resources and running on CPU are mostly a development case. In real products, powerful GPUs are mostly used, so when more resources are available, it is better to focus on making it faster, e.g. with a batching algorithm. There are some performance problems on CPU and in low-resource environments. @aligokalppeker I think it is better to fork this project from the previous commit, which does not include batching support, and maintain and improve your forked project. We are all trying to make open-source projects better for ourselves and others. We spend our precious time; we have to be polite and respectful to each other.
@ibrahimdevs You should give this advice to the other guys: if they want to do something so different, they can make a separate project that the community can benefit from. Your argument amounts to "ignorance is bliss", and you may be such a case. But doing a fork is the easiest way to avoid this mess. What I am trying to do is keep the project alive and true to its roots. The community will understand the merits of this conservative action sooner or later. By the way, it is doubtful that you spend much of your precious time here, as you have little to no activity on GitHub, so spare me the false allegations.
Hello, while still on the topic: can anyone help with these errors?
File "/home/dody/playsub/venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 523, in transcribe
Model was trained with pyannote.audio 0.0.1, yours is 3.3.1. Bad things might happen unless you revert pyannote.audio to 0.x. The latest commit has raised multiple module version conflicts that I had to resolve manually...
@dodysw the first issue means that VAD found no speech in the audio.
Shouldn't the library handle the situation when there's no speech in the audio, e.g. by returning early instead of crashing?
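For what it's worth, the early-return behavior being asked for could look something like this minimal sketch; it is illustrative only, with `vad` and `infer` standing in as hypothetical callables rather than real faster-whisper functions:

```python
def transcribe_with_vad(audio, vad, infer):
    """A guard for the no-speech case: return an empty segment list instead
    of crashing downstream. `vad` and `infer` stand in for the real steps."""
    speech_chunks = vad(audio)   # list of speech segments found by VAD
    if not speech_chunks:        # no speech anywhere in the audio
        return []                # empty result, no exception raised
    return infer(speech_chunks)  # run (batched) inference on the chunks
```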
Unfortunately, this is another result of the big, bulky change in the PR. More and more issues will appear, as the changes are too big to test and verify in all scenarios.
Issue #954 is also another sign of the problems with this commit.
@aligokalppeker There may be some issues with the current implementation, but there is quite clearly a PR that resolves all of them, as I stated in #954. Batched inference with the PR changes actually improved the output over sequential inference, so I think this commit and the overall approach are for the better.
Ha, the unhappy customers... Looks like my comment was prophetic. 😜 I still think that merging that PR was not a wise decision; some things there are suboptimal and breaking for some users. Anyway, no need to be rude about it; in the end you can fork it and maintain a "cleaner" repo of FW.
@Purfview I have only just seen your comment, now that you've referenced it in this issue. It summarizes the whole problem with this PR. There are too many things in one PR that can break, and even if those issues are solved, its design turns faster-whisper into "bulky-whisper". For each feature, alternative approaches can be discussed and progressed accordingly, to keep faster-whisper fast and lean. I think your "fork" suggestion is more appropriate for the authors, as you also mentioned in your comment.
I agree that this change should be reverted and that we should maintain the spirit of keeping faster-whisper lean, with minimal bloat and dependencies. In fact, one of the reasons I decided to use faster-whisper instead of whisperX was that I found whisperX bloated and full of code smell. I would rather we not push faster-whisper in the whisperX direction. Some whisperX features fill a specific need for specific kinds of users, which is great, but I don't think we should push those features down into this repo, especially since whisperX depends on it. I'm just putting my opinion here, and I do acknowledge the hard work people have put in. It's up to the repo owners to decide how to move forward; for now I'm likely sticking to v1.0.3.
What is going on in this repository? Would someone be able to explain? @MahmoudAshraf97 has closed the PR related to this issue without notification and merged some PRs without any review process. Even though there is community support for reverting the recent changes pushed in a bulk commit, the community response is being ignored with these actions.
@aligokalppeker Btw, I haven't followed much of what has been going on here in the last half year; now I'll have more time for this.
When the commit itself is inspected, it is a really messy update: it aims to improve performance but in fact pushes faster-whisper into a very bad state.
I do not know how the PR was reviewed, but it should not have been merged.
This PR will be the doom of faster-whisper, forcing everyone to remain on older versions or abandon the project (or fork it from 1.0.3).
It should be reverted and rearranged so the changes can come in incrementally.