Python benchmark script #1298
Conversation
Example output
If anyone has notes on clean-up or what is required, let me know! @bobqianic @ggerganov
I ran the examples from the same samples/jfk.wav obtained from this script on an Apple M2 MAX with 32GB RAM. I was mostly interested in the relative timings to see if the models lined up in the same ranking order as the M1 results above.
I'll look into that, thanks.
Because medium and medium.en are very close in timings, I found that for some longer files their relative speeds interchange but the lambda function only appears to sort them if medium is faster than medium.en; if it's slower, it doesn't sort them! Unfortunately - my bad, sorry - adding the float() doesn't fix the problem either, so it's got something to do with how the lambda works. If I use samples/gb0.wav for example, this is what I get on my M2 Max (38 core) 32GB:
Actually, it's a typo in the field name/key: the lambda needs to reference the correct field.
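To make the fix concrete, here is a minimal sketch of the failure mode being discussed. The field names and values are illustrative, not the script's actual keys; the point is that the sort lambda must reference the intended timing field and convert it to float, since timings parsed from text compare lexicographically otherwise:

```python
# Illustrative only: the real script's field names may differ.
results = [
    {"model": "medium",    "total_time": "1185.64"},
    {"model": "medium.en", "total_time": "997.32"},
]

# Referencing the wrong key (or comparing raw strings, where
# "1185.64" < "997.32" lexicographically) yields the wrong order
# for close timings; converting the correct field to float sorts
# numerically as intended.
results.sort(key=lambda r: float(r["total_time"]))
print([r["model"] for r in results])  # ['medium.en', 'medium']
```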
@pudepiedj nice catch, fixed. @ggerganov is the root of the project appropriate for this? It feels like it should go elsewhere, but I don't want to confuse the existing benchmarking implementation.
Overall, it's pretty good. I'm going to test it now. If there are no issues, we can merge this PR immediately.
I suddenly thought that actually, you can create your own Python package and then upload it to
I added the option to specify the number of threads, processors, and the sample file as command-line parameters instead of hard-coding them, if that's of any interest. No harm if not.
All my experiments suggest it's fastest with
This holds true for OpenBLAS as well.
Yes, those threads are
@pudepiedj I like the argument input, but it would need to handle lists; a lot of the reason I added those was for testing different processor and thread counts across models in the same run.
Maybe for now we just add the file input?
Obviously it's your call. It's hard to set a universal default for the file entry that will suit all local file structures; that's the only downside, so maybe we just stick with your original plan. I entirely see your point about being able to test lists of values, and I don't think it's possible to enter a list as a command-line parameter (is it?), although there's almost certainly a workaround if it were deemed desirable.
I've never done it, and I haven't rigorously checked it for all eventualities, but Python is amazing. It's not very pretty, but with a bit of help from GPT-3.5-turbo this allows either a list (-l) or a single number of threads (-t) (lists taking precedence) to be entered from the command line in the form
Here's the additional code
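A minimal sketch of how such handling could work, assuming argparse; the `-t` and `-l` flags follow the description above, but the flag spellings, defaults, and loop body are illustrative rather than the exact original snippet:

```python
import argparse

parser = argparse.ArgumentParser(description="Benchmark thread options")
parser.add_argument("-t", "--threads", type=int, default=4,
                    help="single number of threads")
parser.add_argument("-l", "--thread-list", type=str, default=None,
                    help="comma-separated list of thread counts, e.g. 1,2,4,8")
args = parser.parse_args()

# A list given via -l takes precedence over a single -t value.
if args.thread_list:
    thread_counts = [int(t) for t in args.thread_list.split(",")]
else:
    thread_counts = [args.threads]

for n_threads in thread_counts:
    print(f"running benchmark with {n_threads} threads")
```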
Sample output:
@bobqianic ready for review!
Both good points.
I don't have the quantized models in the list for this reason. It would be nice to add them in once they are functioning, but it's simple enough, and I think cleaner for now, to leave them out.
Yes, that's perfectly reasonable. I was just trying to anticipate the situation where someone adds their own edit and wonders whether the problem is theirs or in the code (because I often get errors and am not sure whether it is "me" or "it", given the vagaries of installation).
* Create bench.py
* Various benchmark results
* Update benchmark script with hardware name, and file checks
* Remove old benchmark results
* Add git shorthash
* Round to 2 digits on calculated floats
* Fix the header reference when sorting results
* Fix order of models
* Parse file name
* Simplify filecheck
* Improve run print statement
* Use simplified model name
* Update benchmark_results.csv
* Process single or lists of processors and threads
* Ignore benchmark results, don't check in
* Move bench.py to extra folder
* Readme section on how to use
* Move command to correct location
* Use separate list for models that exist
* Handle subprocess error in git short hash check
* Fix filtered models list initialization
Is there benchmarking of memory usage?
A simple benchmarking script written in Python: it runs the compiled ./main with various settings, parses the results from the output, and saves them as CSV.
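For context, a hypothetical sketch of that approach; the model names, CLI flags, and the parsed "total time" line are assumptions for illustration, not the script's exact implementation:

```python
import csv
import subprocess

# Run the compiled ./main binary once per model, parse a timing from
# its output, and append the result to a CSV file.
models = ["tiny", "base", "small"]
rows = []

for model in models:
    cmd = ["./main", "-m", f"models/ggml-{model}.bin", "-f", "samples/jfk.wav"]
    output = subprocess.run(cmd, capture_output=True, text=True).stderr

    total_time = None
    for line in output.splitlines():
        if "total time" in line:
            # e.g. "whisper_print_timings:  total time =  1234.56 ms"
            total_time = round(float(line.split("=")[1].strip().split()[0]), 2)

    rows.append({"model": model, "total_time_ms": total_time})

with open("benchmark_results.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["model", "total_time_ms"])
    writer.writeheader()
    writer.writerows(rows)
```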