Add s3-benchrunner-c for testing aws-c-s3 #2
Conversation
There's some CMake material I had to modify during my testing; the issue might simply be due to my lack of CMake knowledge. I haven't fully reviewed the PR yet.
    "maxRepeatSecs": 600,
    "checksum": null,
    "filesOnDisk": false,
    "tasks": [
Nit: we discussed this offline, so noting it for the future. Specifying 1000 actions is rather ugly; maybe we could consider a param like "repeatActionsNTimes", which would generate copies of the actions for a single run. But to support that we'd need a way to vary each action's vars - we could introduce a magic string-substitution var that the runner updates, e.g. "5GiB/%i".
That's too far into the weeds of solving a problem that might not need a solution for now, so I'm happy to keep this as is.
Agree it's ugly, but there will be a lot of different programs digesting these files, so I want them to be as simple as possible. My one concession to readability was requiring each runner to translate "1GiB" -> 1073741824, but I'm hesitant to do much more.
Yeah, there are other approaches we could take in the future, like some "manager" program that takes in "fancy" JSON but spits out simpler commands that each runner can follow. Then all the complexity is embedded in that one manager program.
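As a hedged sketch of the substitution idea above (the function name and behavior are assumptions, not anything from this PR), a manager program could expand a "%i" template key into N concrete keys before emitting the simpler JSON each runner consumes:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: expand a task key template like "5GiB/%i"
// into n concrete keys ("5GiB/1", "5GiB/2", ...), the way a
// "manager" program might before writing out plain per-runner JSON.
std::vector<std::string> expandKeys(const std::string &templ, int n)
{
    std::vector<std::string> keys;
    for (int i = 1; i <= n; ++i)
    {
        std::string key = templ;
        auto pos = key.find("%i");
        if (pos != std::string::npos)
            key.replace(pos, 2, std::to_string(i));
        keys.push_back(key);
    }
    return keys;
}
```

This keeps the runners themselves dumb: they never see "%i", only the already-expanded task list.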
    "filesOnDisk": false,
    "tasks": [
        {"action": "upload", "key": "5GiB/1", "size": "5GiB"},
        {"action": "upload", "key": "5GiB/2", "size": "5GiB"},
S3 performs better if load is spread across prefixes (https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html). We'll probably want a benchmark for that as well.
🤔
Should I just do something like: "5GiB/1/file", "5GiB/2/file", "5GiB/3/file", ...?
I think those should be 2 separate benchmarks: one to see how much we can max out a single partition, and another for load spread across partitions.
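As a sketch of what the prefix-spread variant's task list might look like (the key names follow the "5GiB/N/file" pattern suggested above; the exact count and layout are assumptions):

```json
"tasks": [
    {"action": "upload", "key": "5GiB/1/file", "size": "5GiB"},
    {"action": "upload", "key": "5GiB/2/file", "size": "5GiB"},
    {"action": "upload", "key": "5GiB/3/file", "size": "5GiB"}
]
```

Each key gets a distinct prefix ("5GiB/1/", "5GiB/2/", ...), so S3 can partition the load, whereas the single-partition benchmark keeps every key under one prefix.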
    [[noreturn]] void skip(string_view msg)
    {
        cerr << "Skipping benchmark - " << msg << endl;
        exit(123);
    }
Should we standardize some common values that these tools can return? For skip, for fail, etc.?
yeah, 100%
I'll probably add a README with all the behavior a runner must exhibit at some point...
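Until that README exists, here's a hedged sketch of what standardized exit codes might look like. Only 123 (skip) actually appears in this PR; the other names and values are assumptions:

```cpp
// Hypothetical runner exit codes. RunnerSkip matches the exit(123)
// in skip() above; RunnerSuccess and RunnerFail are assumptions
// pending the README describing required runner behavior.
enum RunnerExitCode
{
    RunnerSuccess = 0, // benchmark ran to completion and reported results
    RunnerSkip = 123,  // benchmark not applicable to this runner
    RunnerFail = 255,  // assumption: setup or runtime failure
};
```

Keeping these in one shared header (or just the README) would let every runner, in every language, agree on what the orchestration scripts should do with each code.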
    uint64_t BenchmarkConfig::bytesPerRun() const
    {
        uint64_t bytes = 0;
        for (auto &&task : tasks)
you could be fancy with lambdas as well :)
accumulate(tasks.begin(), tasks.end(), uint64_t{0},
    [](uint64_t sum, const Task &t) { return sum + t.size; });
disagree: that is so much harder to read
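For reference, a self-contained sketch of the plain-loop version being discussed; `Task` here is a minimal stand-in for the PR's actual type, and the free function stands in for the `BenchmarkConfig` member:

```cpp
#include <cstdint>
#include <vector>

// Minimal stand-in for the PR's Task type: only the field this
// function needs.
struct Task
{
    uint64_t size;
};

// Plain-loop sum of task sizes, mirroring BenchmarkConfig::bytesPerRun().
uint64_t bytesPerRun(const std::vector<Task> &tasks)
{
    uint64_t bytes = 0;
    for (auto &&task : tasks)
        bytes += task.size;
    return bytes;
}
```

One practical point in the loop's favor: with `accumulate`, an `int` literal as the initial value silently truncates the sum to `int`; the explicit `uint64_t bytes = 0;` avoids that class of bug entirely.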
The directory name is redundant:
aws-crt-s3-benchmarks/runners/s3-benchrunner-c
because each runner's subfolder should be something you can open in an IDE.
I'd much rather have an IDE window named "s3-benchrunner-c" than a window named "c".
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.