WIP: Some Benchmarks for get_json_object #10729
base: branch-24.12
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
 * limitations under the License.
 */

val input = "/data/tmp/SCALE_FROM_JSON"
Do we need this file?
val numRows = 3000000
//val nullProbability = 0.1
val nullProbability = 0.0001
val output = "/data/tmp/SCALE_FROM_JSON"
Both input and output assume there's a /data/ folder. That may be fine and can be improved upon later, but perhaps a note would be good?
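One way to address the hard-coded `/data/` assumption is sketched below. This is only an illustration, not code from this PR: the object name `BenchmarkPaths`, the system property `benchmark.dataDir`, and the environment variable `BENCHMARK_DATA_DIR` are all hypothetical names chosen for the example. The idea is to keep `/data/tmp` as the default while letting whoever runs the benchmark point it somewhere else.

```scala
// Hypothetical helper (not part of the PR): resolves the benchmark data
// directory instead of hard-coding "/data/tmp" in each script.
object BenchmarkPaths {
  // Default matches the directory used by the scripts in this PR.
  val DefaultDir = "/data/tmp"

  // Treat missing or blank values as "not set".
  private def nonBlank(value: Option[String]): Option[String] =
    value.map(_.trim).filter(_.nonEmpty)

  // Lookup order: JVM system property, then environment variable, then default.
  // Both key names are assumptions made for this sketch.
  def resolveDataDir(
      props: Map[String, String] = sys.props.toMap,
      env: Map[String, String] = sys.env): String =
    nonBlank(props.get("benchmark.dataDir"))
      .orElse(nonBlank(env.get("BENCHMARK_DATA_DIR")))
      .getOrElse(DefaultDir)

  // Joins a dataset name onto the base directory without doubling the slash.
  def path(name: String, baseDir: String = resolveDataDir()): String =
    s"${baseDir.stripSuffix("/")}/$name"
}
```

With something like this in place, the scripts could write `val input = BenchmarkPaths.path("SCALE_FROM_JSON")`, and a user without a `/data/` folder could export `BENCHMARK_DATA_DIR` before launching the shell.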
I think it's probably simple to have a set of micro benchmarks tied to data gen that in
FYI: we have a related benchmark implemented in spark-rapids-jni: NVIDIA/spark-rapids-jni#1952
This depends on #10728 and I don't know if this is where or how we want to deal with benchmarks. I am also happy to move this to another repo if we want to.