anyscale / metadata-fetching-benchmarks Public


metadata-fetching-benchmarks

Benchmark script for reproducing metadata fetching blog numbers.

To reproduce the numbers:

Run create_data.py to create a 1 TiB dataset.
Run reproduce.py to read from that dataset.
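
The two steps above could be wrapped in a small timing helper so each script's wall-clock time is recorded. This is only a sketch: `run_timed` is a hypothetical helper that is not part of this repository, and the README does not say whether either script needs command-line arguments, so any arguments have to be supplied by the caller.

```python
import subprocess
import sys
import time


def run_timed(script, *args):
    """Run a Python script in a subprocess and return its wall-clock time in seconds.

    Hypothetical helper, not part of the repository. `script` and `args` are
    passed straight to the current Python interpreter.
    """
    start = time.perf_counter()
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    subprocess.run([sys.executable, script, *args], check=True)
    return time.perf_counter() - start
```

Assuming both scripts run without arguments, calling `run_timed("create_data.py")` followed by `run_timed("reproduce.py")` would return the elapsed seconds for each step.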

The blog's numbers were measured on a Ray cluster with one m5.2xlarge head node and eight m5.8xlarge worker nodes.
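
For a sense of scale, the cluster's aggregate resources can be worked out from the instance sizes. The sketch below assumes the standard AWS specifications for these types (m5.2xlarge: 8 vCPUs, 32 GiB; m5.8xlarge: 32 vCPUs, 128 GiB), which are not stated in this README; `InstanceType` and `cluster_totals` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: int


# Assumed AWS specifications for the instance types named in the README.
M5_2XLARGE = InstanceType("m5.2xlarge", vcpus=8, memory_gib=32)
M5_8XLARGE = InstanceType("m5.8xlarge", vcpus=32, memory_gib=128)


def cluster_totals(nodes):
    """Return (total vCPUs, total memory in GiB) for a list of (InstanceType, count) pairs."""
    vcpus = sum(itype.vcpus * count for itype, count in nodes)
    memory_gib = sum(itype.memory_gib * count for itype, count in nodes)
    return vcpus, memory_gib


workers_only = cluster_totals([(M5_8XLARGE, 8)])
whole_cluster = cluster_totals([(M5_2XLARGE, 1), (M5_8XLARGE, 8)])

print(workers_only)   # (256, 1024)
print(whole_cluster)  # (264, 1056)
```

Under these assumptions the eight workers provide 256 vCPUs and 1 TiB of memory in total, roughly matching the size of the dataset.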
