Chore: Add hardcoded-records source to profiling script (#339)
aaronsteers authored Aug 19, 2024
1 parent f05baf1 commit 80c8cd1
Showing 1 changed file with 23 additions and 3 deletions.
26 changes: 23 additions & 3 deletions examples/run_perf_test_reads.py
@@ -49,6 +49,16 @@
poetry run python ./examples/run_perf_test_reads.py -e=5 --destination=e2e --no-cache
```
Testing Python CDK throughput:
```bash
# Test max throughput:
poetry run python ./examples/run_perf_test_reads.py -n=2400000 --source=hardcoded --destination=e2e
# Analyze tracing data:
poetry run viztracer --open -- ./examples/run_perf_test_reads.py -e=3 --source=hardcoded --destination=e2e
```
Note:
- The Faker stream ('purchases') is assumed to be 220 bytes, meaning 4_500 records is
approximately 1 MB. Based on this: 25K records/second is approximately 5.5 MB/s.
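The arithmetic in the note above can be checked with a short sketch. It assumes the 220 bytes figure is an average per record and that "MB" means decimal megabytes (1,000,000 bytes); the constant and function names are hypothetical and do not appear in the script.

```python
# Assumption taken from the note: one 'purchases' record averages 220 bytes.
RECORD_SIZE_BYTES = 220
# Assumption: decimal megabytes (1 MB = 1,000,000 bytes).
BYTES_PER_MB = 1_000_000


def records_per_mb(record_size_bytes: int = RECORD_SIZE_BYTES) -> float:
    """Return how many records of the given size fit in one megabyte."""
    return BYTES_PER_MB / record_size_bytes


def throughput_mb_per_sec(
    records_per_sec: float,
    record_size_bytes: int = RECORD_SIZE_BYTES,
) -> float:
    """Convert a records/second rate into MB/second."""
    return records_per_sec * record_size_bytes / BYTES_PER_MB


if __name__ == "__main__":
    print(f"records per MB: {records_per_mb():.0f}")
    print(f"25K records/s in MB/s: {throughput_mb_per_sec(25_000):.1f}")
```

Under these assumptions, 4_500 records come to 990,000 bytes, which is why the note rounds it to roughly 1 MB.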
@@ -157,6 +167,15 @@ def get_source(
},
)

if source_alias == "hardcoded":
return ab.get_source(
"source-hardcoded-records",
streams=["dummy_fields"],
config={
"count": num_records,
},
)

raise ValueError(f"Unknown source alias: {source_alias}") # noqa: TRY003


@@ -244,10 +263,11 @@ def main(
type=str,
help=(
"The cache type to use. The `e2e` source is recommended when Docker is available, "
-"while the `faker` source runs natively in Python."
+"while the `faker` source runs natively in Python. The 'hardcoded' source is "
+"similar to the 'e2e' source, but written in Python."
 ),
-choices=["faker", "e2e"],
-default="e2e",
+choices=["faker", "e2e", "hardcoded"],
+default="hardcoded",
)
parser.add_argument(
"--destination",
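
Taken together, the hunks above add a `hardcoded` choice to `--source`, make it the default, and map it to the `source-hardcoded-records` connector. The following is a minimal, standalone sketch of that flow, not the script's actual code: `build_parser`, `hardcoded_source_kwargs`, and the `num_records` destination name are hypothetical, and the sketch only builds the keyword arguments shown in the diff rather than calling `ab.get_source`.

```python
import argparse
from typing import Any, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with only the options relevant to this commit."""
    parser = argparse.ArgumentParser(description="Sketch of the --source option.")
    parser.add_argument(
        "--source",
        type=str,
        choices=["faker", "e2e", "hardcoded"],  # choices after this commit
        default="hardcoded",  # new default after this commit
    )
    # Assumption: `-n` is the record count; the destination name is illustrative.
    parser.add_argument("-n", dest="num_records", type=int, default=None)
    return parser


def hardcoded_source_kwargs(num_records: Optional[int]) -> dict[str, Any]:
    """Return the arguments the diff passes to `ab.get_source` for 'hardcoded'."""
    return {
        "name": "source-hardcoded-records",
        "streams": ["dummy_fields"],
        "config": {"count": num_records},
    }


if __name__ == "__main__":
    args = build_parser().parse_args(["--source=hardcoded", "-n=2400000"])
    print(hardcoded_source_kwargs(args.num_records))
```

Keeping the connector arguments in a plain dictionary makes the mapping easy to inspect without the connector installed; the real script returns the source object directly.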