Speed up inference by avoiding unnecessary padding #25

Merged: 1 commit, Mar 25, 2024

Conversation

lostella
Contributor

@lostella lostella commented Mar 24, 2024

Issue #, if available: Unnecessary context padding slows down inference. We evaluated the models from HF with this change and found no concerning impact on accuracy.

Test code for a context of length 200:

import time

import torch
from chronos import ChronosPipeline

# Load the large Chronos model on the GPU in bfloat16.
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-large",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# Batch of 8 series, each with 200 observations.
context = torch.ones((8, 200))
prediction_length = 24
num_runs = 10

# Measure the wall-clock time (in seconds) of num_runs consecutive predict calls.
t0 = time.time()
for _ in range(num_runs):
    forecast = pipeline.predict(
        context,
        prediction_length,
        num_samples=20,
    )
t1 = time.time()

print(f"total time: {t1 - t0}")

Before the change:

total time: 20.005481481552124

After the change:

total time: 9.82350754737854
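
To put the two totals in perspective, the short snippet below derives the per-call time and the speedup from the numbers reported above. It only does arithmetic on those two measurements and on `num_runs` from the test code; the variable names are illustrative.

```python
# Totals reported above, in seconds, each covering num_runs predict calls.
num_runs = 10
total_before = 20.005481481552124
total_after = 9.82350754737854

# Average wall-clock time of a single predict call.
per_call_before = total_before / num_runs
per_call_after = total_after / num_runs

# Ratio of the old total to the new total.
speedup = total_before / total_after

print(f"per call: {per_call_before:.2f} s -> {per_call_after:.2f} s")
print(f"speedup: {speedup:.2f}x")
```

For this batch (8 series of length 200), that is roughly a 2x speedup, from about 2.0 s to about 1.0 s per call.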

Description of changes: Remove the padding that was applied when the provided batch is shorter than context_length, so that shorter contexts are passed to the model at their actual length.
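
The idea can be illustrated with a minimal, pure-Python sketch. This is not the code changed in this pull request: the function names are hypothetical, and both the left-padding with NaN and the `context_length` of 512 are assumptions made for illustration only.

```python
import math


def prepare_context_padded(series, context_length):
    """Assumed old behavior: keep the last context_length values,
    then left-pad with NaN up to context_length."""
    window = series[-context_length:]
    padding = [math.nan] * (context_length - len(window))
    return padding + window


def prepare_context_unpadded(series, context_length):
    """Assumed new behavior: only truncate; shorter inputs keep their length."""
    return series[-context_length:]


# A context of length 200, as in the test code above.
series = [1.0] * 200
padded = prepare_context_padded(series, context_length=512)
unpadded = prepare_context_unpadded(series, context_length=512)

print(len(padded), len(unpadded))
```

Under these assumptions, the padded variant always hands the model `context_length` positions, while the unpadded variant hands it only the 200 positions that carry data, which is consistent with the lower inference time reported above.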

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@lostella lostella mentioned this pull request Mar 25, 2024
@lostella lostella merged commit 2875293 into main Mar 25, 2024
2 checks passed
@lostella lostella deleted the remove-unnecessary-padding branch March 25, 2024 11:39
@lostella lostella added the "performance improvement" label (Computational performance improvements) Mar 25, 2024