About the T5 Architecture #30
In my experiments, I have found that Chronos' inference time is strongly related to the prediction length, and not so much to the historical context length. I don't know much about NLP, so I'm curious: is T5 an autoregressive architecture similar to GPT, where it has to generate tokens sequentially one by one, or can it output all the values at once in parallel (with the help of a mask)? Thanks!
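A minimal sketch of how this observation could be reproduced, using the `ChronosPipeline` API from this repository's README; the model name, context size, and prediction lengths below are illustrative choices, not values from this issue:

```python
import time
import torch
from chronos import ChronosPipeline

# Illustrative benchmark: hold the context fixed and vary the prediction
# length. Model name, context size, and lengths are arbitrary choices.
pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-small")
context = torch.randn(256)  # synthetic series, shorter than the 512 cap

for prediction_length in (8, 16, 32, 64):
    start = time.perf_counter()
    pipeline.predict(context, prediction_length)
    print(f"prediction_length={prediction_length}: "
          f"{time.perf_counter() - start:.2f}s")
```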
T5 is an encoder-decoder transformer while GPT is decoder-only, so the two differ in architecture. However, both models sample autoregressively, so it is expected that inference time scales with the prediction length.
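To make the mechanism concrete, here is a hand-written greedy-decoding sketch using a vanilla T5 from Hugging Face `transformers`; it illustrates the general encoder-decoder sampling loop, not Chronos' actual sampling code:

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Greedy autoregressive decoding with an encoder-decoder model, written out
# by hand to show where the sequential cost comes from.
model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = AutoTokenizer.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello there.", return_tensors="pt")
# The encoder processes the entire context once, in parallel.
encoder_outputs = model.get_encoder()(**inputs)

# The decoder, however, runs once per generated token, so inference time
# grows with the number of output tokens -- just like a decoder-only model.
decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
for _ in range(20):
    logits = model(
        encoder_outputs=encoder_outputs,
        decoder_input_ids=decoder_input_ids,
    ).logits
    next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
    decoder_input_ids = torch.cat([decoder_input_ids, next_token], dim=-1)
    if next_token.item() == model.config.eos_token_id:
        break

print(tokenizer.decode(decoder_input_ids[0], skip_special_tokens=True))
```

The causal mask only enables parallel scoring of all target positions during training; at inference, each token depends on the previously sampled one, so generation is inherently sequential in both architectures.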
That's because the provided context is capped at a pre-configured context length (512 in the current models), so anything longer than that won't affect inference speed.
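A tiny sketch of what such a cap implies; the variable names are illustrative, not Chronos internals:

```python
import torch

context_length = 512         # the pre-configured cap (illustrative)
history = torch.randn(2000)  # raw history longer than the cap

# Only the most recent `context_length` observations reach the encoder;
# older values are dropped, so they cannot affect inference speed.
context = history[-context_length:]
print(context.shape)  # torch.Size([512])
```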
My context is shorter than 512. I presume it's because the decoder is so slow that it dominates the time the encoder takes to process the context.
I got it. Thanks for your reply.
Were you using the latest version of the code? We used to have unnecessary padding to the full context length (512), which was removed in #25. After that fix, you should see faster inference as your context shrinks significantly below 512.
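A hypothetical sketch of the difference in terms of tensor shapes (this is not the actual code changed in #25):

```python
import torch
import torch.nn.functional as F

context = torch.randn(100)  # a context much shorter than 512

# Before the fix: inputs were padded to the full context length, so the
# encoder always processed 512 positions regardless of the actual history.
padded = F.pad(context, (512 - context.numel(), 0))
print(padded.shape)   # torch.Size([512]) -> constant encoder cost

# After the fix: the encoder sees only the real observations, so shorter
# contexts mean proportionally less encoder work.
print(context.shape)  # torch.Size([100])
```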