runtime: Optimization Request: Improve TTFB by Streaming Rendered HTML #207
Comments
An example of another templating library that supports the behaviour requested above is quicktemplate. |
Hello there!
Originally, templ wrote straight to the
In addition, IIRC, writing to the output stream repeatedly (as templ sends strings and other stuff to the network) resulted in a relatively high number of syscalls. Switching context from user space to kernel space in the syscall can slow down the process, and profiling showed that a high amount of time was being spent waiting for these calls.
So, the current design is based on the idea of using a pool of buffered writers. My understanding is that this is the technique used by quicktemplate, but I could also be wrong on that!
With a buffered writer, templ can write really quickly to a buffer from the pool, and then flush that in fewer syscalls to the network. It can then return the buffer to the pool, after clearing it, reducing GC thrashing. Overall, we found that this rendered HTML much faster.
There are some basic benchmarks at https://github.com/a-h/templ/tree/main/benchmarks to test the performance, and the performance of templ is pretty good. The benchmarks don't actually deal with network requests etc, so there's a risk that they're not realistic.
However, my thinking with templ was that the biggest impact on TTFB is likely the use of JavaScript as a server-side language in the first place, as per the graph comparing render speed. More speed is good though! 😁
I'd be interested in seeing some load test results of alternative designs with something like Grafana's k6, or hey.
The TCP MTU setting is usually much higher on
The next actions to progress this are:
Then we'd know for sure. If someone wants to pick that up, that would be great - shout up! In terms of roadmap, the main issues people are complaining about are things like lack of JetBrains IDE support, and rough edges on CSS and script handling, so I want to focus on that in the next few releases (previously, formatting was the major issue, but I think that's solved in the next upcoming release). |
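The pooled-buffer idea described in the reply above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not templ's actual runtime code: `bufferPool`, `renderBuffered` and `countingWriter` are hypothetical names, and the counting writer only stands in for a network connection so that the number of downstream writes is visible.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// bufferPool hands out reusable buffers so that each render does not
// allocate a fresh one (hypothetical name, for illustration only).
var bufferPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// countingWriter stands in for a network connection and records how many
// times Write is called on it.
type countingWriter struct {
	calls int
	n     int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	c.n += len(p)
	return len(p), nil
}

// renderBuffered collects all fragments in a pooled buffer and then sends
// them to w in a single Write call.
func renderBuffered(w io.Writer, parts ...string) error {
	buf := bufferPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // clear before returning the buffer to the pool
		bufferPool.Put(buf)
	}()
	for _, p := range parts {
		buf.WriteString(p)
	}
	_, err := w.Write(buf.Bytes())
	return err
}

func main() {
	cw := &countingWriter{}
	if err := renderBuffered(cw, "<div>", "hello", "</div>"); err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Printf("%d bytes in %d write call(s)\n", cw.n, cw.calls)
}
```

The trade-off raised in this issue is visible here: the destination sees one write instead of three, but nothing reaches it until every fragment has been rendered.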
Thanks for the reply. I understand the motivations. With the quicktemplate README, there is mention of this:
This appears to be what |
With your benchmark I created the following quicktemplate:
Added benchmark tests, one with the
func BenchmarkQuickTemplateRender(b *testing.B) {
	b.ReportAllocs()
	person := Person{
		Name:  "Luiz Bonfa",
		Email: "luiz@example.com",
	}
	// Render straight into a strings.Builder, with no intermediate buffer.
	w := new(strings.Builder)
	for i := 0; i < b.N; i++ {
		WriteRenderQT(w, person)
		w.Reset()
	}
}
func BenchmarkQuickTemplateBufioRender(b *testing.B) {
	b.ReportAllocs()
	person := Person{
		Name:  "Luiz Bonfa",
		Email: "luiz@example.com",
	}
	// Render through a bufio.Writer that wraps the strings.Builder.
	builder := new(strings.Builder)
	w := bufio.NewWriter(builder)
	for i := 0; i < b.N; i++ {
		WriteRenderQT(w, person)
		w.Flush()
		builder.Reset()
		w.Reset(builder)
	}
}
The output:
The Changing the logic within the |
Thanks for looking into this. If you want to progress this line of thinking, you can directly modify the generated code to test out the concept. For example, you can take the benchmark and adjust it to use a
func BufioWriterRender(p Person) templ.Component {
return templ.ComponentFunc(func(templ_7745c5c3_Ctx context.Context, templ_7745c5c3_W io.Writer) (templ_7745c5c3_Err error) {
templ_7745c5c3_Buffer, templ_7745c5c3_IsBuffer := templ_7745c5c3_W.(*bufio.Writer)
if !templ_7745c5c3_IsBuffer {
templ_7745c5c3_Buffer = bufio.NewWriter(templ_7745c5c3_W)
}
templ_7745c5c3_Ctx = templ.InitializeContext(templ_7745c5c3_Ctx)
templ_7745c5c3_Var1 := templ.GetChildren(templ_7745c5c3_Ctx)
if templ_7745c5c3_Var1 == nil {
templ_7745c5c3_Var1 = templ.NopComponent
}
templ_7745c5c3_Ctx = templ.ClearChildren(templ_7745c5c3_Ctx)
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("<div><h1>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var2 string = p.Name
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var2))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("</h1><div style=\"font-family: 'sans-serif'\" id=\"test\" data-contents=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(`something with "quotes" and a <tag>`))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("\"><div>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Var3 := `email:`
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ_7745c5c3_Var3)
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("<a href=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var4 templ.SafeURL = templ.URL("mailto: " + p.Email)
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(string(templ_7745c5c3_Var4)))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("\">")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var5 string = p.Email
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var5))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("</a></div></div></div><hr")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if true {
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" noshade")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("><hr optionA")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if true {
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" optionB")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" optionC=\"other\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if false {
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" optionD")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("><hr noshade>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
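// Flush only when this function created the buffered writer; otherwise the caller owns it.
if !templ_7745c5c3_IsBuffer {
templ_7745c5c3_Err = templ_7745c5c3_Buffer.Flush()
}
return templ_7745c5c3_Err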
})
}
func BenchmarkTemplBufioWriterRender(b *testing.B) {
	b.ReportAllocs()
	t := BufioWriterRender(Person{
		Name:  "Luiz Bonfa",
		Email: "luiz@example.com",
	})
	w := new(strings.Builder)
	for i := 0; i < b.N; i++ {
		err := t.Render(context.Background(), w)
		if err != nil {
			b.Errorf("failed to render: %v", err)
		}
		w.Reset()
	}
}
This turns out to be slower than what we have at the moment, probably because it causes an additional allocation. Not sure if it could use a buffered pool.
To put this into perspective, we're talking about shaving up to 10,000 ns off the time to first byte. But at 3 Mbps it takes around 2,700 ns to transfer a single character (8 bits / 3,000,000 bits per second), so 10,000 ns is roughly the cost of four characters - i.e. it would probably be a greater performance improvement to improve whitespace stripping to remove a handful of additional characters than to focus on this.
An additional point of note is that the HTTP response writer is already buffered (4KB by default), so if the response is less than 4KB in size, it will wait until it's all rendered anyway. That's why I think an HTTP performance benchmark is probably more useful overall than these low level ones. |
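The bandwidth arithmetic in the comment above can be checked with a few lines of Go. This is a back-of-the-envelope sketch that assumes one character is one byte (8 bits) and ignores latency and protocol overhead; `nsPerByte` is a hypothetical helper name.

```go
package main

import "fmt"

// nsPerByte returns how many nanoseconds it takes to transmit one byte
// over a link of the given speed, ignoring latency and protocol overhead.
func nsPerByte(bitsPerSecond float64) float64 {
	return 8 / bitsPerSecond * 1e9
}

func main() {
	perByte := nsPerByte(3e6) // 3 Mbps link, as in the comment above
	fmt.Printf("time per byte: %.0f ns\n", perByte)
	fmt.Printf("bytes sent in 10,000 ns: %.2f\n", 10000/perByte)
}
```

On these assumptions a character costs a few microseconds on the wire, which is the same order of magnitude as the saving under discussion.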
So, I couldn't help but look into this and created a little benchmark of actual web performance at https://github.com/a-h/templ/tree/http_benchmark
I updated the test benchmark template and added 1000 extra lines of stuff.
package main
import (
	"net/http"
	"github.com/a-h/templ"
)
type Person struct {
	Name  string
	Email string
}
func main() {
	t := Render(Person{
		Name:  "Luiz Bonfa",
		Email: "luiz@example.com",
	})
	// Serve the component over HTTP so that a load-testing tool can measure the response wait.
	http.ListenAndServe("localhost:8080", templ.Handler(t))
}
The resp wait (the bit we care about in this issue) was:
So... I hacked the generated code to try out the concept. First, I added some extra functions to templ:
var bufferedWriterPool = sync.Pool{
	New: func() any {
		return new(bufio.Writer)
	},
}
// GetBufferedWriter takes a writer from the pool and points it at w.
func GetBufferedWriter(w io.Writer) *bufio.Writer {
	bw := bufferedWriterPool.Get().(*bufio.Writer)
	bw.Reset(w)
	return bw
}
// ReleaseBufferedWriter detaches the writer from its destination and returns it to the pool.
func ReleaseBufferedWriter(b *bufio.Writer) {
	b.Reset(nil)
	bufferedWriterPool.Put(b)
}
Then I updated the generated code to use it:
// Code generated by templ@0.2.419 DO NOT EDIT.
package main
//lint:file-ignore SA4006 This context is only used if a nested component is present.
import "github.com/a-h/templ"
import "context"
import "io"
import "bufio"
func Render(p Person) templ.Component {
return templ.ComponentFunc(func(templ_7745c5c3_Ctx context.Context, templ_7745c5c3_W io.Writer) (templ_7745c5c3_Err error) {
templ_7745c5c3_Buffer, templ_7745c5c3_IsBuffer := templ_7745c5c3_W.(*bufio.Writer)
if !templ_7745c5c3_IsBuffer {
templ_7745c5c3_Buffer = templ.GetBufferedWriter(templ_7745c5c3_W)
defer templ.ReleaseBufferedWriter(templ_7745c5c3_Buffer)
}
templ_7745c5c3_Ctx = templ.InitializeContext(templ_7745c5c3_Ctx)
templ_7745c5c3_Var1 := templ.GetChildren(templ_7745c5c3_Ctx)
if templ_7745c5c3_Var1 == nil {
templ_7745c5c3_Var1 = templ.NopComponent
}
templ_7745c5c3_Ctx = templ.ClearChildren(templ_7745c5c3_Ctx)
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("<html><head><title>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Var2 := `Test page`
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ_7745c5c3_Var2)
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("</title></head><body><div><h1>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var3 string = p.Name
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var3))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("</h1><div style=\"font-family: 'sans-serif'\" id=\"test\" data-contents=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(`something with "quotes" and a <tag>`))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("\"><div>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Var4 := `email:`
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ_7745c5c3_Var4)
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("<a href=\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var5 templ.SafeURL = templ.URL("mailto: " + p.Email)
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(string(templ_7745c5c3_Var5)))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("\">")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
var templ_7745c5c3_Var6 string = p.Email
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ.EscapeString(templ_7745c5c3_Var6))
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("</a></div></div></div><hr")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if true {
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" noshade")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("><hr optionA")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if true {
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" optionB")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" optionC=\"other\"")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if false {
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(" optionD")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("><hr noshade>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
for i := 0; i < 1000; i++ {
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("<p>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
templ_7745c5c3_Var7 := `Adding some fake content.`
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString(templ_7745c5c3_Var7)
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("</p>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
}
_, templ_7745c5c3_Err = templ_7745c5c3_Buffer.WriteString("</body></html>")
if templ_7745c5c3_Err != nil {
return templ_7745c5c3_Err
}
if !templ_7745c5c3_IsBuffer {
templ_7745c5c3_Err = templ_7745c5c3_Buffer.Flush()
}
return templ_7745c5c3_Err
})
}
And... it was slower:
By adding the loop to add 1000 elements, it turns it into a 32KB file, so I think it's big enough to benefit from buffering.
So, from these figures, it looks like in real world usage using a
For completeness, I updated the test to instead of doing 1000 copies of the
Here are the figures from
And here's the current implementation that uses
Here it's even more of a win for the
If there's an alternative to
Any ideas @jtarchie |
I wonder if giving the buffers an initial size that's a bit bigger than default would affect the performance much? |
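One way to try the suggestion above is to have the pool's `New` function pre-size each writer with `bufio.NewWriterSize`. The sketch below is an assumption-laden illustration: the 32 KB figure is picked only to match the size of the test page mentioned earlier in the thread, and `sizedPool`, `getSizedWriter`, `putSizedWriter` and `nopWriter` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"sync"
)

// initialSize is an assumed starting size, larger than bufio's 4096-byte default.
const initialSize = 32 << 10

// nopWriter discards everything; it stands in for a real destination.
type nopWriter struct{}

func (nopWriter) Write(p []byte) (int, error) { return len(p), nil }

// sizedPool creates writers whose internal buffer is allocated once at initialSize.
var sizedPool = sync.Pool{
	New: func() any { return bufio.NewWriterSize(nil, initialSize) },
}

// getSizedWriter takes a writer from the pool and points it at w.
// Reset keeps the already-allocated buffer.
func getSizedWriter(w io.Writer) *bufio.Writer {
	bw := sizedPool.Get().(*bufio.Writer)
	bw.Reset(w)
	return bw
}

// putSizedWriter detaches the writer from its destination and returns it to the pool.
func putSizedWriter(bw *bufio.Writer) {
	bw.Reset(nil)
	sizedPool.Put(bw)
}

func main() {
	bw := getSizedWriter(nopWriter{})
	defer putSizedWriter(bw)
	bw.WriteString("<p>hello</p>")
	fmt.Println("buffer size:", bw.Size(), "buffered bytes:", bw.Buffered())
	bw.Flush()
}
```

Whether a larger initial size actually helps would still need to be measured with the HTTP benchmark discussed above.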
Yes. On my machine
There are still downsides to using a buffer - say you have many templates with 100 KB of data and a single one with 200 MB of data. All of your buffers will eventually grow to 200 MB in size, unless you collect statistics like bytebufferpool does https://github.com/valyala/bytebufferpool/blob/master/pool.go
|
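A simpler, standard-library-only way to limit the growth problem described above is to refuse to pool buffers that have grown past some cap, so that oversized ones are left for the garbage collector. This is a minimal sketch of that idea and not what bytebufferpool does (it collects size statistics instead); `maxPooledCap`, `getBuffer` and `putBuffer` are hypothetical names and the 64 KB cap is an arbitrary assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// maxPooledCap is an assumed upper bound on the capacity of buffers kept in the pool.
const maxPooledCap = 64 << 10

var pool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// getBuffer returns an empty buffer from the pool.
func getBuffer() *bytes.Buffer {
	return pool.Get().(*bytes.Buffer)
}

// putBuffer returns b to the pool unless it has grown beyond maxPooledCap,
// in which case it is dropped. It reports whether the buffer was pooled.
func putBuffer(b *bytes.Buffer) bool {
	if b.Cap() > maxPooledCap {
		return false
	}
	b.Reset()
	pool.Put(b)
	return true
}

func main() {
	small := getBuffer()
	small.WriteString("<p>small page</p>")
	fmt.Println("small buffer pooled:", putBuffer(small))

	big := getBuffer()
	big.Grow(200 << 20) // simulate one very large render
	fmt.Println("big buffer pooled:", putBuffer(big))
}
```

The cost of this approach is that templates which are always larger than the cap never benefit from pooling, which is the case the statistics-based approach tries to handle.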
Have marked as needs decision, as it seems we have all the information now to decide if we want to implement anything for this. |
Based on the data, I think it makes sense to use a sync buffered write pool for this. Looks like the buffered writing might result in 146k rps vs 140k rps, or around 4% improvement according to @kaey's benchmarks. However, I'm concerned about the potential for RAM growth over time as outlined by @kaey. Having this happen to you violates the principle of least surprise. For runtime stuff, I want to stick to the standard library as much as possible, since it benefits from the security focus of the Go project as a whole, so although the bytebufferpool sounds like a good library to use, I'd prefer to stick to the standard library for runtime if possible. However, I don't plan to work on this in the short-ish future, because CSS media queries, improvements to LSP testing, and providing more documentation / examples would be higher up the list in priority for me at the moment. |
See comment on #781 (reply in thread) - I think this could be a way forward. @joerdav is working on something that affects the generator (automatic imports), so I don't want to implement this until he's finished, but I think it's relatively clear how to proceed. |
Background: The changes introduced in Issue #56 have an unintentional side effect related to a rendered template's TTFB (Time To First Byte).
Problem: With the latest update, the generated Go code first writes the rendered HTML to a buffer and only then writes to the final Writer. This means the entire template has to be rendered before any data is sent to the user. This causes the TTFB to become the time taken for the complete server-side page rendering rather than the time to generate the first portion of HTML. This delayed TTFB could negatively impact SEO scores.
Example:
For the above template, the first piece of HTML (i.e., <div>) could theoretically be sent immediately, allowing browsers to start rendering sooner. But with the current approach, nothing gets sent until the entire template, including </div>, is rendered and buffered.
Request: While the latest changes have been excellent from an ergonomic standpoint, it would be beneficial to either change the current behavior or provide an option to optimize the TTFB by allowing for streaming of the rendered HTML.
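The streaming behaviour requested here can be illustrated with plain net/http, independently of templ. In this sketch the handler writes the opening markup, flushes it through `http.Flusher`, and only then writes the rest. `streamPage` and `render` are hypothetical names, and `httptest.ResponseRecorder` is used only so the example runs without a network.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// streamPage sends the first piece of HTML immediately and the remainder later.
func streamPage(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "<div>")
	// Push what has been written so far to the client, if the writer supports it.
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
	// Anything slow (database calls, nested components) would happen here.
	fmt.Fprint(w, "<p>slow content</p></div>")
}

// render runs the handler against an in-memory recorder and reports the
// response body and whether Flush was called.
func render() (body string, flushed bool) {
	rec := httptest.NewRecorder()
	streamPage(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String(), rec.Flushed
}

func main() {
	body, flushed := render()
	fmt.Println("flushed early:", flushed)
	fmt.Println("body:", body)
}
```

With a real connection, the flush is what lets the browser receive the opening `<div>` before the slow part of the page has been rendered.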