
Implement page compilation on demand #330

Merged: leandrocp merged 13 commits into main from lp/impr-rendering on Aug 21, 2023

Conversation

@leandrocp (Contributor) commented Aug 17, 2023

The main change is getting rid of Code.eval_quoted/3, which was needed to render pages because we were storing the template AST in ETS. That approach was not scaling well.

The first design for page rendering compiled a single module containing all page templates, but compiling 700+ pages caused a huge spike in memory utilization, around 4 GB.

The second design removed that module and stored all template ASTs in ETS. That consumes only 2.6 MB of data and the lookup is incredibly fast, but it doesn't scale to serve many requests, especially sudden spikes: each Code.eval_quoted/3 call drives CPU utilization toward 100% faster than we can afford.

The third design (this PR) is a mix of both. It compiles one module per page containing a render/1 function that returns a %Phoenix.LiveView.Rendered{} struct, and also stores page metadata in ETS, including the name of each page module. So the page module stores and serves the template, while ETS acts as an index of pages used to match a request to a page module.

The last change required to make this work is compiling render/1 lazily: when the app starts the function is a stub (it returns :not_loaded), and on the first request to that page the module is recompiled with the actual implementation. This way there's no memory utilization spike during deployment, so the app can start; the recompilation is barely noticeable to users visiting the page; and Elixir behaves much better compiling modules on demand than compiling everything at once.
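
For illustration, here's a minimal sketch of that pattern; the module, table, and function names are hypothetical, not Beacon's actual internals. It shows a stub module per page at boot, an ETS index mapping each path to its module, and lazy recompilation on the first request.

```elixir
# Hypothetical sketch: not Beacon's real code, just the shape of the idea.
defmodule DemoSite.PageLoader do
  @table :demo_site_pages

  # ETS keeps only lightweight metadata (path -> page module);
  # the heavy template lives inside the compiled module.
  def init do
    :ets.new(@table, [:named_table, :public, read_concurrency: true])
  end

  def register_page(path, page_module) do
    :ets.insert(@table, {path, page_module})
  end

  # At boot every page gets a cheap stub, so deployment compiles
  # almost nothing and there is no memory spike.
  def compile_stub!(page_module) do
    ast = quote do: def(render(_assigns), do: :not_loaded)
    Module.create(page_module, ast, Macro.Env.location(__ENV__))
  end

  # On the first request the module is recompiled with the real
  # render/1, which returns a %Phoenix.LiveView.Rendered{} struct.
  # `template_ast` stands for the page's compiled HEEx AST,
  # e.g. loaded from the database.
  def compile_page!(page_module, template_ast) do
    ast =
      quote do
        def render(var!(assigns)) do
          unquote(template_ast)
        end
      end

    Module.create(page_module, ast, Macro.Env.location(__ENV__))
  end

  # Serving a request: look up the page module in ETS and lazily
  # compile it if its template was not loaded yet.
  def render(path, assigns, template_ast) do
    [{^path, page_module}] = :ets.lookup(@table, path)

    case page_module.render(assigns) do
      :not_loaded ->
        compile_page!(page_module, template_ast)
        page_module.render(assigns)

      %Phoenix.LiveView.Rendered{} = rendered ->
        rendered
    end
  end
end
```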

@leandrocp changed the title from "Improve page rendering" to "Implement page compilation on demand" Aug 17, 2023
@bcardarella (Contributor)

What is the expected memory pressure change we should see? Have you validated that this is better?

@leandrocp (Contributor, Author) commented Aug 17, 2023

What is the expected memory pressure change we should see? Have you validated that this is better?

During deployment:

- First design (a single module): ~3 to 4 GB of memory (the server kept dying before we could measure precisely)
- Current design (AST in ETS): 930 MB
- This PR (ETS + multiple modules): 630 MB

After deployment, the current design stabilizes at 540 MB and this PR at 470 MB.

Compiling a couple of modules at once, for example to render the atom.xml feed, doesn't put much pressure on the system; it's quite fast, and Elixir frees the memory once compilation is done.

@leandrocp (Contributor, Author) commented Aug 17, 2023

Regarding other metrics, Code.eval_quoted/3 is roughly 3,000x slower than a compiled function call. The average HTTP response time was also more stable and faster: with the current design it would quickly jump to 300ms after a thousand requests and reach 1.5s by the end of the test, while with a compiled module it stayed below 100ms for a longer period and topped out at a 400ms average. CPU utilization was more stable too: it reached 93% with 60k requests (never hitting 100%, which helps avoid degrading requests), while the current approach reached 100%, which is one of the main problems this PR is solving.
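
As a rough illustration (not the benchmark used here), a micro-benchmark along these lines reproduces the order-of-magnitude gap between evaluating a stored AST and calling a compiled function. Benchee is assumed as a dependency, and the exact ratio will vary with expression size and hardware:

```elixir
Mix.install([{:benchee, "~> 1.1"}])

defmodule Greeter do
  # Compiled counterpart of the quoted expression below.
  def hello(name), do: "Hello, " <> name <> "!"
end

# Stored AST, analogous to keeping a template's quoted form in ETS.
quoted = quote do: "Hello, " <> var!(name) <> "!"

Benchee.run(%{
  "Code.eval_quoted" => fn -> Code.eval_quoted(quoted, name: "world") end,
  "compiled function" => fn -> Greeter.hello("world") end
})
```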

@leandrocp (Contributor, Author) commented Aug 17, 2023

These graphs give a good perspective.

Before:
[screenshot: spike test metrics before this PR]

After (ignore the data before 11:11):
[screenshot: spike test metrics after this PR]

That's a spike test on a shared-1x-cpu@1024MB fly.io instance, running for 2 minutes with 1k simultaneous users, ~62k requests. Note that this measures only the HTTP request, i.e. the dead render. In real life the experience is better because Beacon patches pages on page transitions.

@leandrocp requested a review from bcardarella August 17, 2023 16:48
@leandrocp (Contributor, Author)

And just to be clear, compiling multiple modules eagerly causes the same memory utilization behaviour: it demanded at least 1 GB of memory and the server crashed, which I think is not viable. That's why I made it on demand, which is basically how page publishing is already handled.

@bcardarella (Contributor)

@leandrocp how does this compare to a baseline? It would be interesting to see what a mix phx.new app looks like in comparison to where Beacon is at?

Also, I assume the vast majority of the memory are due to the blog posts?

@leandrocp (Contributor, Author)

@leandrocp how does this compare to a baseline? It would be interesting to see what a mix phx.new app looks like in comparison to where Beacon is at?

I'll do a comparison and post the results.

Also, I assume the vast majority of the memory are due to the blog posts?

Yes that's correct.

@leandrocp (Contributor, Author)

To compare apples to apples, I've created a new Phoenix app with a simple LiveView that returns just "ok" with an empty layout, and the same for Beacon.

Both run on a shared-1x-cpu@1024MB fly.io instance. The spike test ran for 5 minutes with up to 1000 simultaneous users.

Baseline maxed out at 397,213 requests:

[screenshot: baseline spike test, 5m, 1000 VUs]

Beacon maxed out at 275,043 requests:

[screenshot: Beacon spike test, 5m, 1000 VUs]

That Beacon page runs behind basic auth; I couldn't determine how much that impacts the tests. I executed the test multiple times to rule out network inconsistencies and the results were pretty stable.

The previous test was a load test of the "/newsletter" page on staging; that layout is larger and the page uses components.

@leandrocp (Contributor, Author) commented Aug 17, 2023

And here's the same test with the current design. It executed 243,063 requests.

[screenshot: current design spike test, 5m, 1000 VUs]

@APB9785 (Contributor) left a comment

Looks awesome 🥳 Glad to see this approach worked out

@bcardarella (Contributor)

I suspect a better test may be comparing it to a regular heex template with the same markup as something that is in dockyard.com

@leandrocp (Contributor, Author)

I suspect a better test may be comparing it to a regular heex template with the same markup as something that is in dockyard.com

Excluding pages that query the DB, a candidate for such a test would be https://dockyard.com/newsletter

I'm gonna prepare it and post the results tomorrow.

@bcardarella (Contributor) commented Aug 18, 2023 via email

@leandrocp (Contributor, Author) commented Aug 18, 2023

@bcardarella using flame_on I was able to optimize a bit further and increase request throughput from 275,043 to 309,795 requests (compared to 397,213 for the baseline and 243,063 for the current design). Still using that simple page that returns an "ok" string:

[screenshot: Beacon improved, spike test, 5m, 1000 VUs]

If we don’t see nearly identical results we should see why

I think it's important to define what "nearly identical" means in numbers; for example, with the latest changes we're 22% behind the baseline in request throughput. There might be more room for improvement, but how much is the goal? It's also important to keep in mind that a Beacon app performs more operations than a baseline app with a regular HEEx template: page title, meta tags (site, layout, and page), raw schema, dynamic routing, generating the asset path, Beacon live data, finding the layout and page modules in memory, and so on. All those operations are either nonexistent or static in the baseline app.

In the flame graph below you can see that Beacon does not add too many calls to the stack:

[screenshot: flame graph of a Beacon request]
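
For reference, flame_on hooks into Phoenix LiveDashboard roughly like this; a sketch based on the flame_on package's usual setup, and the exact option names may differ between versions:

```elixir
# mix.exs (dev-only is usually enough):
# {:flame_on, "~> 0.6", only: :dev}

# In the router, mount LiveDashboard with the Flame On page enabled:
import Phoenix.LiveDashboard.Router

scope "/" do
  pipe_through :browser

  live_dashboard "/dashboard",
    additional_pages: [
      flame_on: FlameOn.DashboardPage
    ]
end
```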

@bcardarella (Contributor)

oh wait, we're stress testing with dockyard.com? That may be starting a bunch of other things. Sorry to keep throwing things on your plate here but are we able to deploy a Beacon instance on a new phx app? We can put this on Fly and give it all the resources it needs for comparison.

@leandrocp (Contributor, Author) commented Aug 18, 2023

oh wait, we're stress testing with dockyard.com?

staging (same config as PROD) :)
that's a shared-1x-cpu@1024MB instance

are we able to deploy a Beacon instance on a new phx app?

I think staging is fine, no?
No one is using it, it's reserved for this testing session.

We can put this on Fly and give it all the resources it needs for comparison.

That's doable, but I'm trying to find the highest number possible on that small instance, and also to prove that Beacon is resilient. I think there's value in serving 300k+ requests in a span of 5 minutes without crashing the server while keeping the average HTTP response time reasonable (it's fast until it hits 100% CPU, then it climbs to ~800ms average).

@bcardarella (Contributor) commented Aug 18, 2023

I think staging is fine, no?

It's mostly that if we are comparing this to a mix phx.new app, we should eliminate all variables for the best and most accurate results. I figure a new app with a fresh install of Beacon will give us the best picture on that. Or, if we want to frame the comparison as an actual in-use production CMS, it's OK to leave it as dockyard.com. It just depends what you want to convey to the audience. I'd bet that we have some things in dockyard.com that impact performance which are OK for our needs but may not present Beacon to its fullest potential if our baseline is a basic empty app.

@leandrocp (Contributor, Author) commented Aug 18, 2023

It just depends what you want to convey to the audience.

I believe it's more valuable to present the numbers from dockyard.com than from a dummy page or something more static, so they represent how Beacon performs for real. I'm using staging to avoid taking production down, but it should give similar results since the environments are pretty similar.

I'd bet that we have some things in dockyard.com that impact performance

There is, and I've removed some function calls for that particular release, Logger calls for example. But there's more code that I can't remove.

I figure a new app with a fresh install of Beacon will give us the best picture on that.

Yes, that would be the best scenario to profile Beacon against a baseline app in order to fine-tune it. I can use https://github.com/BeaconCMS/beacon_demo for that kind of test.

So I'll run some tests with beacon_demo against https://summer-sun-6104.fly.dev/ok (that's the baseline app) to double-check I'm not missing any obvious optimization, but I think this PR is in the right direction. Do you see any blocking issue with this PR?

@leandrocp (Contributor, Author)

Updated results, running for 5 minutes with up to 1000 users on a shared-1x-cpu@1024MB instance.

Baseline is a simple Phoenix app with a LiveView, and Beacon is https://github.com/BeaconCMS/beacon_demo with a layout and a page. Both render the same HTML in the end.

Baseline

https://summer-sun-6104.fly.dev/ok
262,329 requests

[screenshot: baseline spike test, 5m, 1000 VUs]

Beacon main (current design)

https://dy-beacon-demo.fly.dev/demo/ok
212,224 requests

[screenshot: Beacon main spike test, 5m, 1000 VUs]

Beacon with this PR's changes

https://dy-beacon-demo.fly.dev/demo/ok
247,442 requests

[screenshot: Beacon with this PR, spike test, 5m, 1000 VUs]

@bcardarella (Contributor)

This looks good. In the closest apples-to-apples comparison, the PR brings Beacon to more than 90% of the baseline.

@bcardarella (Contributor)

For the purposes of your ElixirConf presentation, it may be worth setting up a basic WordPress or Drupal site and comparing their perf numbers.

@leandrocp merged commit c781cab into main Aug 21, 2023
@leandrocp deleted the lp/impr-rendering branch August 21, 2023 18:18
@AZholtkevych linked an issue Aug 22, 2023 that may be closed by this pull request

Successfully merging this pull request may close these issues.

Core -> General Improvements: Beacon optimization