RSC and CDN interaction makes next.js inefficient for highload projects #65335

Open
dankain opened this issue May 3, 2024 · 16 comments
Labels
bug Issue was opened via the bug report template. linear: next Confirmed issue that is tracked by the Next.js team. Performance Anything with regards to Next.js performance.

Comments

@dankain

dankain commented May 3, 2024

Link to the code that reproduces this issue

https://codesandbox.io/p/devbox/rsc-test-m43xq4

To Reproduce

  1. Build the application with next build
  2. Start the application with next start
  3. Navigate to category 1
  4. Navigate to category 2 with hard refresh

Current vs. Expected behavior

Current Behavior

Screenshot 2024-05-03 at 15 07 20

Product 1 appears in category 1 and category 2

Currently they return identical data but have different rsc hashes.

Expected Behavior

If the data is the same there should be only one rsc hash.

With a high throughput global ecommerce site I want to cache identical data close to the customer in a CDN. The different rsc hashes mean that I will get CDN cache misses and traffic will have to go back to the server, potentially a distance from the customer and with a slower response time.

Provide environment information

Operating System:
  Platform: darwin
  Arch: arm64
  Version: Darwin Kernel Version 23.4.0: Fri Mar 15 00:10:42 PDT 2024; root:xnu-10063.101.17~1/RELEASE_ARM64_T6000
  Available memory (MB): 65536
  Available CPU cores: 10
Binaries:
  Node: 18.18.0
  npm: 10.1.0
  Yarn: 1.22.19
  pnpm: N/A
Relevant Packages:
  next: 14.2.3 // Latest available version is detected (14.2.3).
  eslint-config-next: 14.1.0
  react: 18.3.1
  react-dom: 18.3.1
  typescript: 5.4.5
Next.js Config:
  output: N/A

Which area(s) are affected? (Select all that apply)

Performance

Which stage(s) are affected? (Select all that apply)

next start (local)

Additional context

Hi, I'm trying to work out how to make RSC requests work with a CDN when self-hosting Next.js. Is there any more information on this subject? I'm working on an ecommerce site with 100,000 products. Those products could appear in numerous product listing pages (PLPs, i.e. search result pages). Each of those PLPs is given a unique URL for SEO purposes. Take, for example, the following URLs:

/mens/
/mens/trainers
/mens/trainers/brand
/mens/trainers/brand?facet-price=%3A168

They could all contain the same products. Because of the way the rsc hash is calculated, I get a different _rsc param on each listing page, even though the content of the response is exactly the same.

product/299336/?_rsc=1vl30
product/299336/?_rsc=qe3go
product/299336/?_rsc=1vg99
product/299336/?_rsc=1stsw

I have even gone to different areas of the site (cart, checkout, order history), all with links back to the same product. They each produce a different _rsc param, but the data returned is still identical.

On a high-throughput site I want to be able to cache identical data in the CDN close to the customer. At the moment that would be impossible, as there would be too many variations of the rsc hash for the caching to be effective.

For solutions, I think I only have 2 options:

  1. Cache all rsc requests in the CDN. This would end up caching loads of duplicate data and produce cache misses where there should be hits.
  2. Pass all requests through to the Next.js server. With this solution I would worry that the server would be overloaded at peak periods.

In both cases there would be an extra cost to the client.

I'm trying to understand why the rsc hash is different when the request always returns identical data. What is the purpose of this hash? As mentioned before, could this just be set to _rsc=1? We would also have issues with the Vary header, as it is currently returned like this:

Vary: RSC, Next-Router-State-Tree, Next-Router-Prefetch, Next-Url

In this case the Next-Router-State-Tree and Next-Url headers are set based on where you are coming from and do not necessarily have an impact on the data needed for the page we are going to. The Vary header will again have an impact on the CDN.
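
To make the cache-dilution concern concrete, here is a minimal sketch, assuming a hypothetical CDN that keys its cache on the full URL including the query string. The four request URLs are the ones listed above; the SimpleUrlCache class and the fetchFromOrigin function are illustrative assumptions and do not model any real CDN's API.

```typescript
// Hypothetical model of a CDN that keys cached responses on the full URL
// (path + query string). For illustration only; not a real CDN API.
class SimpleUrlCache {
  private store = new Map<string, string>();
  hits = 0;
  misses = 0;

  // Returns the cached body, or asks the origin and caches the result.
  get(url: string, fetchBody: (url: string) => string): string {
    const cached = this.store.get(url);
    if (cached !== undefined) {
      this.hits++;
      return cached;
    }
    this.misses++;
    const body = fetchBody(url);
    this.store.set(url, body);
    return body;
  }
}

// Assumed origin behavior: the payload does not depend on the _rsc value.
const fetchFromOrigin = (url: string): string =>
  `payload for ${url.split("?")[0]}`;

const cdn = new SimpleUrlCache();
const rscRequests = [
  "product/299336/?_rsc=1vl30",
  "product/299336/?_rsc=qe3go",
  "product/299336/?_rsc=1vg99",
  "product/299336/?_rsc=1stsw",
];
const bodies = rscRequests.map((url) => cdn.get(url, fetchFromOrigin));

// Every body is identical, yet every request went back to the origin.
const distinctBodies = new Set(bodies).size;
console.log({ hits: cdn.hits, misses: cdn.misses, distinctBodies });
```

Under these assumptions the cache stores four copies of one payload and serves none of the requests from cache.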

This issue has been raised in the following discussion thread: #59167

NEXT-3327

@dankain dankain added the bug Issue was opened via the bug report template. label May 3, 2024
@github-actions github-actions bot added the Performance Anything with regards to Next.js performance. label May 3, 2024
@dankain dankain changed the title RSC and CDN cache makes the next.js inefficient for the highload projects RSC and CDN interaction makes the next.js inefficient for highload projects May 3, 2024
@dankain dankain changed the title RSC and CDN interaction makes the next.js inefficient for highload projects RSC and CDN interaction makes next.js inefficient for highload projects May 3, 2024
@github-actions github-actions bot added the linear: next Confirmed issue that is tracked by the Next.js team. label May 6, 2024
@samcx
Member

samcx commented May 6, 2024

@dankain This is what I get on the latest canary (after doing these exact steps).

  1. Load in next start
  2. Client-side navigate to category 1
  3. Navigate to /category/2

Can you confirm if you are seeing the same on the latest canary? →

CleanShot 2024-05-06 at 13 41 09@2x

@dankain
Author

dankain commented May 7, 2024

Yes, I have tried the latest canary and it is the same. You need to go to category 1 and then category 2. In the screenshot above I only see the navigation to category 2. If you go to both categories, you will see two different rsc requests for the same product data.

@dankain
Author

dankain commented May 7, 2024

Hi @samcx I have created a video to explain

RSC.CDN.GH65335.mp4

@RedVelocity

This is causing issues for me as well; the random rsc string at the end always results in a CF CDN cache MISS.
image

@ztanner
Member

ztanner commented May 20, 2024

> I'm trying to understand why the rsc has is different when it is always returning identical data?

> In this case the Next-Router-State-Tree and Next-Url will be set on where you are coming from and do not necessarily have an impact on the data need for the page we are going to. The Vary header will again have an impact on the CDN

For some context, the ?_rsc hash is meant to mirror the Vary header. It was added because there are CDNs that don't honor the Vary header (example). If these headers don't change, the query parameter will be the same, signaling that given the same request headers, the response will be the same.

The reason the Vary header exists is because these headers actually can change the response from the server. For example:

  • interception routes rely on the Next-URL header to decide if the request should respond differently (and return RSC data for the intercepting component).
  • partial rendering relies on the next-router-state-tree header from the client to decide if the server should respond with the shared layout data, or only the page data, to avoid over-fetching and improve performance.
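
As a rough illustration of a query value that mirrors the Vary header, the sketch below derives a short string from the values of the varying request headers. The hash (a djb2-style string hash rendered in base 36), the helper name rscParamSketch, and the placeholder header values are assumptions for illustration and are not taken from the Next.js source; the header names are the ones in the Vary header quoted earlier in this thread.

```typescript
// Request headers the response varies on (from the Vary header in this thread).
const VARY_HEADERS = [
  "RSC",
  "Next-Router-State-Tree",
  "Next-Router-Prefetch",
  "Next-Url",
] as const;

type RequestHeaders = Partial<Record<(typeof VARY_HEADERS)[number], string>>;

// djb2-style string hash rendered in base 36. An assumed stand-in hash,
// chosen only because it is short and deterministic.
function hashString(input: string): string {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    // hash * 33 + char code, kept as an unsigned 32-bit integer
    hash = ((hash << 5) + hash + input.charCodeAt(i)) >>> 0;
  }
  return hash.toString(36);
}

// Hypothetical helper: same header values in, same query value out.
function rscParamSketch(headers: RequestHeaders): string {
  const joined = VARY_HEADERS.map((name) => headers[name] ?? "0").join(",");
  return hashString(joined);
}

// Placeholder header values; real state trees are much larger.
const fromCategory1 = rscParamSketch({
  RSC: "1",
  "Next-Router-State-Tree": "placeholder-tree-category-1",
  "Next-Url": "/category/1",
});
const fromCategory1Again = rscParamSketch({
  RSC: "1",
  "Next-Router-State-Tree": "placeholder-tree-category-1",
  "Next-Url": "/category/1",
});
const fromCategory2 = rscParamSketch({
  RSC: "1",
  "Next-Router-State-Tree": "placeholder-tree-category-2",
  "Next-Url": "/category/2",
});

console.log(fromCategory1 === fromCategory1Again); // same headers, same value
console.log(fromCategory1 === fromCategory2); // different origin page, different value
```

The point of the sketch is that the value depends only on the request headers, so two navigations to the same product from different pages get different values even if the server would return the same data.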

This is obviously not a solution to the issue you're describing, but I wanted to provide some color as to why app router is doing this. If those responses are cached, and the RSC data for a page returns a tree that corresponds with the request from a different page, things will start behaving incorrectly. For example, see:

@dankain
Author

dankain commented May 21, 2024

Thanks @ztanner, that helps with the context. Partial rendering is a great concept, but if it means I can't use a CDN effectively, then that is an issue.

The problem with next-router-state-tree is that only the server can interpret it. The state trees for each of my categories and products are all different; only the server knows that they share the same layout, so the transition from category 1 to product 1 or from category 2 to product 1 would need identical data, even with partial rendering. Would it be possible to have the same Vary header and RSC hash when the data is the same? Would that require the client to understand the layouts?

If this is not possible do you have a suggestion on how to deploy next for a global site? Our site is hosted in Europe, but has southern hemisphere customers. Do I now need to somehow push the page building, HTML cache and data cache to an edge location?

@wit221

wit221 commented Jun 30, 2024

Experiencing the same issue, wherein identical RSCs produce different hashes when being Link-ed to from two different paths.

Without diving into the details of ?_rsc implementation and limitations, it sounds like a core blocker to achieving the above goal is the fact that both the RSC data and the RSC tree layout information are coupled together within one ?_rsc payload?

If so, is there a world where we decouple them and have, say, a ?_rsc_data and a ?_rsc_tree payload?

On a high level:

  • Category 1, Product 1:
    ?_rsc_data=hash_a
    ?_rsc_tree=hash_b

  • Category 2, Product 1:
    ?_rsc_data=hash_a
    ?_rsc_tree=hash_c

The two paths fetch the same ?_rsc_data=hash_a payload (which can be cached across paths), but they still fetch different ?_rsc_tree=hash_{b|c} payloads (which can be cached per path), since the tree info may differ for each path.

The goal would be to achieve caching of the data part of the _rsc payload across all paths (it's the more expensive one, presumably), and then the tree part of the _rsc payload can be cached per path.
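
The proposal above can be sketched as two separate cache keys per navigation. This is only a model of the idea in this comment: the _rsc_data and _rsc_tree parameters do not exist in Next.js, and the /product/1 path, the SplitRscRequest type, and the key functions are hypothetical names introduced for illustration.

```typescript
// Hypothetical request shape for the proposal: two separately cacheable parts.
interface SplitRscRequest {
  path: string;
  rscData: string; // value of the proposed ?_rsc_data parameter
  rscTree: string; // value of the proposed ?_rsc_tree parameter
}

// The data key ignores the tree hash; the tree key ignores the data hash.
const dataKey = (r: SplitRscRequest): string =>
  `${r.path}?_rsc_data=${r.rscData}`;
const treeKey = (r: SplitRscRequest): string =>
  `${r.path}?_rsc_tree=${r.rscTree}`;

const splitRequests: SplitRscRequest[] = [
  { path: "/product/1", rscData: "hash_a", rscTree: "hash_b" }, // from category 1
  { path: "/product/1", rscData: "hash_a", rscTree: "hash_c" }, // from category 2
];

// Count how many distinct cache entries each part would need.
const dataEntries = new Set(splitRequests.map(dataKey)).size;
const treeEntries = new Set(splitRequests.map(treeKey)).size;
console.log({ dataEntries, treeEntries });
```

In this model the data payload needs one cache entry shared by both paths, while the tree payload needs one entry per originating path.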


@Systemcluster

> the ?_rsc hash is meant to mirror the Vary header. It was added because there are CDNs that don't honor the Vary header

As a cache-busting workaround for a subset of broken third parties, I don't see a good reason not to offer an option to disable it.

@guillaume-fr

With proper CDN/proxy configuration, one should already be able to ignore ?_rsc in the CDN cache key and/or include the relevant headers. Unfortunately, you still have the same cache dilution issue, or you break everything if you neither include _rsc in the cache key nor honor the Vary header. To actually fix this issue, the framework must allow a good cache hit ratio on external caches.
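
A minimal sketch of that kind of cache-key configuration, written as a plain function rather than in any particular CDN's configuration language. The function name cacheKeySketch, the key format, and the example.com base URL are assumptions for illustration; the header names are the ones in the Vary header quoted earlier in this thread.

```typescript
// Headers named in the Vary header from this thread (lowercased for lookup).
const VARY_ON = [
  "rsc",
  "next-router-state-tree",
  "next-router-prefetch",
  "next-url",
];

// Hypothetical cache-key function for a CDN or proxy: drops the _rsc query
// parameter and folds the varying request headers into the key instead.
function cacheKeySketch(
  rawUrl: string,
  headers: Record<string, string>,
): string {
  const url = new URL(rawUrl, "https://example.com"); // placeholder base
  url.searchParams.delete("_rsc");
  url.searchParams.sort(); // stable order for any remaining parameters
  const query = url.searchParams.toString();
  const headerPart = VARY_ON.map(
    (name) => `${name}=${headers[name] ?? ""}`,
  ).join("|");
  return `${url.pathname}${query ? `?${query}` : ""}#${headerPart}`;
}

const fromCat1 = { rsc: "1", "next-url": "/category/1" };
const fromCat2 = { rsc: "1", "next-url": "/category/2" };

const keyA = cacheKeySketch("product/299336/?_rsc=1vl30", fromCat1);
const keyB = cacheKeySketch("product/299336/?_rsc=qe3go", fromCat1);
const keyC = cacheKeySketch("product/299336/?_rsc=qe3go", fromCat2);

console.log(keyA === keyB); // _rsc no longer affects the key
console.log(keyA === keyC); // the headers still split the cache
```

The last comparison shows the dilution described above: once the headers are part of the key, requests coming from different pages still occupy separate cache entries.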

One option is changing the request flow and cache header logic (Vary...) to allow a decent cache hit ratio by default on CDNs, proxies and browsers. Another is documenting how cache processing and response construction depend on Next-URL, next-router-state-tree... so that a third party can re-implement that custom logic. If data is required for that processing, Next.js should provide a convenient way to export it (API, build-time output...).

@fmnxl

fmnxl commented Oct 15, 2024

If I understand correctly, RSC payload not being cached by the CDN nor by Next's caching system means interception routes (and parallel routes) are broken after deployments unless the whole cache is cleared.

I implemented a modal with interception + parallel route, which breaks down into a full page load because the RSC payload hash has changed, and the old hash is no longer served by the new build.

To keep the modal working, I have been clearing the (CDN) cache after each deployment.

@rselimi

rselimi commented Dec 3, 2024

Hi @samcx, I'm trying to get an understanding of where you are on this. Could you please give me an update? I'm trying to understand whether the app router approach is suitable at all, or if it's limited to being used without a CDN. This will help me decide whether I should move to (or stick with) the app router or carry on with the pages router. Thanks.

@camjackson-bigwx

camjackson-bigwx commented Jan 10, 2025

I'm in the same boat I think. Running a high traffic, self-hosted ecommerce site with 100ks of products which are linked to from many different places. CDN caching is crucial - hitting the next.js server should be an absolute last resort.

If I'm understanding things correctly, it seems like the app router essentially breaks all CDN caching because the RSC request/response for a page is slightly different depending on where you're coming from. I'm still processing all the info here but I'm starting to think I will have to stick with the pages router.

@ztanner
Member

ztanner commented Jan 10, 2025

Hey folks - we're working on a router refactor that aims to address this issue as well as others related to the current router prefetching behavior. The new implementation will no longer require the ?_rsc query parameter. Andrew shared a bit more about what he's working on here.

We'll share more as we get further along; just know that this is actively being worked on with high priority. We are aiming to improve the cacheability of these responses and to improve the transport layer so the router only needs to request information it doesn't already have.

