[CLEAN] Synthetic Benchmark PR #25719 - Added materialized view and duplicated v2 Tinybird endpoints #231
Open
tomerqodo wants to merge 12 commits into base_pr_25719_20260121_9919 from
…re robust polling approach
Benchmark PR TryGhost#25719
Type: Clean (correct implementation)
Original PR Title: Added materialized view and duplicated v2 Tinybird endpoints
Original PR Description: ref https://linear.app/ghost/issue/NY-865/analytics-sources-not-populating-for-tangle-due-to-408-timeouts
Problem
Tinybird endpoints are timing out in production for the sites with the most data. The mv_session_data pipe, which almost all our endpoints depend on, calculates complex aggregations at query time, which we believe to be the biggest contributor to poor performance when querying against large data sets.
Fix
The fix is to convert mv_session_data from a pipe to a materialized view, so that Tinybird calculates these aggregations at ingest time instead of at query time. As this is a rather large change to the implementation, we've opted to create a duplicate v2 pipeline rather than updating the existing pipeline in place. This way we can validate the v2 pipeline in production against real production data, without changing any user-facing behavior, before cutting over to the v2 pipeline.
Changes made
mv_session_data
Creating a duplicate pipeline makes it easier to validate this in production, but it does make reviewing these changes more difficult, since the git diff doesn't clearly show what has changed in each endpoint. I've added a comment to each file that shows the diff of the v2 endpoint against the original unversioned endpoint to make reviewing this PR easier, but you'll have to run those commands locally with this branch checked out to see the changes.
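For reviewers less familiar with the distinction drawn in the Fix section, here is a toy TypeScript sketch of query-time versus ingest-time aggregation. It is not Tinybird's implementation and does not reflect the real mv_session_data schema: the Hit and SessionAggregate shapes, the field names, and the IngestTimeView class are all hypothetical, chosen only to show where the aggregation work happens.

```typescript
// Toy model only: hypothetical types, not the real mv_session_data schema.
interface Hit {
  sessionId: string;
  timestamp: number; // epoch milliseconds
}

interface SessionAggregate {
  pageviews: number;
  firstSeen: number;
  lastSeen: number;
}

type SessionTable = Record<string, SessionAggregate | undefined>;

// Shared aggregation step: fold one hit into the per-session state.
function applyHit(sessions: SessionTable, hit: Hit): void {
  const current = sessions[hit.sessionId];
  if (current === undefined) {
    sessions[hit.sessionId] = {
      pageviews: 1,
      firstSeen: hit.timestamp,
      lastSeen: hit.timestamp,
    };
    return;
  }
  current.pageviews += 1;
  current.firstSeen = Math.min(current.firstSeen, hit.timestamp);
  current.lastSeen = Math.max(current.lastSeen, hit.timestamp);
}

// Query-time approach (analogous to a pipe): every query rescans all raw hits.
function aggregateAtQueryTime(hits: Hit[]): SessionTable {
  const sessions: SessionTable = {};
  for (const hit of hits) {
    applyHit(sessions, hit);
  }
  return sessions;
}

// Ingest-time approach (analogous to a materialized view): the aggregate is
// updated once per incoming hit, so a query is a lookup, not a rescan.
class IngestTimeView {
  private readonly sessions: SessionTable = {};

  ingest(hit: Hit): void {
    applyHit(this.sessions, hit);
  }

  query(sessionId: string): SessionAggregate | undefined {
    return this.sessions[sessionId];
  }
}
```

Both paths produce the same aggregates. The difference is that the query-time cost grows with the number of raw hits, while the ingest-time variant pays a small cost per hit up front, which is the trade-off this PR relies on for large sites.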
Testing Ghost against v2 endpoints
This commit adds the ability to point Ghost to the v2 version of the endpoints. To use this, set tinybird:stats:version to v2 in your config.local.json, then re-run yarn dev:analytics.
Original PR URL: TryGhost#25719
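As a rough illustration of the tinybird:stats:version setting in the testing instructions, the sketch below assumes that each colon-separated segment of the key names one level of nesting in config.local.json. The PR description does not show the actual file layout, so treat that mapping as an assumption; setByColonPath is a hypothetical helper written for this example, not a Ghost API.

```typescript
// Hypothetical helper: expands a colon-delimited key into nested objects.
// Assumes each ":" segment is one nesting level in the JSON config.
type ConfigValue = string | number | boolean | ConfigObject;
interface ConfigObject {
  [key: string]: ConfigValue;
}

function setByColonPath(
  config: ConfigObject,
  path: string,
  value: ConfigValue
): ConfigObject {
  const keys = path.split(":");
  const leaf = keys.pop();
  if (leaf === undefined) {
    throw new Error("config path must not be empty");
  }
  let node: ConfigObject = config;
  for (const key of keys) {
    const existing = node[key];
    if (typeof existing === "object" && existing !== null) {
      // Reuse an existing nested section so sibling settings are kept.
      node = existing;
    } else {
      const child: ConfigObject = {};
      node[key] = child;
      node = child;
    }
  }
  node[leaf] = value;
  return config;
}

const localConfig = setByColonPath({}, "tinybird:stats:version", "v2");
console.log(JSON.stringify(localConfig, null, 2));
```

Under that assumption, the printed object nests version inside stats inside tinybird, and any other keys already present in those sections are left untouched.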