Forum Loader Optimization #45
Comments
I'd just like to say that a lot of that slowdown only came after I moved the pfp loading from the client to the server for some reason :P I'll make a commit (basically) reverting that |
It's probably because the posts are loaded synchronously, so you have to wait about 1-3 seconds for each post to load, and there are about 10 posts per topic... We need to make it asynchronous. (Note: I haven't worked with anything asynchronous in my life, but it is theoretically "faster".) |
I totally forgot |
Who says a human has to do it? Jokes aside, I think making something asynchronous means we need to make the whole project asynchronous. |
That's actually not true. If I remember correctly, the way asyncio works, you can have some parts of your project async and the other synchronous. |
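A minimal sketch of that idea, assuming a hypothetical blocking `fetch_post` helper (simulated here with `time.sleep`; it is not Snazzle's real code). A regular synchronous function calls `asyncio.run` to drive only the part that benefits from concurrency, and `asyncio.to_thread` (Python 3.9+) lets the existing blocking code run without being rewritten:

```python
import asyncio
import time


def fetch_post(post_id):
    """Hypothetical blocking fetch; the sleep stands in for a slow HTTP request."""
    time.sleep(0.1)
    return {"id": post_id, "content": f"post {post_id}"}


async def fetch_posts_async(post_ids):
    # Run each blocking call in a worker thread and wait for all of them together.
    tasks = [asyncio.to_thread(fetch_post, pid) for pid in post_ids]
    # gather() returns results in the same order as the tasks it was given.
    return await asyncio.gather(*tasks)


def load_topic(post_ids):
    """Plain synchronous function, e.g. something a Flask view could call."""
    return asyncio.run(fetch_posts_async(post_ids))


if __name__ == "__main__":
    start = time.perf_counter()
    posts = load_topic(range(10))
    print(len(posts), "posts loaded in", round(time.perf_counter() - start, 2), "s")
```

Only `load_topic` touches asyncio here; the rest of the project can stay synchronous.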
I think the issue comes from having to write every single API response to a cache directory (just in case scratchdb goes down), even if the same response already exists. Asynchronous code doesn't make I/O work faster. I think we could move away from ScratchDB to Scratch's RSS data (the button that looks like a signal). It's more reliable and has newer data. We should use lru_cache for it though |
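A rough sketch of both ideas, with everything here being an assumption for illustration (the `write_cache_if_changed` and `get_topic` names, the cache layout, and the JSON format are hypothetical). The first function skips the disk write when the same response was already written, and `functools.lru_cache` keeps repeated lookups in memory:

```python
import functools
import hashlib
import json
import tempfile
from pathlib import Path

_last_written = {}  # cache key -> hash of the payload most recently written


def write_cache_if_changed(cache_dir, key, response):
    """Write the response to disk only if it differs from the last write for this key."""
    payload = json.dumps(response, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    if _last_written.get(key) == digest:
        return False  # identical response already cached, skip the write
    cache_dir.mkdir(parents=True, exist_ok=True)
    filename = hashlib.sha256(key.encode("utf-8")).hexdigest() + ".json"
    (cache_dir / filename).write_text(payload, encoding="utf-8")
    _last_written[key] = digest
    return True


@functools.lru_cache(maxsize=128)
def get_topic(topic_id):
    """Hypothetical fetch; a real version would make the HTTP request here."""
    return json.dumps({"topic": topic_id})


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp())
    print(write_cache_if_changed(demo_dir, "topic/1", {"title": "hello"}))
    print(write_cache_if_changed(demo_dir, "topic/1", {"title": "hello"}))
```

Note that `lru_cache` never expires entries on its own, so fresh data would need an explicit `get_topic.cache_clear()` or a time-based key.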
I'm going to test this, actually, by removing the |
Yep, it's probably more reliable.
The current code loads posts so that the server has to wait for the previous posts to load before loading the next one. |
I found that if you run EDIT:
I will try working on this |
We could also try using Cython, which will compile Python to C which is much faster |
What exactly would we use Cython for? |
After thinking about it, I think we'd need to convert all of Flask to use Cython, so it's probably better to optimize our existing code. My initial thought was that our code would be converted to C and compiled, so it would be faster. Correct me if I'm wrong, but to make our code run in C we'd have to do that explicitly, which requires special syntax. Most people who would contribute to Snazzle probably don't know this syntax, so the project would become harder to develop for. Also, I somehow confused the capabilities of Cython with those of PyPy at first. Finally, we could also add mypy for type checking, which would make our code more type-safe. |
Cython does not make code faster in all cases. It's typically used more for heavy math/statistics computing (such as numpy and pandas)
That, and also that you would need to install a C compiler, which would be Visual Studio on Windows 😭 |
I'm going to use |
RSS only contains the most recent posts, so we can't show all posts from it unfortunately. I don't know what else to use if we want this to be reliable |
We could get data from ScratchDB and then use RSS to top it up with data that ScratchDB hasn't indexed yet. If there's a ScratchDB outage we'll display an alert to the user that all older posts won't be visible until ScratchDB comes back online. |
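One way the top-up could look, as a sketch only. The `merge_posts` name and the post dictionaries are hypothetical, and it assumes every post carries a numeric `id` that grows over time:

```python
def merge_posts(scratchdb_posts, rss_posts):
    """Combine indexed posts with newer RSS posts, de-duplicated by post id.

    scratchdb_posts is None when ScratchDB could not be reached.
    Returns (posts, outage); outage tells the UI to show the alert.
    """
    outage = scratchdb_posts is None
    merged = {post["id"]: post for post in (scratchdb_posts or [])}
    for post in rss_posts:
        # Keep the ScratchDB copy when both sources have the same post.
        merged.setdefault(post["id"], post)
    posts = sorted(merged.values(), key=lambda p: p["id"])
    return posts, outage


if __name__ == "__main__":
    indexed = [{"id": 1, "source": "scratchdb"}, {"id": 2, "source": "scratchdb"}]
    recent = [{"id": 2, "source": "rss"}, {"id": 3, "source": "rss"}]
    print(merge_posts(indexed, recent))
    print(merge_posts(None, recent))
```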
This is basically a non-issue with the new Svelte port. However, before we discontinue the legacy codebase, I think it would be worthwhile to refine it a bit. |
What if the pages are loaded at the same time, but the posts in those pages are loaded one by one? This would mean that once one page is loaded, every other page is loaded too, with no further processing needed. By setting the post count per page to 20, we just need to load 20 posts at the same time as the others. So if a thread has 20 pages, it would first load the first post of every page, then the second, the third, and so on. We can do this by loading the posts by their ones digit: we start from 1, which loads the 1 from every page, then 2, then 3, and so on until 0 (0 comes last because each page ends with 0 in the ones digit). Or we can ditch this and just try to make loading parallel instead of serial (which is my approach). |
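A sketch of the "pages in parallel, posts within a page in order" variant, using a thread pool rather than the ones-digit scheme. `fetch_post` is a hypothetical stand-in that sleeps instead of making a request, and the worker count is an arbitrary assumption:

```python
import time
from concurrent.futures import ThreadPoolExecutor

POSTS_PER_PAGE = 20  # page size suggested in the comment above


def fetch_post(page, index):
    """Hypothetical blocking fetch of one post; the sleep simulates network latency."""
    time.sleep(0.01)
    return (page, index)


def load_page(page):
    # Posts inside a page are still loaded one by one, in order.
    return [fetch_post(page, i) for i in range(POSTS_PER_PAGE)]


def load_topic(page_count, max_workers=8):
    # Pages are loaded at the same time; map() yields results in page order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_page, range(page_count)))


if __name__ == "__main__":
    start = time.perf_counter()
    pages = load_topic(8)
    print(len(pages), "pages loaded in", round(time.perf_counter() - start, 2), "s")
```

Threads fit here because the work is waiting on the network, not computing, so the GIL is not the limiting factor.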
bump |
With the release of Snazzle Production Server, bjoern should speed up page loading, but the main bottleneck (when ScratchDB still worked) was getting post data from it. It seems that we just need to make as few HTTP requests as possible to make Snazzle more performant. |
It probably is faster (can't install snazzle 😭) |
It takes like 30 seconds to load one page of a topic
Maybe we should move from file-based archiving to something based in a DB (sqlite or supabase, depending on how my current PR goes)
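For the sqlite option, a minimal sketch using Python's built-in `sqlite3` module. The table name, schema, and helper names are assumptions for illustration, not an agreed design:

```python
import json
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) the archive; ':memory:' keeps this demo off the disk."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS responses ("
        "url TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def save_response(conn, url, data):
    # INSERT OR REPLACE keeps a single row per URL instead of one file per response.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO responses (url, body) VALUES (?, ?)",
            (url, json.dumps(data)),
        )


def load_response(conn, url):
    row = conn.execute(
        "SELECT body FROM responses WHERE url = ?", (url,)
    ).fetchone()
    return json.loads(row[0]) if row else None


if __name__ == "__main__":
    archive = open_archive()
    save_response(archive, "topic/1", {"title": "hello"})
    print(load_response(archive, "topic/1"))
```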