Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code change to wp_stream_get_sites causing very large memory usage on clusters with higher number of sites #1270

Open
AndrewJohnsonTR opened this issue Aug 3, 2021 · 2 comments

Comments

@AndrewJohnsonTR
Copy link

Delete the section that is not applicable:

Bug Report

Expected Behavior

class-connector-blogs->get_context_labels() gathers data from blogs on the multisite using a reasonable amount of memory

Actual Behavior

wp_stream_get_sites used to call get_sites() with no args, this defaulted to just 100 sites. This behavior was changed in the newest release. When get_context_labels() switches gets blog details for every site, it is loading the autoloaded option for every site on the cluster, although wp_is_large_network is checked, wp_is_large_network returns false on less than 10,000 sites. We have a multisite with about 2500 sites, and we are seeing memory usage of 500-700 Megabytes of memory. As you know, the default memory limit for PHP in WordPress is 256MB, we have upped the limit to 1GB, but even in that case we are sometimes seeing heartbeats timeout, because on post edit there is a 30 second time limit before a heartbeat is considered a timeout, and even with caching, 500MB takes a fair amount of time to load. Has this functionality been tested on large clusters? And is it necessary functionality? It was only doing the first 100 sites for a long time, so it seems maybe it is just an extra feature? If it is an extra feature is it possible to add another filter to cancel it so that wp_is_large_network doesn't have to be used?

@kasparsd
Copy link
Contributor

kasparsd commented Aug 3, 2021

Thanks for reporting the issue!

wp_stream_get_sites used to call get_sites() with no args, this defaulted to just 100 sites. This behavior was changed in the newest release.

I guess you're referring to #1258 which introduced the change, right?

@dd32 Are you seeing the same issue with high memory usage now that it queries for all sites in the network?

It appears that wp_stream_get_sites() is primary used to generate a dropdown selector of all blog names in the blogs connector:

// Add all sites.
foreach ( wp_stream_get_sites() as $blog ) {
$blog_data = get_blog_details( $blog->blog_id );
$blogs[ $blog->blog_id ] = array(
'label' => $blog_data->blogname,
'disabled' => '',
);
}

and the Mercator connector:

$blogs = wp_stream_get_sites();

I'm wondering why is it actually calling get_site() on each item returned by wp_stream_get_sites() because get_sites() already does that https://core.trac.wordpress.org/browser/tags/5.8/src/wp-includes/class-wp-site-query.php#L404

Do you know if get_site() called by get_sites() is what causes it to load all options for each site?

A simply fix would be to add a cache wrapper on the generated blog ID to blog name array.

@AndrewJohnsonTR
Copy link
Author

Thanks for reporting the issue!

wp_stream_get_sites used to call get_sites() with no args, this defaulted to just 100 sites. This behavior was changed in the newest release.

I guess you're referring to #1258 which introduced the change, right?

@dd32 Are you seeing the same issue with high memory usage now that it queries for all sites in the network?

It appears that wp_stream_get_sites() is primary used to generate a dropdown selector of all blog names in the blogs connector:

// Add all sites.
foreach ( wp_stream_get_sites() as $blog ) {
$blog_data = get_blog_details( $blog->blog_id );
$blogs[ $blog->blog_id ] = array(
'label' => $blog_data->blogname,
'disabled' => '',
);
}

and the Mercator connector:

$blogs = wp_stream_get_sites();

I'm wondering why is it actually calling get_site() on each item returned by wp_stream_get_sites() because get_sites() already does that https://core.trac.wordpress.org/browser/tags/5.8/src/wp-includes/class-wp-site-query.php#L404

Do you know if get_site() called by get_sites() is what causes it to load all options for each site?

A simply fix would be to add a cache wrapper on the generated blog ID to blog name array.

Thank you very much for the reply! I will add that we actually do use the Mercator plugin too, which is probably an important detail to add.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants