Code change to wp_stream_get_sites causing very large memory usage on clusters with higher number of sites #1270

AndrewJohnsonTR · 2021-08-03T13:40:27Z

Delete the section that is not applicable:

Bug Report

Expected Behavior

class-connector-blogs->get_context_labels() gathers data from blogs on the multisite using a reasonable amount of memory

Actual Behavior

wp_stream_get_sites used to call get_sites() with no args, this defaulted to just 100 sites. This behavior was changed in the newest release. When get_context_labels() switches gets blog details for every site, it is loading the autoloaded option for every site on the cluster, although wp_is_large_network is checked, wp_is_large_network returns false on less than 10,000 sites. We have a multisite with about 2500 sites, and we are seeing memory usage of 500-700 Megabytes of memory. As you know, the default memory limit for PHP in WordPress is 256MB, we have upped the limit to 1GB, but even in that case we are sometimes seeing heartbeats timeout, because on post edit there is a 30 second time limit before a heartbeat is considered a timeout, and even with caching, 500MB takes a fair amount of time to load. Has this functionality been tested on large clusters? And is it necessary functionality? It was only doing the first 100 sites for a long time, so it seems maybe it is just an extra feature? If it is an extra feature is it possible to add another filter to cancel it so that wp_is_large_network doesn't have to be used?

kasparsd · 2021-08-03T14:08:39Z

Thanks for reporting the issue!

wp_stream_get_sites used to call get_sites() with no args, this defaulted to just 100 sites. This behavior was changed in the newest release.

I guess you're referring to #1258 which introduced the change, right?

@dd32 Are you seeing the same issue with high memory usage now that it queries for all sites in the network?

It appears that wp_stream_get_sites() is primary used to generate a dropdown selector of all blog names in the blogs connector:

stream/classes/class-network.php

Lines 433 to 441 in 305583d

    
           // Add all sites. 
        
           foreach ( wp_stream_get_sites() as $blog ) { 
        
           	$blog_data = get_blog_details( $blog->blog_id ); 
        
           	$blogs[ $blog->blog_id ] = array( 
        
           		'label'    => $blog_data->blogname, 
        
           		'disabled' => '', 
        
           	); 
        
           }

and the Mercator connector:

stream/connectors/class-connector-mercator.php

Line 72 in 305583d

$blogs = wp_stream_get_sites();

I'm wondering why is it actually calling get_site() on each item returned by wp_stream_get_sites() because get_sites() already does that https://core.trac.wordpress.org/browser/tags/5.8/src/wp-includes/class-wp-site-query.php#L404

Do you know if get_site() called by get_sites() is what causes it to load all options for each site?

A simply fix would be to add a cache wrapper on the generated blog ID to blog name array.

AndrewJohnsonTR · 2021-08-04T01:07:13Z

Thanks for reporting the issue!

wp_stream_get_sites used to call get_sites() with no args, this defaulted to just 100 sites. This behavior was changed in the newest release.

I guess you're referring to #1258 which introduced the change, right?

@dd32 Are you seeing the same issue with high memory usage now that it queries for all sites in the network?

It appears that wp_stream_get_sites() is primary used to generate a dropdown selector of all blog names in the blogs connector:

stream/classes/class-network.php

Lines 433 to 441 in 305583d

// Add all sites.

foreach ( wp_stream_get_sites() as $blog ) {

$blog_data = get_blog_details( $blog->blog_id );

$blogs[ $blog->blog_id ] = array(

'label' => $blog_data->blogname,

'disabled' => '',

);

}

and the Mercator connector:

stream/connectors/class-connector-mercator.php

Line 72 in 305583d

$blogs = wp_stream_get_sites();

I'm wondering why is it actually calling get_site() on each item returned by wp_stream_get_sites() because get_sites() already does that https://core.trac.wordpress.org/browser/tags/5.8/src/wp-includes/class-wp-site-query.php#L404

Do you know if get_site() called by get_sites() is what causes it to load all options for each site?

A simply fix would be to add a cache wrapper on the generated blog ID to blog name array.

Thank you very much for the reply! I will add that we actually do use the Mercator plugin too, which is probably an important detail to add.

kasparsd added the performance label Aug 3, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code change to wp_stream_get_sites causing very large memory usage on clusters with higher number of sites #1270

Code change to wp_stream_get_sites causing very large memory usage on clusters with higher number of sites #1270

AndrewJohnsonTR commented Aug 3, 2021

kasparsd commented Aug 3, 2021

AndrewJohnsonTR commented Aug 4, 2021

Code change to wp_stream_get_sites causing very large memory usage on clusters with higher number of sites #1270

Code change to wp_stream_get_sites causing very large memory usage on clusters with higher number of sites #1270

Comments

AndrewJohnsonTR commented Aug 3, 2021

Bug Report

Expected Behavior

Actual Behavior

kasparsd commented Aug 3, 2021

AndrewJohnsonTR commented Aug 4, 2021