Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlock in SCSSCacher #21059

Closed
wants to merge 1 commit into from
Closed

Fix deadlock in SCSSCacher #21059

wants to merge 1 commit into from

Conversation

llebout
Copy link

@llebout llebout commented May 20, 2020

Fixes #15794 (comment)

Screenshot Capture - 2020-05-20 - 18-09-19


It seems to me that the lock needs to be released in this return path too, otherwise it would create a deadlock on every next run. Please double check for me, I'm wholly unfamiliar with the code base. Either way, it solves the slowness in the production deployment I maintain.

It would be great if this change could be backported to 18.x and previous releases.

Fixes nextcloud#15794 (comment)

Signed-off-by: Leo Le Bouter <lle-bout@zaclys.net>
@kesselb
Copy link
Contributor

kesselb commented May 20, 2020

It seems to me that the lock needs to be released in this return path too, otherwise it would create a deadlock on every next run.

I don't think so but that code is quite hard to understand. If you see SCSSCacher: Giving up scss caching for ... we never acquired a lock. But that's also the confusing part. I don't see where we try to acquire a lock a second or third time. There is only a check if the variables changed or the file is cache 😕

@llebout
Copy link
Author

llebout commented May 20, 2020

@kesselb Yes, indeed.

Either way this part is buggy so... I'm for recoding this clearly over-engineered piece of code.

@rullzer
Copy link
Member

rullzer commented May 22, 2020

O man that code... yeah it needs to die...

I'm currently a bit puzzled as to why the unlock there fixes things... as it should not be locked at that stage or?

@kesselb
Copy link
Contributor

kesselb commented May 22, 2020

I'm currently a bit puzzled

Me too. Isn't failed to compile and/or save ..../css/results.scss the actual problem?

I added the screenshot from your comment to the post.

@llebout
Copy link
Author

llebout commented May 23, 2020

@rullzer I think it's that if it gets to this point it's likely there's a deadlock and therefore this could release the buggy lock

@llebout
Copy link
Author

llebout commented May 23, 2020

I think this code is responsible for lots of increased latency on each page load because it gets called on every page (I think)

@kesselb
Copy link
Contributor

kesselb commented May 23, 2020

I think this code is responsible for lots of increased latency on each page load because it gets called on every page (I think)

Maybe. SCSSCacher is responsible to compile the scss and theming app customizations together. That should happen once and the result cached. It seems to be (from the screenshot) that this is actually broken (on your setup) and the compile and cache logic is executed for any request.

You are probably right the code is scary but I would try to fix the other issue (disable a custom theme, disable theming app, etc.) first.

@llebout
Copy link
Author

llebout commented May 24, 2020

@kesselb

Hey, well I don't have any theming app installed or any sort of visual customization. Does the Register app count as customization? It adds a Register button on the login page.

# sudo -u www-data php occ app:list
Enabled:
  - accessibility: 1.4.0
  - activity: 2.11.0
  - admin_audit: 1.8.0
  - bruteforcesettings: 1.6.0
  - calendar: 2.0.3
  - cloud_federation_api: 1.1.0
  - comments: 1.8.0
  - contacts: 3.3.0
  - cospend: 0.5.3
  - dav: 1.14.0
  - drawio: 0.9.5
  - federatedfilesharing: 1.8.0
  - federation: 1.8.0
  - files: 1.13.1
  - files_accesscontrol: 1.8.1
  - files_antivirus: 2.4.1
  - files_external: 1.9.0
  - files_mindmap: 0.0.21
  - files_pdfviewer: 1.7.0
  - files_retention: 1.7.0
  - files_rightclick: 0.15.2
  - files_sharing: 1.10.1
  - files_trashbin: 1.8.0
  - files_versions: 1.11.0
  - files_videoplayer: 1.7.0
  - firstrunwizard: 2.7.0
  - forms: 1.1.1
  - groupfolders: 6.0.6
  - keeporsweep: 0.2.1
  - logreader: 2.3.0
  - lookup_server_connector: 1.6.0
  - nextcloud_announcements: 1.7.0
  - notes: 3.3.1
  - notifications: 2.6.0
  - oauth2: 1.6.0
  - onlyoffice: 4.1.4
  - password_policy: 1.8.0
  - passwords: 2020.5.0
  - photos: 1.0.0
  - polls: 1.4.3
  - privacy: 1.2.0
  - provisioning_api: 1.8.0
  - recommendations: 0.6.0
  - registration: 0.4.7
  - serverinfo: 1.8.0
  - settings: 1.0.0
  - sharebymail: 1.8.0
  - spreed: 8.0.9
  - support: 1.1.0
  - survey_client: 1.6.0
  - systemtags: 1.8.0
  - tasks: 0.13.1
  - text: 2.0.0
  - theming: 1.9.0
  - twofactor_backupcodes: 1.7.0
  - twofactor_totp: 4.1.3
  - twofactor_u2f: 5.1.0
  - updatenotification: 1.8.0
  - viewer: 1.2.0
  - workflowengine: 2.0.0
Disabled:
  - deck
  - encryption
  - end_to_end_encryption
  - mail
  - passman
  - user_ldap

I have a really standard setup, no weird fuss. I use apache2 with php7.3-fpm, Ubuntu 18.04. I value long term maintenance and the least effort principle as a sysadmin. I have this instance running since a year or so without any Nextcloud related issue at all for that time, until now. This issue started happening out of nowhere, I assume it can be due to an extension updating, how could I figure out which one is it? I have some cron job to automatically update apps to their latest versions every now and then.

@kesselb
Copy link
Contributor

kesselb commented May 24, 2020

  • core/search/css/results.scss does exist?
  • Is the issue still there if you disable the theming app?

image

  • Are there rows for results.css in filecache table?
  • What memory cache is configured?

@kesselb
Copy link
Contributor

kesselb commented May 24, 2020

I'm able to reproduce the situation locally:

  • 'memcache.distributed' => '\OC\Memcache\Memcached',
  • Set a conditional break point: $file === 'core/search/css/results.scss' at
    $path = explode('/', $root . '/' . $file);
  • php occ maintenance:repair
  • Open Nextcloud in the browser (make sure xdebug is enabled)
  • See the first request for results.css
  • Continue the process until the lock is aquired
  • Open a second Nextcloud tab
  • See the second request for results.css
  • The second request now loops but !$this->variablesChanged() && $this->isCached($fileNameCSS, $app) is never true. There is also no other attempt to aquire a lock.
  1. I still think that something else is broken (because there is no cached css file)
  2. Yep. That code needs to die ;) I mean what is the worst thing that could happen? We generate the same asset twice and one overwrites the other? 🤔

This pull request "fixes" the symptoms and not the problem. Isn't there something else in your logs?

@llebout
Copy link
Author

llebout commented May 24, 2020

Here's my config (it's the one I put in my admin docs, it hasnt been changed manually ever since, but Nextcloud changed the version automatically for example, I also tried raising the log level to see more info but couldnt find anything interesting):

<?php
$CONFIG = array (
  'instanceid' => 'x',
  'passwordsalt' => 'x',
  'secret' => 'x',
  'trusted_domains' => 
  array (
    0 => 'cloud.example.local',
  ),
  'datadirectory' => '/var/www/nextcloud/data',
  'dbtype' => 'pgsql',
  'version' => '18.0.1.3',
  'overwrite.cli.url' => 'https://cloud.example.local/',
  'dbname' => 'nextcloud',
  'dbhost' => 'db.example.local',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'dbuser' => 'nextcloud',
  'dbpassword' => 'x',
  'installed' => true,
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'memcache.distributed' => '\\OC\\Memcache\\Redis',
  'memcache.locking' => '\\OC\\Memcache\\Redis',
  'redis' => 
  array (
    'host' => 'localhost',
    'port' => 6379,
  ),
  'htaccess.RewriteBase' => '/',
  'updater.release.channel' => 'stable',
  'maintenance' => false,
  'theme' => '',
  'loglevel' => 3,
  'mail_from_address' => 'noreply',
  'mail_smtpmode' => 'smtp',
  'mail_smtpport' => '25',
  'mail_sendmailmode' => 'smtp',
  'mail_domain' => 'smtp.example.local',
  'mail_smtphost' => 'localhost',
  'objectstore' => 
  array (
    'class' => '\\OC\\Files\\ObjectStore\\Swift',
    'arguments' => [
      'autocreate' => true,
      'user' => [
        'name' => 'REPLACE_USERNAME',
        'password' => 'REPLACE_PASSWORD',
        'domain' => [
          'name' => 'default',
        ],
      ],
      'scope' => [
        'project' => [
          'name' => 'REPLACE_PROJECT_NAME',
          'domain' => [
            'name' => 'default',
          ],
        ],
      ],
      'tenantName' => 'REPLACE_TENANT_NAME',
      'serviceName' => 'swift',
      'region' => 'GRA',
      'url' => 'https://auth.cloud.ovh.net/v3',
      'bucket' => 'nextcloud',
    ],
  ),
);

And I got nothing else in my logs

How can I query the filecache table?

The integrity checks pass and the scss files do exist

@llebout
Copy link
Author

llebout commented May 24, 2020

So I started reverting the temporary patch I applied to save my users a headache, disabled the theming app, did not experience the slowness, enabled it again, can't experience the slowness either. So it stopped reproducing for now, and nothing related in the logs either.. ugh - few extension updates have happened since I opened this PR, so it may be it.

@kesselb
Copy link
Contributor

kesselb commented May 24, 2020

You might use occ maintenance:repair to clear the scss cache (and probably trigger the problem again).

@llebout
Copy link
Author

llebout commented May 24, 2020

@kesselb

I did that and it did not make it happen again, however, it logged two errors, and threw HTTP 500 then I refreshed a few times more and it started working OK.

[objectstore] Error: OCP\Files\NotFoundException: object urn:oid:157195 not found in object store at <<closure>>

 0. /var/www/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php line 289
    OC\Files\ObjectStore\Swift->readObject("urn:oid:157195")
 1. /var/www/nextcloud/lib/private/Files/Storage/Common.php line 196
    OC\Files\ObjectStore\ObjectStoreStorage->fopen("appdata_octryvs ... e", "r")
 2. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Wrapper.php line 245
    OC\Files\Storage\Common->file_get_contents("appdata_octryvs ... e")
 3. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Availability.php line 264
    OC\Files\Storage\Wrapper\Wrapper->file_get_contents("appdata_octryvs ... e")
 4. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Wrapper.php line 245
    OC\Files\Storage\Wrapper\Availability->file_get_contents("appdata_octryvs ... e")
 5. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Wrapper.php line 245
    OC\Files\Storage\Wrapper\Wrapper->file_get_contents("appdata_octryvs ... e")
 6. /var/www/nextcloud/apps/files_accesscontrol/lib/StorageWrapper.php line 264
    OC\Files\Storage\Wrapper\Wrapper->file_get_contents("appdata_octryvs ... e")
 7. /var/www/nextcloud/lib/private/Files/View.php line 1162
    OCA\FilesAccessControl\StorageWrapper->file_get_contents("appdata_octryvs ... e")
 8. /var/www/nextcloud/lib/private/Files/View.php line 595
    OC\Files\View->basicOperation("file_get_contents", "/appdata_octryv ... e", ["read"])
 9. /var/www/nextcloud/lib/private/Files/Node/File.php line 56
    OC\Files\View->file_get_contents("/appdata_octryv ... e")
10. /var/www/nextcloud/lib/private/Files/SimpleFS/SimpleFile.php line 90
    OC\Files\Node\File->getContent()
11. /var/www/nextcloud/lib/private/Template/IconsCacher.php line 116
    OC\Files\SimpleFS\SimpleFile->getContent()
12. /var/www/nextcloud/lib/private/Template/SCSSCacher.php line 337
    OC\Template\IconsCacher->setIconsCss(":root{--color-m ... }")
13. /var/www/nextcloud/lib/private/Template/SCSSCacher.php line 182
    OC\Template\SCSSCacher->cache("/var/www/nextcloud/core/css", "7681-a49a-css-variables.css", "css-variables.scss", OC\Files\SimpleFS\SimpleFolder {}, "/core/css")
14. /var/www/nextcloud/lib/private/Template/CSSResourceLocator.php line 110
    OC\Template\SCSSCacher->process("/var/www/nextcloud", "core/css/css-variables.scss", "core")
15. /var/www/nextcloud/lib/private/Template/CSSResourceLocator.php line 62
    OC\Template\CSSResourceLocator->cacheAndAppendScssIfExist("/var/www/nextcloud", "core/css/css-variables.scss")
16. /var/www/nextcloud/lib/private/Template/ResourceLocator.php line 78
    OC\Template\CSSResourceLocator->doFind("css/css-variables")
17. /var/www/nextcloud/lib/private/TemplateLayout.php line 310
    OC\Template\ResourceLocator->find(["css/server","c ... "])
18. /var/www/nextcloud/lib/private/TemplateLayout.php line 215
    OC\TemplateLayout::findStylesheetFiles(["css/server","c ... "])
19. /var/www/nextcloud/lib/private/legacy/template.php line 184
    OC\TemplateLayout->__construct("user", "files")
20. /var/www/nextcloud/lib/public/AppFramework/Http/TemplateResponse.php line 167
    OC_Template->fetchPage({usedSpacePercen ... ]})
21. /var/www/nextcloud/lib/private/AppFramework/Http/Dispatcher.php line 123
    OCP\AppFramework\Http\TemplateResponse->render()
22. /var/www/nextcloud/lib/private/AppFramework/App.php line 125
    OC\AppFramework\Http\Dispatcher->dispatch(OCA\Files\Controller\ViewController {}, "index")
23. /var/www/nextcloud/lib/private/AppFramework/Routing/RouteActionHandler.php line 47
    OC\AppFramework\App::main("OCA\\Files\\Controller\\ViewController", "index", OC\AppFramework\ ... {}, {_route: "files.view.index"})
24. <<closure>>
    OC\AppFramework\Routing\RouteActionHandler->__invoke({_route: "files.view.index"})
25. /var/www/nextcloud/lib/private/Route/Router.php line 299
    call_user_func(OC\AppFramework\ ... {}, {_route: "files.view.index"})
26. /var/www/nextcloud/lib/base.php line 1008
    OC\Route\Router->match("/apps/files/")
27. /var/www/nextcloud/index.php line 38
    OC::handleRequest()

GET /apps/files/
from snip by admin at 2020-05-24T18:25:35+00:00

[index] Error: OCP\Files\NotFoundException: object urn:oid:157195 not found in object store at <<closure>>

 0. /var/www/nextcloud/lib/private/Files/ObjectStore/ObjectStoreStorage.php line 289
    OC\Files\ObjectStore\Swift->readObject("urn:oid:157195")
 1. /var/www/nextcloud/lib/private/Files/Storage/Common.php line 196
    OC\Files\ObjectStore\ObjectStoreStorage->fopen("appdata_octryvs ... e", "r")
 2. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Wrapper.php line 245
    OC\Files\Storage\Common->file_get_contents("appdata_octryvs ... e")
 3. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Availability.php line 264
    OC\Files\Storage\Wrapper\Wrapper->file_get_contents("appdata_octryvs ... e")
 4. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Wrapper.php line 245
    OC\Files\Storage\Wrapper\Availability->file_get_contents("appdata_octryvs ... e")
 5. /var/www/nextcloud/lib/private/Files/Storage/Wrapper/Wrapper.php line 245
    OC\Files\Storage\Wrapper\Wrapper->file_get_contents("appdata_octryvs ... e")
 6. /var/www/nextcloud/apps/files_accesscontrol/lib/StorageWrapper.php line 264
    OC\Files\Storage\Wrapper\Wrapper->file_get_contents("appdata_octryvs ... e")
 7. /var/www/nextcloud/lib/private/Files/View.php line 1162
    OCA\FilesAccessControl\StorageWrapper->file_get_contents("appdata_octryvs ... e")
 8. /var/www/nextcloud/lib/private/Files/View.php line 595
    OC\Files\View->basicOperation("file_get_contents", "/appdata_octryv ... e", ["read"])
 9. /var/www/nextcloud/lib/private/Files/Node/File.php line 56
    OC\Files\View->file_get_contents("/appdata_octryv ... e")
10. /var/www/nextcloud/lib/private/Files/SimpleFS/SimpleFile.php line 90
    OC\Files\Node\File->getContent()
11. /var/www/nextcloud/lib/private/Template/IconsCacher.php line 116
    OC\Files\SimpleFS\SimpleFile->getContent()
12. /var/www/nextcloud/lib/private/Template/SCSSCacher.php line 337
    OC\Template\IconsCacher->setIconsCss(":root{--color-m ... }")
13. /var/www/nextcloud/lib/private/Template/SCSSCacher.php line 182
    OC\Template\SCSSCacher->cache("/var/www/nextcloud/core/css", "7681-a49a-css-variables.css", "css-variables.scss", OC\Files\SimpleFS\SimpleFolder {}, "/core/css")
14. /var/www/nextcloud/lib/private/Template/CSSResourceLocator.php line 110
    OC\Template\SCSSCacher->process("/var/www/nextcloud", "core/css/css-variables.scss", "core")
15. /var/www/nextcloud/lib/private/Template/CSSResourceLocator.php line 62
    OC\Template\CSSResourceLocator->cacheAndAppendScssIfExist("/var/www/nextcloud", "core/css/css-variables.scss")
16. /var/www/nextcloud/lib/private/Template/ResourceLocator.php line 78
    OC\Template\CSSResourceLocator->doFind("css/css-variables")
17. /var/www/nextcloud/lib/private/TemplateLayout.php line 310
    OC\Template\ResourceLocator->find(["css/server","c ... "])
18. /var/www/nextcloud/lib/private/TemplateLayout.php line 215
    OC\TemplateLayout::findStylesheetFiles(["css/server","c ... "])
19. /var/www/nextcloud/lib/private/legacy/template.php line 184
    OC\TemplateLayout->__construct("user", "files")
20. /var/www/nextcloud/lib/public/AppFramework/Http/TemplateResponse.php line 167
    OC_Template->fetchPage({usedSpacePercen ... ]})
21. /var/www/nextcloud/lib/private/AppFramework/Http/Dispatcher.php line 123
    OCP\AppFramework\Http\TemplateResponse->render()
22. /var/www/nextcloud/lib/private/AppFramework/App.php line 125
    OC\AppFramework\Http\Dispatcher->dispatch(OCA\Files\Controller\ViewController {}, "index")
23. /var/www/nextcloud/lib/private/AppFramework/Routing/RouteActionHandler.php line 47
    OC\AppFramework\App::main("OCA\\Files\\Controller\\ViewController", "index", OC\AppFramework\ ... {}, {_route: "files.view.index"})
24. <<closure>>
    OC\AppFramework\Routing\RouteActionHandler->__invoke({_route: "files.view.index"})
25. /var/www/nextcloud/lib/private/Route/Router.php line 299
    call_user_func(OC\AppFramework\ ... {}, {_route: "files.view.index"})
26. /var/www/nextcloud/lib/base.php line 1008
    OC\Route\Router->match("/apps/files/")
27. /var/www/nextcloud/index.php line 38
    OC::handleRequest()

GET /apps/files/
from snip by admin at 2020-05-24T18:25:35+00:00

@llebout
Copy link
Author

llebout commented May 26, 2020

Issue reproduced again today somehow

@@ -175,6 +175,10 @@ public function process(string $root, string $file, string $app): bool {
$retry++;
}
$this->logger->debug('SCSSCacher: Giving up scss caching for '.$lockKey, ['app' => 'core']);

// Cleaning lock
$this->lockingCache->remove($lockKey);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this process basically never locked, so why should it remove?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nickvergessen It's being dead-locked somewhere but I don't know where.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but if multiple requests are running parallel,
one would unlock for the other?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nickvergessen Yes, I do not understand the current code fully, I think it's acknowledged it's not coherent, I have applied that patch in a live environment that fixed the symptoms but not the root cause. I will free time at some point to investigate this further if no one has done it yet..

@MorrisJobke
Copy link
Member

@juliushaertl and I looked into this and #23478 should fix the root cause for the issue. It would then allow to use the newly cached files and properly return.

@MorrisJobke
Copy link
Member

Let's close this here and continue in the linked PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants