ZIP extraction helper for Joomla Update #35388

Merged
merged 60 commits into from
Sep 18, 2021

@nikosdion nikosdion commented Aug 26, 2021

Pinging @wilsonge @PhilETaylor @zero-24 Here's something fun for the upcoming weekend!

Summary

This PR modernises and simplifies the server- and client-side code for Joomla Update when applying the update (extracting the update Joomla ZIP file and running the update finalisation code). It also makes the code far more manageable so you can avoid problems like what you had in Joomla 4.0.1.

The following changes have been made with regards to Joomla Update:

  • Replaced restore.php (Akeeba Restore) with a custom extract.php which works similarly but is easier to maintain.
  • Improved security in extract.php.
  • Simplified the JavaScript handling the update. Also converted to vanilla ES6, removing the unnecessary dependency on jQuery.
  • OPcache is reset per .php file written or deleted using the same code introduced in the CMS' File class.

I explain each item individually below.

Test instructions

I have taken care so that this update works when updating from a version of Joomla that contains Akeeba Restore (restore.php) to one that doesn't, as well as updating between versions of Joomla which only use extract.php.

First, build an update package. Assuming the branch is called feature/jupdate-new-restore you need to do the following:

npm ci
npm run build:js
cd build
php build.php --remote=feature/jupdate-new-restore --exclude-gzip --exclude-bzip2

IMPORTANT: Joomla's build.php script only works on Linux, macOS and other UNIX systems since it goes through the shell to use standard system tools such as find, git etc. If you are on Windows this has to be run under WSL, MSysGit32 or a similar environment which provides all the *NIX tools used by build.php. I didn't make it this way. I didn't even touch it. That's how I found it!

You will need the generated file build/tmp/packages/Joomla_4.0.3-dev-Development-Update_Package.zip

You will also need a Joomla 4.0.2 site.

Test 1: Old to New

In this test you will confirm that the ‘old’ Joomla Update extraction method with Akeeba Restore still works when updating to a newer version of Joomla which no longer contains it.

  1. Go to System, Update, Joomla
  2. Click on “Update your site by manually uploading the update package.”
  3. Select the Joomla_4.0.3-dev-Development-Update_Package.zip file.
  4. Check the “I've created a backup and my extensions are compatible.” checkbox.
  5. Click on Upload & Install
  6. Enter your username and password and click the button to proceed.
  7. The update installs and the upgrade is finished without a problem.

Make sure that the files administrator/components/com_joomlaupdate/restore.php and administrator/components/com_joomlaupdate/restore_finalisation.php are removed.

Test 2: New to New

In this test we will confirm that the new JavaScript and server-side extraction helper (extract.php) work, i.e. we didn't break Joomla Update (that would suck, considering I wrote its first implementation and all!).

Follow the EXACT same steps as the previous test.

Since you have already updated once, the code that kicks in during this update is the new one, using extract.php.

Make sure that the update installs without any errors.

PLEASE TRY THIS ON A TEST SITE ON A COMMERCIAL HOST, IDEALLY ON A SITE THAT IS A CLONE OF A REAL WORLD SITE. DO NOT ONLY TEST ON A BLANK JOOMLA 4.0.2 SITE ON LOCALHOST. This is important! Everything works on localhost. The push comes to shove when we are dealing with real world sites with 3PD extensions of varying degrees of QA and Joomla compatibility on hosts with greatly varying relative performance using Internet connections which may drop packets harder than an overworked Amazon delivery driver tosses packages to your porch.

Good news: you do NOT need to issue an update to Joomla Update

The original Joomla Update uses the files restore.php (Akeeba Restore, does the extraction), restoration.php (transient configuration file) and restore_finalisation.php (post-update finalisation, deletes the files which no longer exist in the new version).

With this PR the respective files are extract.php, update.php and finalisation.php. The change in name is intentional.

For starters, we are updating the site, we are not restoring a backup. The file naming in the original Joomla Update came from the fact that we were using Akeeba Restore, a script used to restore backup archives. Using the new names makes it easier for developers new to Joomla, who were not around when Joomla Update was rushed through the door in Joomla 2.5.1, to understand what is going on.

Moreover, the lack of overlap means that these files will NOT overwrite the files of the previous Joomla Update while the update to the new version takes place. These files will only be removed at the finalisation step. Therefore you can have a clean update from an old to a new version without updating Joomla Update itself first. Neat!

There is a catch, though. The users who have followed the instructions of the Joomla Security Wiki page on .htaccess files, have used my Master .htaccess (which is used in the Joomla wiki) or are using Admin Tools Professional's .htaccess Maker (or something similar) will need to update their .htaccess files before running Joomla Update AFTER installing whichever Joomla version includes this patch. Same goes for NginX configuration and IIS web.config files.

We COULD avoid that by keeping the same names as previously used but a. you still get confusing naming and b. you would need to update Joomla Update before updating Joomla (try saying that three times, fast).

Why things needed to be changed

Let's take things in a bit more detail. It's a long read. Sorry.

Custom ZIP extraction handler instead of Akeeba Restore

Joomla Update was contributed to Joomla 2.5.1 on little more than a moment's notice by yours truly, having forked it off a feature by the same name I had in Admin Tools. When I implemented this feature in Admin Tools it made sense for me to reuse the code I had already written for Akeeba Backup to extract backup archives. Extracting Joomla's update ZIP package was simply a much narrower use case of the more generic use case of extracting a ZIP backup archive.

The problem is that Akeeba Restore does much more than just extract a ZIP archive. It needs to handle multipart archives of different formats which contain large files and need several minutes to hours to extract, it needs to handle .htaccess files, it needs to handle the removal of the installation directory, stealth .htaccess files and much more. All these are irrelevant for Joomla Update. In other words, Joomla Update never needed Akeeba Restore and using the two together is overkill. It also seems to confuse some people as to why Joomla is using an Akeeba product in the core.

This wouldn't be that bad but for the fact that Akeeba Restore is also very tricky to maintain, especially when you only have it as one big file (in my repository it's several small files which are concatenated when the file is being “built”). This has historically led to small, well-meaning changes causing the Joomla Update to fail miserably. Like what happened most recently with Joomla 4.0.1.

I've been meaning to solve these problems by creating a special version just for the Joomla project, only including a subset of the features of the full-fat version. This is what I did here. Better late than never!

The whole file is one big class and a small “controller” tacked at the end. It's a tiny fraction of Akeeba Restore's code, it's much more maintainable and I can contribute it per the terms of the Joomla Contributor Agreement I signed all those years ago i.e. the Joomla project gets non-exclusive copyright rights under the GPLv2 and the right to change the license to a newer version of the GPL.

Furthermore, since this is a bespoke script for Joomla 4 I have made sure that the code makes use of static typing (compatible with PHP 7.2 or later) instead of the dynamic / implicit typing Akeeba Restore is doomed to use as long as there are servers with a default PHP version in the 5.x range (don't get me started!).

Improved security

Any script which allows extraction of ZIP archives onto an application directory poses an inherent security risk: if an attacker is able to extract an archive of their choosing they can compromise the site. This can be solved by having the path to the archive to be extracted stored in a server-side file. However, this would still allow an attacker to perform a Denial of Service attack by hitting the archive extraction URL repeatedly. The only way to solve this is to “authenticate” requests.

For the authentication part, a randomly generated secret key is written to a server-side file and communicated to the client-side JavaScript that goes through the archive extraction.

The old version of Akeeba Restore which is still used in Joomla Update uses the secret key to derive an AES-128 key and uses AES-128 in CTR (Counter) mode to encrypt a JSON string which is sent to the server-side restore.php file. That file reads the secret key from the server-side file (restoration.php), derives the same AES-128 key and tries to decrypt the information ostensibly sent to it by the client-side application. If the decryption fails or the result is not a valid JSON document an error is returned.

This has two inherent problems.

First, the key derivation function is naive and insecure. The generated AES-128 key is approximately 56 bits strong instead of 128 bits. It also suffers greatly from key collisions.

Second, the very fact that encryption is used for authentication creates an opportunity for a Padding Oracle attack. On a typical server it would take anywhere between a few dozen to several hundred minutes to derive the key used to authenticate requests to restore.php. When that happens the attacker can exploit restore.php to extract an archive of their choosing, even if the archive is stored remotely. A naive mitigation (fail the authentication if the restoration.php file is created more than 90 minutes ago) is in place but it's not enough anymore. PHP 7 and 8 are much faster and hosting services no longer cram thousands of sites on a single server. This makes each request faster which helps perform the Padding Oracle attack more efficiently.

This new file implements more robust mitigations I have already implemented in my own software since late 2017:

  1. Authentication takes place by performing a time-safe comparison of the server-side key against a plaintext key communicated by the client-side. I am INTENTIONALLY not using hashing or any other similar method. Remember that the key is communicated in plain-text to the JavaScript that runs the extraction. If there was a Man In The Middle attack opportunity this is what it would target to subvert the key. No matter if the key is then sent in plain text, encrypted, hashed or used to derive a signature the fact remains that the client-side will need to somehow have it in plain text. Therefore there is no point implementing anything other than a plain text authentication. It's only susceptible to MITM attacks which can be trivially defended against by using a TLS certificate (HTTPS). TLS certificates are now free of charge (e.g. Let's Encrypt) and required (otherwise modern browsers display a big, scary error about the site being insecure). Meanwhile, using a time safe comparison of plaintext passwords removes the Padding Oracle attack opportunity, therefore it's more secure than us using encryption in the past. It's counter-intuitive but it's true.
  2. The server-side code explicitly disallows remotely stored archives. If the archive name or path contains the :// substring we immediately fail the request. This raises the bar of the minimum viable attack opportunity to BOTH MITM AND arbitrary file uploads with a known location and file name AND a window of opportunity in the range of a few seconds it takes for the Joomla update to complete. This gets to the territory of ‘if you can pull this off you deserve to hack me’.
  3. The extraction engine is re-initialised with the ZIP archive location on every intermediate step of extracting the update archive whereas previously it would trust whatever information was sent from the client-side. Therefore this raises the minimum viable attack bar even higher, also requiring the attacker to be able to write to arbitrary .php files. However, if the attacker has that capability they have already compromised the site thoroughly! The only thing an attacker would gain with that is a Denial of Service but if they have already compromised the site they can do a DoS in a myriad of easier and less detectable ways.
  4. The 90 minutes time limit for update.php is still there. This prevents a Denial of Service attack in case an attacker managed to brute-force the (very long, random) password created before applying the update in case the update failed, in which case the update.php file is not removed AND the update ZIP file is also not deleted just yet from the temporary directory. This is more of a failsafe and less of a security feature.
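
To make mitigations 1, 2 and 4 above a bit more concrete, here is a minimal sketch of the three checks. It is written in Python purely for illustration and is not the PHP code in extract.php; every function name, the 32-byte key length and the way timestamps are passed in are assumptions made for this example.

```python
import hmac
import secrets

# Assumption for this sketch: the 90 minute limit mentioned above, in seconds.
MAX_AGE_SECONDS = 90 * 60


def generate_update_key() -> str:
    """Create a long random secret (hypothetical length of 32 random bytes)."""
    return secrets.token_urlsafe(32)


def is_authorised(stored_key: str, presented_key: str) -> bool:
    """Compare the plaintext keys in constant time, so the comparison
    itself does not leak how many leading characters matched."""
    return hmac.compare_digest(
        stored_key.encode("utf-8"), presented_key.encode("utf-8")
    )


def is_local_archive(archive_path: str) -> bool:
    """Reject anything that looks like a remote or stream-wrapper location."""
    return "://" not in archive_path


def is_fresh(created_at: float, now: float) -> bool:
    """Fail requests when the transient configuration is older than the limit."""
    return 0 <= now - created_at <= MAX_AGE_SECONDS


def may_extract(stored_key: str, presented_key: str, archive_path: str,
                created_at: float, now: float) -> bool:
    """All three checks must pass before any extraction work is done."""
    return (
        is_fresh(created_at, now)
        and is_local_archive(archive_path)
        and is_authorised(stored_key, presented_key)
    )
```

Note that nothing here is encrypted or decrypted, so there is no decryption failure for an attacker to probe; the only answer a caller ever gets is pass or fail.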

Overall, these changes not only make the code simpler but far more secure as well.

JavaScript simplification

The only reason we needed the convoluted JavaScript in update.js and encryption.js was the old authentication method. Now that this is no longer a concern we can instead move to plain vanilla JSON responses from our ZIP extraction helper and use Joomla's built-in Joomla.Request to communicate with it and parse the responses. This greatly simplifies the client-side of the update, making it maintainable by more developers instead of only those who could understand how encryption worked.

Since I was at it I also removed the dependency on jQuery, rewrote the JavaScript as EcmaScript 6 and fixed a small visual bug which resulted in the progress bar not turning green at the end of the ZIP file extraction.

OPcache reset for .php files

One of the biggest problems with updating Joomla is that the OPcache is not reset per .php file being overwritten or deleted but globally, at the server level. This is problematic for two reasons.

First, there is a delay between resetting OPcache globally and the cache being deleted. More specifically, the cache is not reset until PHP is tearing down the script after it finishes executing. Therefore the restoration finalisation cannot use any core code as there's no guarantee the correct code will even load!

Second, resetting the OPcache globally is a problem on shared servers where this built-in function may not be available or, if it is, causes performance degradation across the entire server. On a commercial host with hundreds of sites this can be detrimental, especially if the various Joomla sites do not update all at the same time.

Since we are now using a bespoke file for Joomla Update we can do some simple post-processing per extracted or deleted file. If the file extracted or deleted has a .php extension and opcache_invalidate is available and the other conditions are met (see the code in the CMS' File class) we'll ask PHP to invalidate this file in the OPcache. Therefore we are resetting OPcache only for the files we are touching during the update, causing a temporary performance degradation against core files at the first few page loads after the upgrade instead of across the entire server. Moreover, opcache_invalidate is applied immediately, meaning that the finalisation file can now use core code if desired.

Further thoughts

I pondered whether we could support tar.gz or tar.bz2 update files as well. The answer is no, we can't.

ZIP files are, to put it simply, a concatenation of file headers containing information about each file and the respective file's data. The data can be compressed, the headers are not. If you are given an offset in the file where a file's header begins you can extract that file and all files after it. This is what allows us to pause the extraction if it's taking too long and restart it in a new page load. This is what allows us to perform the update on a slow server.
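
The resumption idea described above can be sketched in a few lines. This is an illustrative Python sketch, not the logic of extract.php: it reads the standard 30-byte ZIP local file header at a given offset, extracts a limited number of entries and returns the offset to resume from. The function names are hypothetical, and the max_files limit is an assumption standing in for the time limit a real step would use.

```python
import io
import struct
import zipfile
import zlib

# ZIP local file header: signature, version, flags, method, time, date,
# CRC-32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
LOCAL_SIGNATURE = 0x04034B50


def extract_step(archive: bytes, offset: int, max_files: int):
    """Extract up to max_files entries starting at offset.

    Returns (entries, next_offset). next_offset is None once something other
    than a local file header is found, i.e. there are no more files.
    """
    entries = []
    while len(entries) < max_files:
        if offset + LOCAL_HEADER.size > len(archive):
            return entries, None
        (signature, _version, flags, method, _time, _date, _crc,
         compressed_size, _size, name_length, extra_length) = \
            LOCAL_HEADER.unpack_from(archive, offset)
        if signature != LOCAL_SIGNATURE:
            return entries, None  # reached the central directory
        if flags & 0x08:
            raise ValueError("entries with data descriptors are not handled")
        name_start = offset + LOCAL_HEADER.size
        name = archive[name_start:name_start + name_length].decode("utf-8")
        data_start = name_start + name_length + extra_length
        raw = archive[data_start:data_start + compressed_size]
        if method == 8:      # Deflate: raw stream without zlib/gzip wrapper
            data = zlib.decompress(raw, -15)
        elif method == 0:    # Stored: no compression
            data = raw
        else:
            raise ValueError("unsupported compression method")
        entries.append((name, data))
        offset = data_start + compressed_size
    return entries, offset


def build_demo_archive() -> bytes:
    """Build a small in-memory ZIP file to exercise extract_step()."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("a.txt", b"alpha" * 50)
        archive.writestr("b.txt", b"beta" * 50)
        archive.writestr("c.txt", b"gamma" * 50)
    return buffer.getvalue()
```

A caller would keep invoking extract_step() with the returned offset, one call per request, until the offset comes back as None. Only that single integer has to survive between requests.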

Plain tar archives are similar BUT the file contents are never compressed. They were meant as primitive disk images five or so decades ago. tar.gz and tar.bz2 archives solve the problem of files taking up too much space by compressing the tar archive itself, with gzip or bzip2 respectively, instead of each individual file's contents. We would need to decompress the entire archive, write it to disk and then extract it in a way that allows resumption.

The problem is that the decompression is memory and CPU intensive. You need as much free PHP memory available as the compressed and uncompressed archive plus the overhead of the gzip or bzip2 decompression algorithm. With modern versions of Joomla this is in the order of ~64MB. In practical terms, even a site with 128MB PHP memory limit may run out of memory if it has enough plugins wasting memory and/or debug enabled (remember that DatabaseDriver logs queries and their information in this case, exploding the memory usage). It would also take a lot of time to perform that, so much that you might hit a PHP or server timeout.

This kills the idea of using any kind of compressed TAR archive.

The other idea I pondered is whether we can use bzip2 compression in ZIP files. It is supported by the ZIP standard, alright! However, unlike zlib (implementing gzip), it's not a requirement for running Joomla and there is no guarantee it will be enabled on the server. This means that if we were to use it the update ZIP files would be unusable on a large enough proportion of servers to make it an unrealistic option.

So, ZIP files with gzip (called ‘Deflate’ in the ZIP standard) compression it is.

Finally, this PR does not touch the CLI updater. That runs under the CLI, it is not subject to the same time, memory and CPU usage constraints we need to take into account for the web version of the updater. It works fine. If it works fine, I don't touch it. Fair enough? :)

Documentation changes needed

As mentioned above, the change of the extraction helper's name from restore.php to extract.php necessitates some changes in Joomla's documentation. Moreover, the documentation for Joomla Update currently has no useful troubleshooting information. So please let me rectify that.

Update the .htaccess examples page.

At the very least the .htaccess example page needs to be updated with the following.

Find the following lines:

## Allow Admin Tools Joomla! updater to run
RewriteRule ^administrator/components/com_admintools/restore\.php$ - [L]

and replace them with

## Joomla! Update (core feature) — Joomla versions 2.5.1 through 4.0.2
RewriteRule ^administrator\/components\/com_joomlaupdate\/restore\.php$ - [L]
## Joomla! Update (core feature) — Joomla versions 4.0.3 and later
RewriteRule ^administrator\/components\/com_joomlaupdate\/extract\.php$ - [L]

You should also update that file with more recent code from https://github.com/nikosdion/kyrion-htaccess/blob/kyrion/.htaccess and the changes made to the core .htaccess, e.g. for the core gzipped files. These changes are well outside the scope of this PR and I will not comment on them any further.

Further documentation changes

Joomla Update will tell you to read the documentation if something goes wrong. Enhance the Troubleshooting section in https://docs.joomla.org/J4.x:Updating_from_an_existing_version#Troubleshooting by adding the following information at the top (it's a LONG read but totally worth it if you are desperately stuck).

Joomla Update is a core component which is responsible for determining if there is a newer version of Joomla available for installation on your site, downloading it (or letting you upload it) and installing it. It has been available in Joomla since Joomla 2.5.1 and as a third party extension two years prior to that. You can access it at System, Update, Joomla.

The update process consists of several different steps. While every care has been taken to make this process as trouble–free as possible there's always a minuscule chance that something may go wrong, typically due to a very restrictive server configuration or network conditions on a very small minority of sites.

The following troubleshooting instructions are organised by update step to make it easier to find the information you are looking for. Furthermore, it is an exhaustive resource, based on more than a decade of experience troubleshooting all possible (and some borderline impossible) problems with Joomla and extension updates. It lists problems which are extremely unlikely to occur. Don't let its length scare you; you are very unlikely to ever see any of these problems occur.

Determining if updates exist. Joomla will make a request to its update server over HTTPS and download an XML file provided by the Joomla project listing the latest available versions. The update server in use can be determined by going to System, Update, Joomla, clicking on Options and examining which update server is in use. You are recommended to use the Default update server to receive updates to your current major version of Joomla. Use Joomla Next when you want to upgrade your site to the next major version of Joomla — this is best done on a copy of your site to avoid any nasty surprises; not all third party extensions and templates will be compatible across major Joomla versions. The major version of Joomla is the first digit in the Joomla version, before the first dot. For example, the major version of Joomla 4.0.1 is 4.

If Joomla cannot determine that an update is available please check the following:

  • The Joomla Update information is out of date in the database. Go to System, Update, Joomla and click on Check for Updates.
  • The update sites information in Joomla is corrupt. Go to System, Update, Update Sites and click on Rebuild. Then go to System, Update, Joomla and click on Options. Select the Testing update channel. Click on Save & Close. Click on Options again. Select the Default or Joomla Next update channel — depending on your preference — and click on Save & Close.
  • Your host prevents making outbound HTTPS requests at all or restricts them to predefined allowed IP addresses. Please ask them to allow outbound HTTPS requests to https://update.joomla.org. This is a CDN, meaning that the exact IP address will be different depending on where in the world you are trying to access this URL from. Do tell your host; they will know what to do with this information.
  • Your host may have an outdated SSL library which does not understand the modern TLS certificates used by the Joomla update CDN. Please ask your host about it.

Determining if third party software is compatible with the new version you are about to install. Joomla does not have a magical way of evaluating third party code for compatibility. Its report is based solely on the extension information kept in Joomla's #__extensions table, the update sites provided by the installed extensions and the update information provided by the developers of third party extensions including but not limited to which version of their software is compatible with which version of Joomla.

If the information displayed is incorrect please check the following:

  • What is the minimum stability for extension updates? Go to System, Update, Extensions and click on Options. The ‘Minimum Extension Stability’ determines which is the minimum stability level of a third party software that Joomla will take into account when evaluating compatibility. For example, if this option is set to Stable but only a beta version of a third party extension is compatible with the Joomla version you want to upgrade to, Joomla will tell you that there is no compatible version of the third party extension available.
  • The update information may be out of date. Go to System, Update, Extensions and click on Check For Updates. Then go back to System, Update, Joomla and see if the extension now appears as compatible or if you are told that a compatible update to it is available.
  • The update sites information in Joomla is corrupt. Go to System, Update, Update Sites and click on Rebuild. Then go to System, Update, Joomla and click on Options. Select the Testing update channel. Click on Save & Close. Click on Options again. Select the Default or Joomla Next update channel — depending on your preference — and click on Save & Close.
  • Your host prevents making outbound HTTP/HTTPS requests at all or restricts them to predefined allowed IP addresses. This will prevent Joomla from retrieving update information from third party update sites. First go to System, Update, Update Sites. Below each update site you will see its URL. Make a list of those URLs. Then ask your host to allow your site to make requests to these URLs. Please note that some of these URLs may point to a CDN, meaning that the exact IP address will be different depending on where in the world you are trying to access this URL from. Do tell your host; they will know what to do with this information.
  • Your host may have an outdated SSL library which does not understand the modern TLS certificates used by most third party extension developers' update sites. Please ask your host about it.
  • You may have “orphaned” extensions. Most modern Joomla extensions are delivered as ‘package’ type extensions which include two or more related extensions. When installing a package extension Joomla records a package extension in its database. It then records the package ID to each of the installed extensions from that package in the database. The update information is provided for the package, not each individual extension. This association may break if you used Discover to install the extensions, extracted the package and installed the separate extensions directly, Joomla failed to record the package ID for each extension when installing the package (most likely because an error occurred) or your site has been upgraded from an old version of Joomla which predates the use of packages in extensions. In this case even updating the extension will NOT update the package association. There is currently no solution to this except determining manually the compatibility of extensions with each Joomla version.

Downloading the update. Joomla will need to download its update package, a ZIP file which is very similar to the Joomla installation ZIP file but without the web installer (the installation directory). This could fail for a few reasons:

  • The ZIP file is rather big. Joomla 4 update packages are around 25MiB. If your server is slow, overwhelmed or has poor connectivity with GitHub — where Joomla update packages are downloaded from — it may take a long time to download the package. If that time is longer than your server's PHP maximum execution time, its maximum CPU usage limit (as determined by ulimit -t), the PHP-FPM timeout or the web server's timeout the download will fail and you will see an error page. You will have to ask your host for help with that.
  • You may not have enough free space on your site. You need enough space to store the compressed update ZIP file and its extracted files. As a rule of thumb, you need about 50–60 MiB of free space for Joomla Update to work correctly. Do note that the free space reported by your hosting control panel is not always realtime, i.e. it may ‘lag’ several minutes or hours behind the actual disk space usage on your site. Moreover, further limits may be imposed by your host. If unsure, please ask your host.
  • Your host prevents making outbound HTTPS requests at all or restricts them to predefined allowed IP addresses. Please ask them to allow outbound HTTPS requests to https://github.com. This is a CDN, meaning that the exact IP address will be different depending on where in the world you are trying to access this URL from. Do tell your host; they will know what to do with this information.
  • Your host may have an outdated SSL library which does not understand the modern TLS certificates used by GitHub. Please ask your host about it.

Extracting the update. After the update ZIP file has downloaded Joomla needs to extract it on your site. Since Joomla is effectively replacing itself and because this process does take some time to complete it cannot happen within Joomla itself. Instead, a separate file (administrator/components/com_joomlaupdate/extract.php) is used to perform the update. This file is inert except when an update is in progress.

You may get an error during the extraction for one of the following reasons:

  • You cannot access the extract.php file because of a server protection e.g. a customised .htaccess file on Apache and Litespeed servers. Try accessing https://www.example.com/administrator/components/com_joomlaupdate/extract.php from a web browser, where https://www.example.com/ is to be replaced with the URL to your site. You should see the message {"status":false,"message":"Invalid login"}. If you see anything else you are being forbidden from accessing this file.
  • If you can access the file but extracting the update fails immediately there is a different server protection on your site which prevents the request to extract.php from being handled by that file. Please contact your host about this.
  • If you are using CloudFlare go to Rules and create a new Page Rule. Set the If URL Matches to */administrator/components/com_joomlaupdate/extract.php and Then The Settings Are to “Disable Security” and on a new line “Cache Level”, ‘Bypass’. Set the Order to First. Click on Save and Deploy. This ensures that CloudFlare will not try to block the update extraction.
  • The update ZIP file is corrupt or truncated. This could happen if downloading the update file failed with an error. Go back and retry. See also the previous section.
  • If you are using upload and update i.e. you uploaded the update ZIP file yourself please make sure that you are using only the official Joomla Update ZIP files. The extraction script only supports a subset of the ZIP archive format used by the official update ZIP files.
  • Did you run out of disk space? Please check the section above.
  • Are the ownership and permissions of all files correct? Joomla needs write access to all of its files and folders. If unsure, ask your host. There is no specific set of “good” permissions! The permissions needed depend on the ownership of your files and which system user the web server runs under.
  • Did you lose network connectivity or does your network have very high latency? It's possible that the request fails because of that.
  • The extraction takes place by making consecutive requests to the aforementioned extract.php file. Each request is set up to take between 3 and 4 seconds. The process repeats until the entirety of the update file has been extracted. On some servers this cadence of requests to the same file from the same IP address may trigger the server's security. On other servers it may trigger a different server protection, e.g. a maximum PHP time limit, a maximum CPU usage limit or another server timeout. On even fewer servers running on CloudLinux it could trigger a server memory outage situation if your server was already running low on memory. You need to contact your host about that; there is nothing you can do yourself to work around these server limitations.
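
For readers curious how a step that is "set up to take between 3 and 4 seconds" can work, here is a minimal, hypothetical Python sketch of a time-boxed step. It is not the code in extract.php; the function name, the 3 second default and the injectable clock are assumptions made so the idea can be shown and tested in isolation.

```python
import time
from typing import Callable, Iterator, List, Tuple


def run_time_boxed_step(
    work: Iterator[str],
    budget_seconds: float = 3.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[str], bool]:
    """Process items from a shared iterator until the time budget is used up.

    Returns (processed_items, finished). When finished is False the caller is
    expected to issue another request, which continues with the same iterator.
    """
    started = clock()
    processed: List[str] = []
    for item in work:
        # A real step would extract one file here; this sketch only records it.
        processed.append(item)
        if clock() - started >= budget_seconds:
            return processed, False  # budget spent: pause and let the client call again
    return processed, True  # iterator exhausted: nothing left to do
```

Because each step returns as soon as its budget is spent, no single request should come close to typical PHP or web server timeouts. The price is the rapid sequence of requests described in the last point above.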

Finalising the update. This is a two step process.

Right after the update ZIP file has been extracted a final step will run which removes old files. When upgrading to a new major version of Joomla the list of files to remove is pretty big and the process may time out. Moreover, the point made in the previous section about ownership and permissions of files is important here too; Joomla needs write permissions to the old files and folders it has to remove. If this step fails you can resume it from the command line. Go into the cli folder and run php joomla.php update:joomla:remove-old-files. If you cannot do it yourself ask your host to do it for you. You will also need to follow the workaround for the next step.

Finally, Joomla reloads and you are logged back into the administrator interface. At this point Joomla updates its database tables and performs any database administration tasks. If this fails you can resume the process by going to System, Maintenance, Database. Select Joomla CMS from the list and click on Update Structure.

Postscriptum

The entire PR is one commit. It almost makes it sound trivial. Deriving extract.php from Akeeba Restore was anything but. It took 44 rounds of refactoring, about 24 hours of work crammed into 2 ½ days. I think it was well worth it.

I just hope it has a chance of getting merged. I don't think I have it in me to redo any of that work ever again. That was intense.

@joomla-cms-bot joomla-cms-bot added NPM Resource Changed This Pull Request can't be tested by Patchtester PR-4.0-dev labels Aug 26, 2021
@brianteeman
Contributor

question regarding the security part of your post. Would it not be beneficial to check the hash of the zip against the published hashes at the beginning of the process? (note I have yet to read the code so maybe it already does - don't shoot me)

* @package Joomla.Administrator
* @subpackage com_joomlaupdate
*
* @copyright (C) 2016 Open Source Matters, Inc. <https://www.joomla.org>
Contributor

As this is a completely new file, shouldn't it be 2021?

Contributor Author

I copied the copyright from the other files of Joomla Update. I have not found a consistent rule for the copyright of various files. I would appreciate a pointer in the right direction.

Contributor

See #31504

Contributor Author

In this case I'll have to add a double copyright since this is derivative work. Okay, thanks for the clarification!

@brianteeman
Contributor

@nikosdion could you please update your instructions to say that you cannot run build.php on a Windows system?

This script is designed to be run in CLI on Linux, Mac OS X and WSL

@PhilETaylor

This comment was marked as abuse.

@nikosdion
Contributor Author

@brianteeman Checking the hash is done before running the extraction already. It's meant to ensure that whatever we downloaded is what the update server is telling us we should have downloaded to perform the update. Checking that again before extraction, right after we already checked that, would be a waste of time.
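For readers unfamiliar with this kind of pre-extraction check, here is a minimal sketch of comparing a downloaded file against a published hash. It is written in Python for illustration only and is not Joomla's implementation: the function name file_matches_hash is hypothetical, and the choice of SHA-384 as the default algorithm is an assumption, since the thread does not say which algorithm is used.

```python
import hashlib
import hmac


def file_matches_hash(path, expected_hex, algorithm="sha384"):
    """Return True when the file at `path` hashes to `expected_hex`.

    The file is read in chunks so that a large update package does not
    have to fit in memory. The algorithm default is an assumption.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    # Normalise the published value before comparing the two hex strings.
    return hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower())
```

A check like this only proves that the downloaded file matches what the update server announced, which is why repeating it immediately before extraction would add nothing.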

Regarding build.php, dunno, I found it this way 😛 I thought that everyone who'd try this would know that build.php requires a *NIX system. FWIW you can run it on Windows, under WSL. Whether this is a sane thing to do is a different story. An alternative to that is take an existing Joomla 4 update package and replace the administrator/components/com_joomlaupdate and administrator/components/com_admin folders with those from this PR.

@PhilETaylor We've had the experience of writing unit tests for this in the Akeeba Backup repository and one thing we quickly realised is that it's far more complicated than your typical tests and completely nonsensical in context. The code doesn't lend itself to unit testing as it goes and works directly with files. Having it work against data in a memory buffer or stream makes it substantially slower and uses much more memory which makes it fail on cheap shared hosts i.e. exactly where we need it to work (fast hosts could just as well use PHP's ZipArchive and be done with it). One way around that is faking a filesystem in memory e.g. with vfsStream. That's exactly what we had done. Then you need to create doctored ZIP files to simulate error conditions. This requires a hex editor, a copy of the ZIP specification (APPNOTE.TXT), a lot of experience and plenty of time. I actually did some of that and it was... doubleplusunfun.

Then you realise that the one thing you cannot test is the timer code. You can mock it and you can have it return a fake out of time message conditionally... but so what? Have you actually really tested that this will prevent a timeout on most servers? No. That requires something that cannot be tested: experience debugging on these servers and a good sense of how they work both as a complete system and as individual software components. Unfortunately, there's only one of me in Joomla and I've found it impossible to train someone else in this dark art. Davide has been working with me for eight years and there are still a few cases each week where I have to provide input based on my experience with the dark arts of hosting environments. So, yeah, I could spend the next month writing Unit Tests but they wouldn't really be testing anything useful.

If someone wants to write integration tests that would be FAR more useful. If you have ideas how to do that you're more than welcome to contribute that! I can tell you how I did that for Akeeba Solo which has an updater. I'm creating a new installation of the previous version, I update the updater files (since that's what I'm trying to test) and apply an update created out of the repository's files. At the end of the integration test you can also test that all files have been extracted with the correct name, size and checksum.
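The final check mentioned above (every file extracted with the correct name, size and checksum) could look roughly like the following. This is a sketch in Python under stated assumptions, not part of the PR: verify_extraction is a hypothetical helper, it uses Python's standard zipfile module rather than the extraction script under test, and it uses the CRC-32 stored in the archive as the checksum.

```python
import zipfile
import zlib
from pathlib import Path


def verify_extraction(zip_path, target_dir):
    """Compare extracted files against the archive's own metadata.

    Returns a list of problems; an empty list means every file is present
    with the expected size and CRC-32.
    """
    problems = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            extracted = Path(target_dir) / info.filename
            if not extracted.is_file():
                problems.append(f"missing: {info.filename}")
                continue
            data = extracted.read_bytes()
            if len(data) != info.file_size:
                problems.append(f"size mismatch: {info.filename}")
            elif zlib.crc32(data) & 0xFFFFFFFF != info.CRC:
                problems.append(f"checksum mismatch: {info.filename}")
    return problems
```

In an integration test the target directory would be the site that was just updated, so the assertion would simply be that the returned list is empty.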

Regarding the “Invalid login” it's INTENTIONAL. Before 2014 you would get a different message depending on the actual problem the code ran into. This made a Padding Oracle attack much easier. So, now that we're not using encryption, why don't I just change these messages? Glad you asked! I am preventing information disclosure which would help an attacker. If you get a different message depending on whether update.php is created, if it's there but has an empty password, if you sent no password or if the password you provided doesn't match you can tell if there's a leftover update.php from a failed update and proceed to brute force it. Showing a generic message makes it much harder for the attacker to know what's going on, meaning they can never be sure if they are brute forcing a password or they are wasting their time.
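The idea of answering every failure mode with the same message can be sketched as follows. This is an illustrative Python example, not the code in extract.php: check_password and GENERIC_ERROR are hypothetical names, and the constant-time comparison via hmac.compare_digest is an addition of this sketch that the thread does not mention.

```python
import hmac

GENERIC_ERROR = "Invalid login"


def check_password(stored, provided):
    """Return None on success, or the same generic message for every failure.

    `stored` is None or empty when no update is in progress or no password
    was set; `provided` is None or empty when the client sent nothing.
    """
    if not stored or not provided:
        return GENERIC_ERROR
    # Constant-time comparison (an assumption of this sketch) avoids
    # leaking how many leading characters matched.
    if not hmac.compare_digest(stored.encode("utf-8"), provided.encode("utf-8")):
        return GENERIC_ERROR
    return None
```

Because all four failure cases are indistinguishable from outside, a client cannot tell whether a leftover password file exists at all.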

For code comments I'll reply inline.

@SniperSister
Contributor

First of all: thanks a lot @nikosdion for that PR! I've worked with the old restore.php because of my involvement in a Joomla management SAAS and the new file is clearly an update in terms of readability and maintainability!

Just one remark:

There is a catch, though. The users who have followed the instructions of the Joomla Security Wiki page on .htaccess files, have used my Master .htaccess (which is used in the Joomla wiki) or are using Admin Tools Professional's .htaccess Maker (or something similar) will need to update their .htaccess files before running Joomla Update AFTER installing whichever Joomla version includes this patch. Same goes for NginX configuration and IIS web.config files.

I want to highlight that this issue will affect a considerable number of sites. In our own SAAS-context, roughly 15% of the sites are affected and therefore will require a manual adjustment by the site owner.
So, to help those owners, I would suggest at least adding specific error handling and some documentation around it. Something like a specific check for a 403 response in the JS file, triggering a message for the users, pointing them to a proper docs page explaining the issue and giving instructions on how to fix it.

@nikosdion
Contributor Author

@SniperSister Thank you for confirming my suspicion. I don't have hard numbers, I can only extrapolate from the number of unique Joomla sites that ping our stats server for any of our software and the number of unique sites that ping our site for Admin Tools. My number was about 25%, about half would be using the .htaccess Maker based on experience so we seem to agree. That's good.

My problem is that I am making a PR to Joomla, not the third party code that added the .htaccess / web.config etc change — even though the most likely third party code owner is me. I don't want to put a message that's a self-advertisement inside Joomla.

Moreover, I cannot introduce a new language string because this PR would only make it into 4.1 which might be a long way away yet.

The best thing I can suggest is updating the Joomla documentation. If the file is blocked you will get a dialog reading “ERROR: AJAX Error”. This means that the browser received an HTTP error response. The first thing to check would be whether you can access /administrator/components/com_joomlaupdate/extract.php. If you can and get a JSON-encoded Invalid Login error, the problem with the extraction was something else. If you get a 403 you need to check your .htaccess file (if you're on Apache or Lighttpd), or your web.config file (if you are on Microsoft IIS) or your NginX configuration. If there's nothing blocking that file there you need to check any CDN configuration or talk to your host. I'm pretty sure a native English speaker can take that and create something easy for people to understand.
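The troubleshooting order described above can be written down as a small decision function. This is a Python sketch for illustration only: diagnose is a hypothetical name, and because the thread does not show the exact JSON returned by extract.php, matching on the text "invalid login" in the response body is an assumption.

```python
def diagnose(status_code, body):
    """Map the response from extract.php to a troubleshooting hint."""
    if status_code == 403:
        # The request never reached the script.
        return (
            "blocked: check .htaccess, web.config or NginX configuration, "
            "then any CDN configuration, then talk to your host"
        )
    if "invalid login" in body.lower():
        # The script answered, so access is fine and the failure lies elsewhere.
        return "reachable: the file is not blocked, look for another cause"
    return "unknown: unexpected response, check CDN configuration or talk to your host"
```

For example, diagnose(403, "") points at the server configuration, while a body containing the Invalid Login error means the file is reachable.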

I could also make it so that the action after an AJAX Error message is directing the user to the documentation page for Joomla Update. No questions asked, here are the docs, read them. Better than have a message with a link “Click here to read the troubleshooting documentation” which invariably leads to half of the people taking a screenshot of it and asking what to do (click the bloody link is what you should do, dammit!). Sorry, this happens so often in support that it is borderline triggering.

What do you think?

@brianteeman
Contributor

“Moreover, I cannot introduce a new language string because this PR would only make it into 4.1 which might be a long way away yet.”

Not aware of that rule

@nikosdion
Contributor Author

@brianteeman I have been told in the past that there's a language freeze before the beta of the x.y.0 version and no language strings are allowed until the next minor version. If I am allowed to add language strings (and I need official confirmation for that) I can definitely make a MUCH MORE USER FRIENDLY error reporting. Like, having a proper modal dialog with actual troubleshooting information instead of a JS popup. I really wanted to do that but without new lang strings this can't fly :)

@richard67
Member

@nikosdion The language freeze was between the last RC until 4.0.1, not until 4.1, see #34685 ... so now after these releases all is normal as usual, no limits for language changes.

@joomla-cms-bot joomla-cms-bot added the Language Change This is for Translators label Aug 27, 2021
@nikosdion
Contributor Author

@richard67 Thanks for the tips regarding lang strings and the git issue. I will work on better error handling since I now know I can use new lang strings :)

@richard67
Member

@nikosdion Regarding the testing scenario "update a J4 with the changes of this PR applied to a later version with the changes still applied": you don't really need to create your own packages.

You can apply the patch of this PR, e.g. with git patch or with the patchtester, on a clean, current 4.0-dev and then update to the patched package built by drone for this PR, using the custom update URL of the update package. That URL can be found by expanding the CI checks at the bottom of the PR ("show all checks") and then using the "Details" link at the right hand side of the "Download" line.

Since the version of the patched update package has the PR number appended, that update will always be found even if you are already on the latest "*-dev" version.

After such an update, the database checker will and should show only one problem for the CMS about not matching update versions. That is expected but should always be checked after such a test because if there is an error, there will be more problems shown.

@nikosdion
Contributor Author

@richard67 Thank you!

@wilsonge I agree. I will make the necessary changes to Admin Tools but won't make a release just yet. I truly appreciate the extra time built into this co–ordination. We're at the second week into our daughter going to pre-school. We've already had the first weekend of all of us being varying degrees of sick with a mild respiratory tract virus she brought back from school 😅

@wilsonge
Contributor

“@wilsonge I agree. I will make the necessary changes to Admin Tools but won't make a release just yet. I truly appreciate the extra time built into this co–ordination. We're at the second week into our daughter going to pre-school. We've already had the first weekend of all of us being varying degrees of sick with a mild respiratory tract virus she brought back from school 😅”

Ouch! Hope you feel better soon and she's enjoying school :)

4.0.4 is scheduled for October 26th just so we have a timescale to work towards! Hopefully will start docs + marketing efforts on this next week once 4.0.3 has bedded in a bit.

@nikosdion
Contributor Author

I'm recovering very well, thanks! I am working today on my side of things. I hope the documentation I provided helps. If you need clarification on anything feel free to ask. It's a shame we never thought of writing a troubleshooting guide in the past. Better late than never, right? 😅

@wilsonge wilsonge merged commit 1a569d2 into joomla:4.0-dev Sep 18, 2021
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Sep 18, 2021
@wilsonge
Contributor

Thanks! I'll start on the docs upgrades over the next few days so it's all there for release day

@softforge we need to get marketing kicked off on this I guess from next week.

@nikosdion
Contributor Author

Woo-hoo!

Is it OK if I make an Admin Tools release this week, though? I need to provide fixes for other bugs we discovered after I submitted this PR. I will make a note in the release notes that this will be available starting with Joomla 4.0.4 to prevent any misunderstandings. I can even link to this issue so the few people reading the release notes have a clue 😄

@brianteeman
Contributor

brianteeman commented Oct 27, 2021

OK let me try and super simplify things.

  1. If you do not have an .htaccess file on your site, stop reading; this does not affect you.
  2. If you have never edited the .htaccess file, stop reading; this does not affect you.
  3. If your .htaccess file does not contain the line RewriteRule ^administrator/components/com_joomlaupdate/restore\.php$ - [L], stop reading; this does not affect you.
  4. If you are still reading and you are using Joomla 3, continue to point 5; otherwise jump to point 6.
  5. Replace the line with RewriteRule ^administrator\/components\/com_joomlaupdate\/restore\.php$ - [L] and stop reading.
  6. If you are still reading and you are using Joomla 4, replace the line with RewriteRule ^administrator\/components\/com_joomlaupdate\/extract\.php$ - [L] and stop reading.
  7. Why are you still reading? I told you to stop.
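The list above is effectively a decision procedure, which can be sketched in a few lines. This is a Python illustration only, with hypothetical names (update_htaccess, OLD_LINE, NEW_LINES); the rule strings are copied from the list, and the sketch assumes the old rule appears on a line of its own.

```python
# The line to look for (point 3) and its replacement per Joomla major version.
OLD_LINE = r"RewriteRule ^administrator/components/com_joomlaupdate/restore\.php$ - [L]"
NEW_LINES = {
    3: r"RewriteRule ^administrator\/components\/com_joomlaupdate\/restore\.php$ - [L]",
    4: r"RewriteRule ^administrator\/components\/com_joomlaupdate\/extract\.php$ - [L]",
}


def update_htaccess(contents, joomla_major):
    """Return the updated file contents, or None when nothing needs to change.

    `contents` is None when the site has no .htaccess file (point 1).
    Only Joomla major versions 3 and 4 are covered, as in the list.
    """
    if contents is None:
        return None
    lines = contents.splitlines()
    if not any(line.strip() == OLD_LINE for line in lines):
        # Points 2 and 3: the rule is absent, so the site is not affected.
        return None
    replacement = NEW_LINES[joomla_major]
    updated = [replacement if line.strip() == OLD_LINE else line for line in lines]
    return "\n".join(updated) + "\n"
```

A return value of None corresponds to every "stop reading" outcome in points 1 to 3.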

@nikosdion
Contributor Author

@davidascherG You make a very large number of assumptions, none of which are factually correct.

99.9% of people using Joomla — and definitely everybody who is not technical — need to do ABSOLUTELY NOTHING WHATSOEVER to prepare their sites for this change. Nothing. At. All.

The only people who have to do something are those who have customised their .htaccess beyond the sample htaccess.txt file shipped with Joomla and ONLY IF their customization falls into one of the following categories:

  1. a custom .htaccess file based on the .htaccess Examples (security) page in the Joomla documentation. This is based on an obsolete version of my Master .htaccess from before 2013.
  2. a custom .htaccess file based on the Kyrion .htaccess (formerly Master .htaccess) I have written, i.e. the latest version of the file in the previous category.
  3. a custom .htaccess file created by a third party security extension such as Admin Tools Professional.

Even then, the change is ONLY required if you have applied the advanced server protection rules for the backend which prevent access to any .php file not explicitly allowed in the .htaccess. This is literally something only very advanced users and my clients will have done. Everyone else — especially people who are not developers, systems administrators or power users — are unaffected and need to do ABSOLUTELY NOTHING AT ALL.

Nothing special is required on Windows. I don't know where you got that idea? If you are talking about IIS or NginX the only people affected are those using my software, Admin Tools Professional, to create a custom web.config or NginX configuration respectively.

Nothing is made easier for any third party developer (3PD). In fact, it does NOT affect third party developers at all either positively or negatively. This only has to do with Joomla Update, i.e. how Joomla updates itself. Updates to third party extensions are not affected at all.

The only 3PD affected is me who wrote this PR and I am negatively affected. Not only did I carry the responsibility of updating Joomla Update itself but I also had to bear the responsibility of updating Admin Tools Professional to address the change I made in the Joomla core. I created more work for myself! To make it clear, if you are using Admin Tools Professional and its .htaccess Maker / Web.config Maker / NginX Conf Maker all you need to do is update to the latest version and click the button on your screen to regenerate the .htaccess / web.config / nginx.conf file.

This means that the only people practically affected by this change are the roughly 0.1% of expert Joomla users who maintain their own security-strengthened .htaccess file. This is the target audience of the post made by Joomla: the 0.1% of expert users.

In very simple terms, if you do not understand what to do then there is a 99.9% chance that you need to do ABSOLUTELY NOTHING.

As to why this change was made:

  1. I was sick and tired of people complaining about “Akeeba taking over Joomla” just because Joomla Update was using a piece of code (Akeeba Restore) that I wrote outside of a Joomla contribution, and Joomla respected the license of my code by keeping the copyright notice, just like it does for EACH AND EVERY third party piece of code it uses. No, it makes no sense. Some people are evil like that. In any case, the code in this PR is copyrighted by Open Source Matters, Inc., Joomla's non-profit company which legally owns the copyright to all of the core code. This makes Joomla more independent.
  2. The version of my code used by Joomla was already 5 years out of date and contained a very low priority security issue I addressed four years ago. In very rare cases when Joomla Update would get stuck an attacker could gain remote code execution privileges on your site after bombarding it with a few hundred thousand requests. The code in this PR does not have this security issue anymore. This makes Joomla more secure.
  3. That old code had a lot of issues with regards to error reporting and overall stability. These were fixed. This makes Joomla more stable.

I spent more than 100 hours of my own time, having my business suffer for it, to help the Joomla community. Half of that time was spent making sure nothing would break on update — a big thank you to everyone who tested and reported issues. We all made sure that the migration to the new updater backend would be seamless and would cover all sorts of upgrade cases without any work required by the overwhelming majority of Joomla users and definitely by those Joomla users who are not very technical. You're welcome, I guess...?

Unfortunately, this experience makes me wary of trying to fix anything else in Joomla Update. I was planning on helping with Joomla Update showing misleading information about third party extensions when updating to a new major release or new version family. I have already identified the problem and reported it. I was asked to fix it. Seeing how people complain about fixing what was broken for years I am not going to spend my time only to be met with hostility for fixing things. It's far easier for me to complain that something is still broken after I have reported exactly what is broken and how to fix it, let someone else spend their time to fix it and be met with hostility for their trouble. This kind of community reaction is why nobody wants to contribute to Joomla.

@brianteeman
Contributor

decisiontree

@softforge
Contributor

Hi @brianteeman Can we put that in docs and add it to SM? It's a great infographic.

@richard67
Member

@softforge What is “SM” (in this context)?

@softforge
Contributor

Social Media

@brianteeman
Contributor

No need to ask

@softforge
Contributor

Thank you kindly.

@davidascherG

To those who have no idea why nick responded to a post from me that you can't see - I deleted the post in question @nikosdion within a few minutes of having written it. I have nothing but the highest respect for nick's efforts in fixing these complicated issues with Joomla Update (and with his excellent Akeeba Backup and Admin Tools extensions). I thought I had put enough disclaimers in my post to make it clear that I had not fully absorbed the issues involved and was probably misunderstanding what it seemed to me would be required of Joomla Admins. To try to avoid muddying the waters, I deleted that post but apparently not fast enough for nick to not see it.

I suppose he got a copy of it sent automatically to him as soon as I hit the "Comment" button. I will be much more cautious about hitting that button in the future.

@nikosdion
Contributor Author

Correct, I get an email notification on any Pull Request I have submitted. I do not unsubscribe from notifications after they are merged in case someone finds a critical issue we all missed during testing.

Your message was not really the reason I wrote my reply. The Joomla FB group had a long thread about this which was... let's say neither based on fact nor particularly pleasant. Seeing this spill here made me reply.

Beyond what I already commented, I'd like to add two more non–obvious points and call it a day, a week, a month and a trimester.

The leadership wrote the announcement. They ran it by me to ensure technical accuracy. Here is the entirety of my feedback, sent by email, verbatim:

The only comment I have is that it should be called the Joomla Update process, not the restore process.

I think it’s also best not to mention Admin Tools by name because some people are just looking for an excuse to moan that I somehow put all that work into Joomla Update to self–advertise; as if any rational human would put 200+ hours to get a name drop in a pre–release note which will be forgotten in a month, but you know how vile some people are. Best call it “third party Joomla security software which create or modify a site’s .htaccess file”. Those who know, know. Those who don’t know, don’t need to know either.

The technical aspect is spot on.

You can draw your own conclusions 🤷🏽‍♂️

The other point is the other much less reported benefit of the new Joomla Update backend: error reporting and error resolution.

For the past nine years if something didn't work during the update extraction you'd get a JavaScript alert reading “AJAX Error”. That was it. Supremely unhelpful.

With the new backend I am telling you exactly what went wrong. Even better, I am showing you a practical list of what you need to do to fix the problem. If you know how to use FTP to upload stuff to your server you have the technical competence required for that. For anything more complicated I tell you what to ask your host to do.

I also wrote a most detailed troubleshooting documentation for Joomla Update. Hopefully someone will put it in the docs site; I don't think I have access anymore — or at least I do not seem to have saved a login for it. It's at the top of this PR right now.

I sincerely hope that my contribution does help the community — even those people who complain without having actually used the software yet (it will only be used in the next update, 4.0.4 to 4.0.5).

On a personal note, I came to Mambo in 2004 for the code, I stayed because of the community. I consider the Joomla community to be my second extended family. I care deeply about each and every one of you, regardless of what you think of me. Like all families it's a bit dysfunctional and there's some screaming at each other at times but we still help and respect each other. That's what drives me to write software for Joomla; you're all family to me and I care about you.

image

That said, the recent bout of shouting did take a mental toll on me, much more than I thought it would. I need to take a short break for mental health reasons. I suppose I'll recuperate and regain my resolve before the Joomla 4.1 merge window so I can fix more Joomla bugs. Just not right now. Right now I need to spend some time with my daughter and my wife.

☮️

Labels
Documentation Required Language Change This is for Translators NPM Resource Changed This Pull Request can't be tested by Patchtester