Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize schema managers' ::listTables() methods #4882

Closed
wants to merge 57 commits into from

Conversation

mondrake
Copy link
Contributor

@mondrake mondrake commented Oct 16, 2021

Q A
Type improvement
BC Break no
Fixed issues #2676 #2766

Summary

A new attempt at fixing #2676, which is still an unsolved performance issue for Oracle schema introspection.

Platforms

  • DB2
  • MySql
  • Oracle
  • PostgreSQL
  • SQLite
  • SQLServer

@greg0ire greg0ire requested a review from morozov October 16, 2021 11:55
@morozov
Copy link
Member

morozov commented Oct 17, 2021

Thanks for the patch, @mondrake. I like the idea as such. I've seen a similar approach implemented in a proprietary codebase for exactly the same reason.

The patch looks like a good start but it will require some rework before we can accept it:

  1. There is quite some SQL duplication between the new methods and the existing ones (e.g. getListTableColumnsSQL() → getAllTableColumnsSQL()). This code should be generalized.
  2. OracleSchemaManager shouldn't override listTables(). This logic should be implemented in the base class, and all platforms implement the new methods.
  3. Instead of All in the new method names, it would make sense to use Database for more clarity.
  4. The default implementations that throw the "Not supported" exception should only exist for BC. There should be a migration path defined that will allow replacing them with abstract methods.
  5. Instead of adding new tests for a specific platform, it would be better to reuse the existing cross-platform tests to make sure that each method is tested in all existing scenarios.

Please retarget against 3.2.x since it's a new API.

@mondrake mondrake changed the base branch from 3.1.x to 3.2.x October 17, 2021 09:15
@mondrake mondrake changed the title Optimize OracleSchemaManager::listTables Optimize schema managers' ::listTables() methods Oct 17, 2021
@mondrake
Copy link
Contributor Author

Thanks for feedback @morozov.

I am afraid that generalizing the patch and changing the current process for all platforms is beyond my capabilities, so a helping hand would be appreciated.

Added @morozov's feedback points as tasks in the summary.

@morozov
Copy link
Member

morozov commented Oct 17, 2021

a helping hand would be appreciated.

Are you asking for help in implementing this feature or for guidance on what to do?

The way I see it is:

  1. Declare a new interface and define the new platform methods there. I cannot think of a proper name, it could be decided later. The interface can be deprecated right away since all its methods will become abstract methods in the base platform class in the next major release.
  2. Move the logic implemented in the Oracle schema manager to the base class. Depending on whether the current platform implements this interface, the schema manager should use the new methods or fall back to the old ones.
  3. Get rid of the code duplication in platform methods. In pseudo-code, it might look like this:
    public function getListDatabaseObjectsSQL(string $database): string
    {
       $this->getListObjectsSQL($database, null);
    }
    
    public function getListTableObjectsSQL(string $database, string $table): string
    {
       $this->getListObjectsSQL($database, $table);
    }
    
    private function getListObjectsSQL(string $database, ?string $table): string
    {
        $sql = 'SELECT object_name';
    
        if ($table === null) {
            $sql .= ', table_name';
        }
    
        $sql = ' FROM objects';
    
        if ($table !== null) {
            $sql .= ' WHERE table = ' . $table;
        }
    }

@mondrake
Copy link
Contributor Author

Thanks for guidance @morozov, I will give this a try.

@mondrake mondrake marked this pull request as draft October 20, 2021 07:00
@mondrake mondrake marked this pull request as ready for review October 20, 2021 16:02
@mondrake
Copy link
Contributor Author

@morozov this is going as far as I could get before starting breaking tests that check identicality of SQL strings generated to fetch database assets metadata, for example Doctrine\DBAL\Tests\Platforms\OraclePlatformTest::testReturnsGetListTableColumnsSQL.

Can I get feedback whether this is in the right direction and is it OK to break such tests moving on?

@mondrake mondrake marked this pull request as draft October 20, 2021 16:46
Copy link
Member

@morozov morozov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you're moving in the right direction, @mondrake. As for breaking the tests that cover the generated SQL, I'd say that they do more harm than good. It's fine to temporarily remove the failing ones rather than fix them.

If it turns out that those were the only tests that covered some lines, we'll have to think about replacing them with the functional tests that do not test the SQL. This is how such tests should have been written in the first place.

src/Platforms/DatabaseAsset.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Schema/AbstractSchemaManager.php Outdated Show resolved Hide resolved
src/Schema/AbstractSchemaManager.php Outdated Show resolved Hide resolved
src/Schema/AbstractSchemaManager.php Outdated Show resolved Hide resolved
src/Schema/AbstractSchemaManager.php Outdated Show resolved Hide resolved
tests/Functional/Schema/OracleSchemaManagerTest.php Outdated Show resolved Hide resolved
@mondrake
Copy link
Contributor Author

Fixed some points from last review by @morozov. Posted progress. Still to go.

@mondrake
Copy link
Contributor Author

Also SqlitePlatform can implement DatabaseIntrospectionSQLBuilder.

src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/SqlitePlatform.php Outdated Show resolved Hide resolved
@mondrake mondrake marked this pull request as ready for review October 25, 2021 14:59
Copy link
Member

@morozov morozov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mondrake please rebase on the latest 3.2.x to address the build issue. I haven't checked the actual new queries in detail but it looks like we're getting somewhere. Great work so far!

src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
src/Platforms/OraclePlatform.php Outdated Show resolved Hide resolved
@mondrake
Copy link
Contributor Author

mondrake commented Dec 1, 2021

@morozov I added conversions for all platforms, I know they're rough around the edges, but I worked by reverse engineering and reimplementing. Reviews welcome.

@mondrake mondrake marked this pull request as ready for review December 1, 2021 16:59
@morozov
Copy link
Member

morozov commented Dec 9, 2021

I added conversions for all platforms, I know they're rough around the edges, but I worked by reverse engineering and reimplementing. Reviews welcome

@mondrake sorry, I missed your comment. Thanks a lot! I want to go over the resulting changes once again and see if we can/should generalize anything. Specifically:

  1. The default implementation of AbstractSchemaManager::listTables() seems to be not used any longer.
  2. AbstractSchemaManager::listTableDetails() returns the same kind of results as ::listTables() but now uses a totally different implementation.

My idea is to introduce the following family of interfaces:

interface TableIntrospector
{
    function getTables(string $databaseName, string $tableName): Table;

    /**
     * @return list<Table>
     */
    function getDatabaseTables(string $databaseName): array;
}

interface ColumnIntrospector
{
    /**
     * @return list<Column>
     */
    function getTableColumns(string $databaseName, string $tableName): array;

    /**
     * @return array<string,list<Column>>
     */
    function getDatabaseColumns(string $databaseName): array;
}

// and so on for indexes, foreign keys and options

There will be an implementation of all these interfaces for every platform, and the job of each platform-specific schema manager will be just to build a proper introspector. The logic of building the table or a list of tables from parts should be the same for all platforms and could be implemented once in the base class.

@morozov
Copy link
Member

morozov commented Jan 4, 2022

@mondrake I tried to prototype the new API that I had in mind but it looks like implementing it would require some more unforeseen refactoring. I wouldn't like to delay merging this PR any further, so I think the new API could be introduced later.

But I believe the issues mentioned above still need to be addressed before this PR can be merged:

  1. The default implementation of AbstractSchemaManager::listTables() seems to be not used any longer.
  2. AbstractSchemaManager::listTableDetails() returns the same kind of results as ::listTables() but now uses a totally different implementation.

I.e. we need to implement some common code in the AbstractSchemaManager that uses the new queries consistently across the platforms. Do you think you could try that?

Additionally, instead of merging the upstream branch into yours, please rebase it. Otherwise, it would have to be rebased before the merge, and the same merge conflicts would have to be addressed again.

@mondrake
Copy link
Contributor Author

mondrake commented Jan 5, 2022

Thanks @morozov, I'll give a try but if you're quicker do not hesitate.

Sorry for the mess with branches' merges and rebases, I will git educated one day, eventually :)

@mondrake
Copy link
Contributor Author

mondrake commented Jan 6, 2022

@morozov I changed the listTables() implementation in AbstractSchemaManager to be able to use the new methods and removed the class specific implementations, so to standardize a bit.

If this is OK I will follow up with listTableDetails() and finally rebasing.

@mondrake
Copy link
Contributor Author

mondrake commented Jan 7, 2022

@morozov in order to allow BC, I introduced a new abstract class AbstractDatabaseIntrospectionSchemaManager that extends AbstractSchemaManager, so that all schema managers extend from that one now, while third parties can still extend from AbstractSchemaManager without BC breaks. I am finished - but for a static analysis issue that I cannot understand. A review would be appreciated; I was thinking also to make the selectDatabase* methods return arrays with the results of the query rather than Result objects, that would further streamline the new code in listTables and listTableDetails, what do you think?

*
* @template T of AbstractPlatform
*/
abstract class AbstractDatabaseIntrospectionSchemaManager extends AbstractSchemaManager
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the changes I tried to avoid mentioned in #4882 (comment) are inevitable. Instead of introducing a new proper API we have to introduce a temporary API. This temporary API would have to be broken later in order be replaced with the proper one which is unfortunate.

Probably, it makes sense to continue with the introduction of the Introspector API as I tried earlier. See the PoC:
02608ba...morozov:optimize-list-tables

What stopped me there is that I didn't anticipate at that time that the introspectors will have to depend on the platform in order to convert the type introspected from the database (e.g. VARCHAR) to the DBAL type (e.g. "string").

We have two options here:

  1. Make them accept a platform in the constructor (sort of okay).
  2. Or even better, introduce a new interface that would do only the mapping and accept it instead of the platform. For instance:
    interface TypeMapper
    {
        function mapType(string $type, array $options): Type;
    }
    
    abstract class AbstractPlatform implements TypeMapper
    {
    }

This is as much as I've done here. Do you think you could explore this idea further?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@greg0ire if we're taking the route described above, it would be nice to get #5107 wrapped up before we proceed. This would eliminate the need to overhaul the logic of extracting types from the comments.

Do you believe we are in agreement that extraction of types from the comments is no longer necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@morozov honestly I'd rather leave to you to take next steps if you do not mind - this is going so far from the original scope that I am afraid I would rather slow down the process now by reiterating on your input.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you believe we are in agreement that extraction of types from the comments is no longer necessary?

Yes, but I think we should wait for doctrine/migrations#1227 to be done first, otherwise won't we get a lot of support?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@morozov honestly I'd rather leave to you to take next steps if you do not mind - this is going so far from the original scope that I am afraid I would rather slow down the process now by reiterating on your input.

Sounds fair. I can definitely own it as I see it as an important improvement but I cannot commit to any fixed timeline. I'll take over if it works for you, and thanks for taking it much further than originally planned, @mondrake!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context, @greg0ire. It's not a blocker, especially if @mondrake won't have to deal with carrying over this logic to the new API. But at the same, I'm looking forward to seeing it done sooner than later :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All yours @morozov, and thanks to you for your guidance.

@derrabus derrabus changed the base branch from 3.3.x to 3.4.x January 18, 2022 00:49
@mondrake
Copy link
Contributor Author

mondrake commented Feb 4, 2022

Resolved conficts.

@morozov
Copy link
Member

morozov commented Feb 12, 2022

@mondrake I will close this PR and continue working on the improvement in #5268. Resolving conflicts with the upstream by merging is counterproductive (#4882 (comment)).

@morozov morozov closed this Feb 12, 2022
@mondrake
Copy link
Contributor Author

Thanks @morozov. Yes, I have not learnt yet how to rebase properly. I tried but ended up in endless loops of conflicts. Sorry!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants