
Dropped tables can not be physically removed if syncAllSchema is triggered frequently #9227

Closed
JaySon-Huang opened this issue Jul 12, 2024 · 3 comments


JaySon-Huang commented Jul 12, 2024

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Run the following SQLs repeatedly

CREATE TABLE if not exists test_range_1 (id INT, name VARCHAR(20))
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (10),
    PARTITION p1 VALUES LESS THAN (20),
    PARTITION p2 VALUES LESS THAN (30),
    PARTITION p3 VALUES LESS THAN (40),
    PARTITION p4 VALUES LESS THAN (50)
);

CREATE TABLE if not exists test_swap_1 (id INT,name VARCHAR(20));
ALTER TABLE test_range_1 EXCHANGE PARTITION p0 WITH TABLE test_swap_1;
DROP TABLE test_swap_1;

2. What did you expect to see? (Required)

The dropped table can be successfully removed from the TiFlash instances.

3. What did you see instead (Required)

The dropped tables' IStorage instances remain in TiFlash even after the gc_safepoint exceeds the drop timepoint + gc_lifetime.
More and more ".sql" files of the dropped tables remain on the TiFlash instances' disks, and TiFlash's sync-all-schema becomes slower and slower.
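
For context, a tombstoned table becomes eligible for physical removal only after the gc_safepoint passes its tombstone timestamp. Below is a minimal sketch of that condition; the function and parameter names are illustrative only, not the actual TiFlash API:

    #include <cstdint>

    // Illustrative sketch: a tombstoned table can be physically removed only after
    // the GC safepoint has advanced past its tombstone timestamp. If syncAllSchema
    // keeps overwriting tombstone_ts with a fresh TSO, this condition is never met,
    // so the IStorage instance and its ".sql" file stay on disk.
    bool canPhysicallyRemove(uint64_t tombstone_ts, uint64_t gc_safepoint)
    {
        return tombstone_ts != 0 && tombstone_ts < gc_safepoint;
    }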

4. What is your TiFlash version? (Required)

v6.5.10

@JaySon-Huang (Contributor, Author)

In the previous versions:

  1. Repeatedly execute the given SQLs. The rapidly swapped and dropped table_id leads to warnings like "miss table" or "miss partition table in TiKV" raised with "Errors::DDL::StaleSchema"

    /// step 3 change partition of the partition table to non partition table
    table_info = getter.getTableInfo(npt_db_info->id, pt_partition_id);
    if (table_info == nullptr)
    {
        LOG_WARNING(log, "Execute exchange partition, the table info of partition can not get from TiKV, npt_database_id={} partition_id={}", npt_database_id, pt_partition_id);
        throw TiFlashException(fmt::format("miss partition table in TiKV, may have been dropped, physical_table_id={}", pt_partition_id), Errors::DDL::StaleSchema);
    }

  2. When TiFlash meets an exception in tryLoadSchemaDiffs, it falls back to loadAllSchema to fetch all table info from TiKV.

    Int64 version_after_load_diff = 0;
    if (version_after_load_diff = tryLoadSchemaDiffs(getter, cur_version, version, context, ks_log); version_after_load_diff == -1)
    {
        GET_METRIC(tiflash_schema_apply_count, type_full).Increment();
        version_after_load_diff = loadAllSchema(getter, version, context);
    }

  3. In SchemaBuilder::syncAllSchema, for each table that no longer exists in TiKV but still exists in the TiFlash instance, TiFlash executes applyDropPhysicalTable, which updates the "tombstone_ts" of the table in its ".sql" file.
    Even if the table is already "tombstone", the function updates the "tombstone_ts" with a newer TSO from PD. This keeps the IStorage instance alive even after the gc_safepoint exceeds the drop timepoint + gc_lifetime (see the sketch after the code below).

    /// Drop all unmapped tables.
    auto storage_map = tmt_context.getStorages().getAllStorage();
    for (auto it = storage_map.begin(); it != storage_map.end(); it++)
    {
        auto table_info = it->second->getTableInfo();
        if (table_info.keyspace_id != keyspace_id)
        {
            continue;
        }
        if (table_set.count(table_info.id) == 0)
        {
            applyDropPhysicalTable(it->second->getDatabaseName(), table_info.id);
            LOG_INFO(log, "Table {}.{} dropped during sync all schemas", it->second->getDatabaseName(), name_mapper.debugTableName(table_info));
        }
    }

    template <typename Getter, typename NameMapper>
    void SchemaBuilder<Getter, NameMapper>::applyDropPhysicalTable(const String & db_name, TableID table_id)
    {
        auto & tmt_context = context.getTMTContext();
        auto storage = tmt_context.getStorages().get(keyspace_id, table_id);
        if (storage == nullptr)
        {
            LOG_DEBUG(log, "table {} does not exist.", table_id);
            return;
        }
        GET_METRIC(tiflash_schema_internal_ddl_count, type_drop_table).Increment();
        LOG_INFO(log, "Tombstoning table {}.{}", db_name, name_mapper.debugTableName(storage->getTableInfo()));
        const UInt64 tombstone_ts = PDClientHelper::getTSO(tmt_context.getPDClient(), PDClientHelper::get_tso_maxtime);
        AlterCommands commands;
        {
            AlterCommand command;
            command.type = AlterCommand::TOMBSTONE;
            // We don't try to get a precise time that TiDB drops this table.
            // We use a more relaxing GC strategy:
            // 1. Use current timestamp, which is after TiDB's drop time, to be the tombstone of this table;
            // 2. Use the same GC safe point as TiDB.
            // In such way our table will be GC-ed later than TiDB, which is safe and correct.
            command.tombstone = tombstone_ts;
            commands.emplace_back(std::move(command));
        }
        auto alter_lock = storage->lockForAlter(getThreadNameAndID());
        storage->alterFromTiDB(alter_lock, commands, db_name, storage->getTableInfo(), name_mapper, context);
        LOG_INFO(log, "Tombstoned table {}.{}, tombstone={}", db_name, name_mapper.debugTableName(storage->getTableInfo()), tombstone_ts);
    }
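
    A minimal sketch of one possible guard, assuming the storage exposes an isTombstone() accessor (the accessor and its placement right after the nullptr check are assumptions for illustration, not necessarily the actual fix):

    // Sketch only: if the table is already tombstoned, keep the original tombstone_ts
    // so that it can eventually fall behind the gc_safepoint and the table gets
    // physically removed. NOTE: storage->isTombstone() is assumed for illustration.
    if (storage->isTombstone())
    {
        LOG_DEBUG(log, "table {} is already tombstoned, skip updating tombstone_ts", table_id);
        return;
    }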

@seiya-annie

/report customer

@JaySon-Huang (Contributor, Author)

Closing, as this will be fixed in the next patch versions of release-6.5 and release-7.1. v7.5 and later are not affected.
