Reduce the number of files to avoid inodes exhausting #7595

Closed
JinheLin opened this issue Jun 5, 2023 · 1 comment · Fixed by #7771, #7897 or #7898
Labels: component/storage, type/enhancement

Comments


JinheLin commented Jun 5, 2023

Enhancement

A customer has more than 40,000 tables or partitions.
Each table or partition has about 70 to 80 columns.
Because TiFlash generates up to 5 files (dat, mrk, null.dat, null.mrk, idx) for each column, initialization alone creates about 350 to 400 small files for each table or partition.

As a result, the inodes of the ext4 filesystem are exhausted.
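
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It only multiplies the figures quoted above (40,000 tables or partitions, 70 to 80 columns, up to 5 files per column). The function names are made up for illustration, and the estimate ignores any per-table metadata files, so treat it as a lower bound rather than a measurement.

```python
# Rough estimate of how many files (and therefore inodes) the case above needs.
# Assumption: every column produces the full 5 files; metadata files are ignored.

FILES_PER_COLUMN = 5  # dat, mrk, null.dat, null.mrk, idx


def files_per_table(columns: int, files_per_column: int = FILES_PER_COLUMN) -> int:
    """Number of column files created for one table or partition."""
    return columns * files_per_column


def total_files(tables: int, columns: int) -> int:
    """Number of column files created across all tables or partitions."""
    return tables * files_per_table(columns)


if __name__ == "__main__":
    for columns in (70, 80):
        print(columns, files_per_table(columns), total_files(40_000, columns))
```

With these inputs the total is between 14 and 16 million files, each of which consumes one inode, even though most of the files are empty or tiny.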


JinheLin added the type/enhancement and component/storage labels Jun 5, 2023

JinheLin commented Jun 12, 2023

Reducing the number of files in TiFlash can be decomposed into two sub-problems.

  1. Reducing the number of files of an empty table.
  • When TiFlash creates an empty table, it creates an empty segment, which in turn creates many empty files. So it is important to reduce the number of files of an empty table.
  • Option 1: do not create an empty segment when an empty table is created. Create the first segment only when the first write request arrives.
    • However, if only a small amount of data is written, the data is saved only in Delta/PageStorage, yet a Stable/DMFile with many empty files is still created.
  • Option 2: create the Stable/DMFile only when it has data.
    • A lot of code assumes that stable is not null, so this option has a significant impact on the codebase.
  2. Merging small files into larger ones.
  • Merge packStat, packProperty and meta into one metadata file. This feature is already supported in S3 mode, when STORAGE_FORMAT_CURRENT.dm_file is DMFileFormat::V3. We need to make this feature support op mode.
  • TiFlash generates up to 5 files (dat, mrk, null.dat, null.mrk, idx) for each column. For simplicity, we can consider merging the small files such as mrk, null.dat, null.mrk, and idx.
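
To illustrate the second sub-problem, here is a minimal Python sketch of one generic way to merge several small files into a single file: concatenate the contents and append a footer that maps each original name to its offset and size. This is not TiFlash's actual on-disk format. The layout, the function names `pack_small_files` and `read_part`, and the sample file names are all assumptions made up for the example.

```python
import io
import json
import struct
from typing import Dict


def pack_small_files(parts: Dict[str, bytes]) -> bytes:
    """Concatenate named blobs into one buffer and append a footer index.

    Hypothetical layout: [blob 0][blob 1]...[JSON index][8-byte index length].
    """
    body = io.BytesIO()
    index = {}
    for name, data in parts.items():
        # Record where this blob starts and how long it is.
        index[name] = (body.tell(), len(data))
        body.write(data)
    footer = json.dumps(index).encode("utf-8")
    # The last 8 bytes store the footer length as a little-endian uint64.
    return body.getvalue() + footer + struct.pack("<Q", len(footer))


def read_part(merged: bytes, name: str) -> bytes:
    """Return the bytes of one original blob from the merged buffer."""
    (footer_len,) = struct.unpack("<Q", merged[-8:])
    footer = merged[-8 - footer_len:-8]
    offset, size = json.loads(footer)[name]
    return merged[offset:offset + size]


if __name__ == "__main__":
    # Hypothetical small files of one column; empty blobs cost no extra inode.
    column_files = {
        "col_1.mrk": b"marks",
        "col_1.null.dat": b"",
        "col_1.null.mrk": b"null-marks",
        "col_1.idx": b"min-max-index",
    }
    merged = pack_small_files(column_files)
    print(read_part(merged, "col_1.idx"))
```

In this sketch four small files collapse into one, so a column would need 2 inodes (dat plus the merged file) instead of up to 5. A real implementation would also have to handle checksums, format versioning and compatibility with existing DMFiles, which the sketch leaves out.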
