Drop crates.textsearchable_index_col from the export #3612

jtgeibel · 2021-05-11T04:05:51Z

This column is postgres specific and is normally populated via a
trigger. The trigger is now enabled during the import so that the column
can be dropped from the export.

This addresses part of what was raised in bullet point 2 of #2078. The large readme column remains because there could be people using that data, but the text search column is redundant.

r? @smarnach
cc @kornelski

pietroalbini

The code changes look good!

I tested locally how this affects the time it takes to import a database dump, since the trigger has to run for every crate we insert. To test that, I downloaded today's dump and removed version_downloads, as it takes ages to import and is not relevant here. I then made a copy of the dump with the changes from this PR applied (enabling the trigger and removing the column from the dump):

Status quo: 42.3 seconds (176 MB crates.csv)
With this PR: 56.7 seconds (95 MB crates.csv)

So with this PR, importing the dump is 34% slower. We're removing 46% of the data in crates.csv though, which will definitely help with download size and with everyone who parses the CSVs manually instead of importing them into PostgreSQL.
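For reference, the two percentages follow directly from the measurements above. A minimal sketch of the arithmetic (the helper name `percent_change` is only illustrative):

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage of `before`."""
    return (after - before) / before * 100.0


# Import time in seconds: status quo vs. with this PR.
import_slowdown = percent_change(42.3, 56.7)

# Size of crates.csv in MB: status quo vs. with this PR (negated to get a reduction).
csv_reduction = -percent_change(176, 95)

print(f"import time: +{import_slowdown:.0f}%")
print(f"crates.csv size: -{csv_reduction:.0f}%")
```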

Personally I'd say the tradeoff is worth it.
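A rough sketch of how a copy of crates.csv without the column could be produced for a comparison like this one. This is not the script used for the measurements above; the helper `drop_column` and the sample rows are hypothetical, and the real file has more columns than shown here.

```python
import csv
import io
from typing import TextIO

# Some fields in the dump (for example the readme) can exceed the csv
# module's default field size limit, so raise it to a large fixed value.
csv.field_size_limit(2**31 - 1)


def drop_column(src: TextIO, dst: TextIO, column: str) -> None:
    """Copy CSV data from `src` to `dst`, leaving out the named column."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")

    header = next(reader)
    idx = header.index(column)  # raises ValueError if the column is absent

    writer.writerow(header[:idx] + header[idx + 1:])
    for row in reader:
        writer.writerow(row[:idx] + row[idx + 1:])


# Made-up sample data standing in for crates.csv.
sample = io.StringIO(
    "id,name,textsearchable_index_col\n"
    "1,serde,'serd':1\n"
)
stripped = io.StringIO()
drop_column(sample, stripped, "textsearchable_index_col")
print(stripped.getvalue())
```

For the real file, `src` and `dst` would be file objects opened with `newline=""`, as the `csv` module documentation recommends.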

src/tasks/dump_db/dump-import.sql.hbs

jtgeibel · 2021-05-14T02:51:29Z

Thanks for the analysis of the space/time tradeoff. I've pushed a new commit with your consistency suggestion.

pietroalbini · 2021-05-14T08:54:42Z

@bors r+

bors · 2021-05-14T08:54:43Z

📌 Commit 8fb7d30 has been approved by pietroalbini

bors · 2021-05-14T08:54:50Z

⌛ Testing commit 8fb7d30 with merge e421caa...

bors · 2021-05-14T09:02:38Z

☀️ Test successful - checks-actions
Approved by: pietroalbini
Pushing e421caa to master...

rust-highfive assigned smarnach May 11, 2021

rust-highfive added the S-waiting-on-review label May 11, 2021

pietroalbini approved these changes May 13, 2021

pietroalbini assigned pietroalbini and unassigned smarnach May 13, 2021

Drop crates.textsearchable_index_col from the export

8fb7d30

jtgeibel force-pushed the drop-tsvector-from-db-export branch from 0e1a3dc to 8fb7d30 May 14, 2021 02:49

bors merged commit e421caa into rust-lang:master May 14, 2021

pietroalbini mentioned this pull request May 14, 2021

Experimental database dumps changelog #3617

Open

dtolnay mentioned this pull request May 19, 2021

Drop crates.textsearchable_index_col dtolnay/db-dump#2

Merged

rust-lang deleted a comment from rupert1975 Jun 6, 2022
