Parallelize table generation #203

sfc-gh-cnivera · 2024-10-30T20:51:24Z

Investigating getting this working in SiS

When generating a semantic model, we currently iterate over every table the user has specified and generate a representation for each one. This is done sequentially, which leads to long latencies for models with many tables. This PR generates the table representations concurrently, which drastically reduces generation time.

For example, here are the generation times for a large semantic model containing 20 tables from COVID19_EPIDEMIOLOGICAL_DATA.

Sequential processing:

2024-10-30 13:54:26.497 | INFO - Time to generate semantic model: 187.59221196174622 seconds.

Concurrent processing:

2024-10-30 13:49:39.478 | INFO - Time taken to generate semantic model: 29.945907831192017 seconds.
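
For readers who want a picture of the approach, here is a minimal sketch of concurrent table generation using `concurrent.futures` and `wait()`, which the commit messages in this PR mention. It is not the code from this PR: `build_table`, `build_tables_concurrently`, and the `max_workers` default are hypothetical names and values chosen for illustration, and the `time.sleep` call only stands in for the per-table work.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Dict, List


def build_table(table_name: str) -> Dict[str, str]:
    # Hypothetical stand-in for the per-table work (metadata queries, sampling).
    # The sleep simulates I/O-bound latency; it is not the real implementation.
    time.sleep(0.05)
    return {"name": table_name}


def build_tables_concurrently(
    table_names: List[str], max_workers: int = 4
) -> List[Dict[str, str]]:
    # Nothing to do for an empty selection.
    if not table_names:
        return []

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit one task per table, keeping the futures in submission order.
        futures = [executor.submit(build_table, name) for name in table_names]

        # Block until every task has finished (successfully or with an error).
        wait(futures)

        # Reading results in submission order keeps the tables in the order
        # the user specified; result() re-raises any exception from a worker.
        return [future.result() for future in futures]


if __name__ == "__main__":
    for table in build_tables_concurrently(["TABLE_A", "TABLE_B", "TABLE_C"]):
        print(table["name"])
```

Threads suit this kind of workload when the per-table work is dominated by waiting on queries rather than CPU; collecting results from the futures list rather than in completion order keeps the output deterministic.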

sfc-gh-cnivera · 2024-10-30T20:53:11Z

semantic_model_generator/generate_model.py

-        fqn_table = create_fqn_table(table)
-        fqn_databse_schema = f"{fqn_table.database}.{fqn_table.schema_name}"
-
-        if fqn_databse_schema not in unique_database_schema:


I don't think we were using unique_database_schema anywhere, so I removed it.

sfc-gh-cnivera added 3 commits October 30, 2024 13:46

use concurrent futures

357151d

comments

36898b3

use wait()

8dac1b6

sfc-gh-cnivera commented Oct 30, 2024

sfc-gh-cnivera marked this pull request as ready for review October 30, 2024 21:01

sfc-gh-cnivera requested review from sfc-gh-rehuang and sfc-gh-jsummer as code owners October 30, 2024 21:01

sfc-gh-cnivera marked this pull request as draft October 31, 2024 00:18

sfc-gh-cnivera added 2 commits October 31, 2024 15:08

fix workers

edbeb4d

use array_slice and array_unique_agg

3b64e22
