Parallelize table generation #203

Draft
wants to merge 5 commits into
base: main

sfc-gh-cnivera
Collaborator

@sfc-gh-cnivera sfc-gh-cnivera commented Oct 30, 2024

Investigating getting this working in SiS

When generating a semantic model, we currently iterate through all of the tables the user has specified and generate a representation for each one. This is done sequentially, so total latency grows with the number of tables. This PR parallelizes table generation, which cuts generation time by roughly 6x in the example below.

For example, here are the generation times for a large semantic model containing 20 tables from COVID19_EPIDEMIOLOGICAL_DATA.

Sequential processing:

2024-10-30 13:54:26.497 | INFO - Time to generate semantic model: 187.59221196174622 seconds.

Concurrent processing:

2024-10-30 13:49:39.478 | INFO - Time taken to generate semantic model: 29.945907831192017 seconds.
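
For readers who want a picture of the general technique, here is a minimal sketch of fanning per-table work out to a thread pool. It is not the code in this PR: `generate_tables_concurrently`, `fake_generate_table`, and the `max_workers` default are hypothetical names and values chosen for illustration, and the sketch assumes the per-table work is I/O-bound (waiting on queries), which is where threads tend to help.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")


def generate_tables_concurrently(
    table_names: List[str],
    generate_one: Callable[[str], T],
    max_workers: int = 8,  # hypothetical default, tune for the warehouse
) -> List[T]:
    """Run generate_one for every table name using a thread pool.

    Results come back in the same order as table_names, because
    Executor.map yields results in input order.
    """
    if not table_names:
        return []
    workers = min(max_workers, len(table_names))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Any exception raised inside generate_one is re-raised here
        # when its result is consumed by list().
        return list(pool.map(generate_one, table_names))


def fake_generate_table(name: str) -> str:
    """Stand-in for the real per-table generation (hypothetical)."""
    time.sleep(0.05)  # simulate waiting on a query
    return f"repr({name})"
```

Calling `generate_tables_concurrently(["A", "B", "C"], fake_generate_table)` takes about as long as one simulated query instead of three, while keeping the output order aligned with the input list.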

fqn_table = create_fqn_table(table)
fqn_databse_schema = f"{fqn_table.database}.{fqn_table.schema_name}"

if fqn_databse_schema not in unique_database_schema:
Collaborator Author

I don't think we were using unique_database_schema anywhere, so I removed it.

@sfc-gh-cnivera sfc-gh-cnivera marked this pull request as ready for review October 30, 2024 21:01
@sfc-gh-cnivera sfc-gh-cnivera marked this pull request as draft October 31, 2024 00:18