Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow performance when running dbt (in docker) #1002

Closed
elexisvenator opened this issue Sep 14, 2018 · 5 comments
Closed

Slow performance when running dbt (in docker) #1002

elexisvenator opened this issue Sep 14, 2018 · 5 comments

Comments

@elexisvenator
Copy link
Contributor

dbt seems to be very slow when it is executed. This issue happens inconsistently across our developer's machines without any reliable way of reproducing. This changed when we started using docker, which was consistently slow.

Key characteristics of the behaviour are:

  • Affects dbt seed and dbt run, does not affect dbt test
  • dbt appears to pause for 1 second between executing each model. This gets really painful in development when there are a lot of models.
  • no attempt at concurrency is made.

I have made a minimal repository that reproduces the issue, the scripts are configured for windows but can be converted to run on linux/mac as well, as long as docker is available.
https://github.com/elexisvenator/dbt-performance-issue/tree/master

Running the script ./data_transform.sh will result in the following output. Note the time between each executed step compared to that step's duration.

Found 8 models, 18 tests, 0 archives, 0 analyses, 153 macros, 1 operations, 4 seed files

06:31:11 | Concurrency: 4 threads (target='dev')
06:31:11 |
06:31:11 | 1 of 4 START seed file dbt_load.load_provider........................ [RUN]
06:31:11 | 2 of 4 START seed file dbt_load.load_job............................. [RUN]
06:31:11 | 3 of 4 START seed file dbt_load.load_brand........................... [RUN]
06:31:11 | 4 of 4 START seed file dbt_load.load_position........................ [RUN]
06:31:13 | 1 of 4 OK loaded seed file dbt_load.load_provider.................... [INSERT 4 in 0.16s]
06:31:14 | 2 of 4 OK loaded seed file dbt_load.load_job......................... [INSERT 2 in 0.28s]
06:31:15 | 4 of 4 OK loaded seed file dbt_load.load_position.................... [INSERT 7 in 0.23s]
06:31:16 | 3 of 4 OK loaded seed file dbt_load.load_brand....................... [INSERT 2 in 0.26s]
06:31:16 |
06:31:16 | Finished running 4 seeds in 4.29s.

Completed successfully

Done. PASS=4 ERROR=0 SKIP=0 TOTAL=4
Found 8 models, 18 tests, 0 archives, 0 analyses, 153 macros, 1 operations, 4 seed files

06:31:22 | Concurrency: 4 threads (target='dev')
06:31:22 |
06:31:22 | 1 of 8 START view model dbt_base.base_position....................... [RUN]
06:31:22 | 2 of 8 START view model dbt_base.base_job............................ [RUN]
06:31:22 | 3 of 8 START view model dbt_base.base_provider....................... [RUN]
06:31:22 | 4 of 8 START view model dbt_base.base_brand.......................... [RUN]
06:31:23 | 3 of 8 OK created view model dbt_base.base_provider.................. [CREATE VIEW in 0.13s]
06:31:24 | 1 of 8 OK created view model dbt_base.base_position.................. [CREATE VIEW in 0.16s]
06:31:25 | 4 of 8 OK created view model dbt_base.base_brand..................... [CREATE VIEW in 0.13s]
06:31:26 | 2 of 8 OK created view model dbt_base.base_job....................... [CREATE VIEW in 0.16s]
06:31:26 | 5 of 8 START table model dbt_dw.dim_position......................... [RUN]
06:31:26 | 6 of 8 START table model dbt_dw.bridge_position_hierarchy............ [RUN]
06:31:27 | 6 of 8 OK created table model dbt_dw.bridge_position_hierarchy....... [SELECT 14 in 0.11s]
06:31:28 | 5 of 8 OK created table model dbt_dw.dim_position.................... [SELECT 7 in 0.12s]
06:31:28 | 7 of 8 START table model dbt_dw.dim_provider......................... [RUN]
06:31:29 | 7 of 8 OK created table model dbt_dw.dim_provider.................... [SELECT 4 in 0.11s]
06:31:29 | 8 of 8 START table model dbt_dw.dim_job.............................. [RUN]
06:31:30 | 8 of 8 OK created table model dbt_dw.dim_job......................... [SELECT 2 in 0.06s]
06:31:30 |
06:31:30 | Finished running 4 view models, 4 table models in 8.18s.

Completed successfully

Done. PASS=8 ERROR=0 SKIP=0 TOTAL=8
Found 8 models, 18 tests, 0 archives, 0 analyses, 153 macros, 1 operations, 4 seed files

06:31:35 | Concurrency: 4 threads (target='dev')
06:31:35 |
06:31:36 | 1 of 18 START test null_when_parent_column_null_dim_provider_parent_position_name__parent_position_id [RUN]
06:31:36 | 2 of 18 START test not_null_bridge_position_hierarchy_superior_position_id [RUN]
06:31:36 | 3 of 18 START test test_all_columns_match_naming_convention.......... [RUN]
06:31:36 | 4 of 18 START test test_all_tables_match_naming_convention........... [RUN]
06:31:36 | 1 of 18 PASS null_when_parent_column_null_dim_provider_parent_position_name__parent_position_id [PASS in 0.11s]
06:31:36 | 5 of 18 START test relationships_dim_provider_position_id__parent_position_id__ref_dim_position_ [RUN]
06:31:36 | 4 of 18 PASS test_all_tables_match_naming_convention................. [PASS in 0.09s]
06:31:36 | 6 of 18 START test unique_dim_position_position_id................... [RUN]
06:31:36 | 3 of 18 PASS test_all_columns_match_naming_convention................ [PASS in 0.10s]
06:31:36 | 2 of 18 PASS not_null_bridge_position_hierarchy_superior_position_id. [PASS in 0.13s]
06:31:36 | 7 of 18 START test relationships_bridge_position_hierarchy_position_id__superior_position_id__ref_dim_position_ [RUN]
06:31:36 | 8 of 18 START test test_bridge_position_hierarchy_max_recursion...... [RUN]
06:31:36 | 5 of 18 PASS relationships_dim_provider_position_id__parent_position_id__ref_dim_position_ [PASS in 0.06s]
06:31:36 | 9 of 18 START test not_null_dim_provider_provider_id................. [RUN]
06:31:36 | 6 of 18 PASS unique_dim_position_position_id......................... [PASS in 0.05s] in 0.05s]
06:31:36 | 10 of 18 START test unique_dim_provider_provider_id.................. [RUN]position_id__ref_dim_position_ [PASS in 0.07s]                                                                                      on_name__parent_position_id [RUN]
06:31:36 | 7 of 18 PASS relationships_bridge_position_hierarchy_position_id__superior_ in 0.09s]position_id__ref_dim_position_ [PASS in 0.07s]                                         in 0.06s]
06:31:36 | 11 of 18 START test null_when_parent_column_null_dim_position_parent_position_name__parent_position_id [RUN]                                                     _id [RUN]
06:31:36 | 8 of 18 PASS test_bridge_position_hierarchy_max_recursion............ [PASS in 0.07s] in 0.09s]
06:31:36 | 9 of 18 PASS not_null_dim_provider_provider_id....................... [PASSe__parent_position_id [PASS in 0.07s] in 0.06s]                                                                            bordinate_position_id__ref_dim_position_ [RUN]
06:31:36 | 12 of 18 START test not_null_dim_position_position_id................ [RUN] in 0.06s]                                                                                      __position_id [RUN]
06:31:36 | 13 of 18 START test not_null_bridge_position_hierarchy_subordinate_position in 0.06s]_id [RUN]                                                                             ASS in 0.08s]
06:31:36 | 10 of 18 PASS unique_dim_provider_provider_id........................ [PASSf_dim_position_ [RUN] in 0.07s]                                                                            _id__ref_dim_position_ [RUN]
06:31:36 | 14 of 18 START test not_null_bridge_position_hierarchy_levels_apart.. [RUN]ate_position_id__ref_dim_position_ [PASS in 0.08s]                                                                                      tion_id [PASS in 0.07s]
06:31:36 | 11 of 18 PASS null_when_parent_column_null_dim_position_parent_position_namef_dim_position_ [PASS in 0.05s]e__parent_position_id [PASS in 0.07s]                                                 position_ [PASS in 0.07s]
06:31:36 | 15 of 18 START test relationships_bridge_position_hierarchy_position_id__subordinate_position_id__ref_dim_position_ [RUN]06:31:36 | 12 of 18 PASS not_null_dim_position_position_id...................... [PASS in 0.06s]
06:31:36 | 16 of 18 START test null_when_parent_column_null_dim_provider_position_name__position_id [RUN]06:31:36 | 14 of 18 PASS not_null_bridge_position_hierarchy_levels_apart........ [PASS in 0.06s]
06:31:36 | 13 of 18 PASS not_null_bridge_position_hierarchy_subordinate_position_id [PASS in 0.08s]06:31:36 | 17 of 18 START test relationships_dim_provider_position_id__position_id__ref_dim_position_ [RUN]
06:31:36 | 18 of 18 START test relationships_dim_position_position_id__parent_position_id__ref_dim_position_ [RUN]06:31:36 | 15 of 18 PASS relationships_bridge_position_hierarchy_position_id__subordinate_position_id__ref_dim_position_ [PASS in 0.08s]
06:31:36 | 16 of 18 PASS null_when_parent_column_null_dim_provider_position_name__position_id [PASS in 0.07s]06:31:36 | 18 of 18 PASS relationships_dim_position_position_id__parent_position_id__ref_dim_position_ [PASS in 0.05s]
06:31:36 | 17 of 18 PASS relationships_dim_provider_position_id__position_id__ref_dim_position_ [PASS in 0.07s]
Completed successfully
Done. PASS=18 ERROR=0 SKIP=0 TOTAL=18

I am happy to help with any troubleshooting :)

@drewbanin
Copy link
Contributor

Thanks @elexisvenator! The repo you linked is going to be super useful here. We just cooked up an issue to add performance instrumentation which should definitely help us pinpoint any issues here!

We'll dig into this and let you know what we find

@elexisvenator
Copy link
Contributor Author

Hi @drewbanin
I jumped ahead with the profiler in #1020 and it came back with some interesting results.
Attached is the profiling output for running dbt run on this project. local_run_output.profile is running dbt from my OS, and container_run_output.profile is running dbt from within the container. In both cases, the same postgres database was targeted and the threadcount was set to 1.

image
Run locally

image
Run in container

There seems to be a lot of thread waiting for no good reason.

profile_output.zip

@elexisvenator
Copy link
Contributor Author

@drewbanin
Local dev build times are approaching 4 minutes now :( so I spend some more time playing with the profiling.

I modified runner.py to not use a threadpool, instead of pool.imap_unordered() it just uses a regular map(). Got some interesting results:

  • Thread management is not the problem, dbt is just as slow as it was before.
  • The profiler is a lot easier to read now as several layers of thread management are removed.

And the winner is....
image
SSL Handshakes!

Yes, of the 191 seconds it takes to process 159 models, it spends 154s processing ssl and connections. Does this mean that connections are not being cached correctly?

Also, I have only witness this happen inside a docker container talking to a postgres db also in docker.
Hopefully this helps you find the cause. (profiler output attached)
output.zip

@drewbanin
Copy link
Contributor

Thanks @elexisvenator! I DM'd you with some questions. I think we can get to the root of this, then update here for anyone else having this issue

@drewbanin
Copy link
Contributor

drewbanin commented Oct 15, 2018

@elexisvenator I recorded an issue over here: #1063

The near-term workaround is to disable tracking in your profiles.yml file as shown here: https://docs.getdbt.com/docs/configure-your-profile#section-usage-statistics

Thanks for digging into this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants