Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix docker-based testing for Linux users #2741

Merged
merged 1 commit into from
Sep 16, 2020
Merged

Fix docker-based testing for Linux users #2741

merged 1 commit into from
Sep 16, 2020

Conversation

heisencoder
Copy link
Contributor

This change enables Linux users to run the dbt tests via the docker image. It follows the recommendations from this article: https://jtreminio.com/blog/running-docker-containers-as-current-host-user/

Notable changes:

  • Added new Makefile rule to generate a .env file that contains USER_ID and GROUP_ID environment variables to the ID of the current user. This is in turn used by docker-compose and the Dockerfile to make the Docker image run as the current user. (Note that on Windows or Mac, this behavior happens by default).
  • Reordered Dockerfile to allow for better caching of intermediate images (i.e., put things that don't depend on ARGS earlier).

resolves #2739

Description

This change enables Linux users to run the dbt tests via the docker image.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

@cla-bot cla-bot bot added the cla:yes label Sep 3, 2020
@beckjake
Copy link
Contributor

beckjake commented Sep 4, 2020

Hmm. Do you have any idea why the integration tests wouldn't be able to pick things up from their environment now @heisencoder? That's the key difference between postgres and the other tests in CI.

I can rebuild the test container and push a :8 tag if that's all that is required.

@heisencoder
Copy link
Contributor Author

I'll have a chance to dig in more this afternoon...

@jtcohen6 jtcohen6 changed the base branch from dev/marian-anderson to dev/0.18.1 September 4, 2020 19:28
@heisencoder
Copy link
Contributor Author

I was not able to reproduce the BigQuery failures by running locally: $ docker-compose run test tox -e integration-bigquery-py36. I think this is related to the CircleCI environment somehow.

@beckjake , how is the CI testing run? Does it invoke the Makefile or otherwise use docker-compose? The change I made creates a new .env file with two fields in it. If the CircleCI stuff is also using a .env file, then my changes will overwrite those environment variables. I could look into a way to instead append those two lines to a possibly existing .env file.

@beckjake
Copy link
Contributor

beckjake commented Sep 5, 2020

We set them as environment variables in circleci itself, and circleci runs the test in a docker container and stuffs them in somehow (-e, I guess?). They do some sort of magic but I assumed it was layered on top of any docker-specific env files. The reason I wondered if I have to push a new image is that we specify the docker container to use here: https://github.com/fishtown-analytics/dbt/blob/dev/marian-anderson/.circleci/config.yml#L5

The alternative, as far as I could tell, was to have circle build the docker image fresh each time. I'm far too impatient for that to happen every test run!

@heisencoder
Copy link
Contributor Author

@beckjake In the CircleCI config, how is the fishtownanalytics/test-container image built? Is that just based on Dockerfile in the top-level directory?

@beckjake
Copy link
Contributor

beckjake commented Sep 9, 2020

@heisencoder yup. I build it locally with docker build . and then tag + push it.

@heisencoder
Copy link
Contributor Author

Having dug into this more, I don't see how any of the files that I've changed would affect any of these integration tests, since the CircleCI tests have an independent config from the docker-compose.yml file. My Dockerfile changes will be relevant, though, after the new fishtownanalytics/test-container image gets built, since I've updated it to require parameters, but that wouldn't explain these failures. I'll look into making these parameters optional.

My best guess right now is that my personal CircleCI project for dbt has no environment variables in the CircleCI project settings, and every database but Postgres uses secrets (Postgres has the environment variables hard-coded in .circleci/config.yml).

@beckjake does that match your understanding that I would need to add personal environment variables for CircleCI testing?

@beckjake
Copy link
Contributor

beckjake commented Sep 9, 2020

Because it's a PR against our repository, I don't think you should have to, other PRs seem to have no trouble running those tests.

Tomorrow night or Friday I'll build a v9 of this container with the new Dockerfile (I forgot I already built a v8 we never used) and push that, and you can point the circleci.yml file at :9 and we'll see what happens? I can't think of any reason this would matter either, but it's all I can come up with.

@heisencoder
Copy link
Contributor Author

I added my own environment parameters to my CircleCI config for BigQuery, and confirmed that it is fact hitting my BigQuery project that is referenced by my JSON Service Account credentials. So I'm guessing either CircleCI's offerings have changed, or that I somehow accidentally set up CircleCI differently than other contributors.

BigQuery had these three failures, which I'm assuming are because I misconfigured my BigQuery setup (in particular, I didn't set BIGQUERY_POLICY_TAG):

FAILED test/integration/022_bigquery_test/test_bigquery_update_columns.py::TestBigqueryUpdateColumnPolicyTag::test__bigquery_update_column_policy_tag
FAILED test/integration/026_aliases_test/test_aliases.py::TestSameAliasDifferentDatabases::test__bigquery_same_alias_succeeds_in_different_schemas
FAILED test/integration/054_adapter_methods_test/test_adapter_methods.py::TestGrantAccess::test_bigquery_adapter_methods

Snowflake and Redshift failed due to missing environment variables (I don't have these databases set up so didn't add environment variables for them).

I've uploaded a new change that allows the Dockerfile to run without explicit parameters for USER_ID and GROUP_ID, which will be needed for when we recompile the Docker image for CircleCI integration tests.

@beckjake if this change otherwise looks good, if you can build and tag my Dockerfile then I can update the CircleCI config to test out the new image so that we can confirm that it doesn't break the integration tests. Thanks!

@beckjake
Copy link
Contributor

@heisencoder - I've pushed fishtownanalytics/test-container:9

@heisencoder
Copy link
Contributor Author

Thanks @beckjake ! I've updated the reference from 7 to 9 and re-pushed.

The integration tests are failing as expected because I only set up the environment for BigQuery, but not for Redshift or Snowflake. I got the same BigQuery failures mentioned earlier. Do you have an option for running the integration tests against the Fishtown environment?

I'll also rebase my commit history to clean up the merges.

@beckjake
Copy link
Contributor

I will push a copy of this branch on my end to see if that works, but something is not right here. The tests should work for you without my intervention, we changed our circleci config a while ago to pass secrets from forked pull requests.

So either the circleci setting is not being honored (surely we're not the only ones using this and it must have annoyed many people), or something about this branch in particular is making it not work.

@heisencoder
Copy link
Contributor Author

Here's an example BigQuery failure when I created the pull request: https://app.circleci.com/pipelines/github/heisencoder/dbt/6/workflows/60484f20-575d-43eb-a715-045180679e80/jobs/45

Here's a recent one that uses the fishtown-analytics: https://app.circleci.com/pipelines/github/fishtown-analytics/dbt/1220/workflows/421bf035-cf21-46e3-9fcf-f7230b710b9e/jobs/30852

CircleCI for me is picking the heisencoder organization instead of fishtown-analytics. I did go through some setup that CircleCI was requesting, so maybe I somehow configured it to use my personal organization instead. I'll pick through my CircleCI config to see what might have happened.

@beckjake
Copy link
Contributor

Yeah, I saw that too! Do you have a circleci account tied to your github already? I wonder if that is involved.

@beckjake
Copy link
Contributor

Also, ignore those snowflake failures about SSO tokens , that's an internal process failure on our end!

@heisencoder
Copy link
Contributor Author

I just went into my CircleCI Project Settings for the heisencoder organization, and on the Overview tab, clicked on 'Stop Building', which then caused heisencoder / dbt to no longer be followed by CircleCI. I then rebased and force-pushed my change to kick off the tests again. This time, it looks like CircleCI is correctly picking the fishtown-analytics organization.

When I set up CircleCI, I thought that this was necessary for seeing the integration test results, but apparently it was an optional flow. CircleCI is pretty aggressive about getting your organization set up!

@heisencoder
Copy link
Contributor Author

Jake, I think this pull request is otherwise ready for your review, although the last bit of testing would be to patch this branch onto a MacOS and try the tests locally. I've only got access to Linux right now. Thanks!

@beckjake
Copy link
Contributor

Well, I'm glad you've tracked this down, I was very confused!

On the plus side, we got a new test container pushed so local and the test container are at parity, which seems good regardless! I'm going to give this a shot locally later.

@heisencoder
Copy link
Contributor Author

I was just thinking that it's necessary to run docker-compose build before make test. If I add docker-compose build to the .env build rule in the Makefile, then that manual step won't be needed. I'll add that the Makefile real quick.

See #2739

This change enables Linux users to run the dbt tests via the docker
image. It follows the recommendations from this article:
https://jtreminio.com/blog/running-docker-containers-as-current-host-user/

Notable changes:
*  Added new Makefile rule to generate a .env file that contains USER_ID
and GROUP_ID environment variables to the ID of the current user. This
is in turn used by docker-compose and the Dockerfile to make the Docker
image run as the current user. (Note that on Windows or Mac, this
behavior happens by default).
*  Reordered Dockerfile to allow for better caching of intermediate
images (i.e., put things that don't depend on ARGS earlier).
*  Bumped CircleCI's Dockerfile from version 7 to 9.  Jake rebuilt
9 off of this PR.
@heisencoder
Copy link
Contributor Author

I just pushed a version that adds docker-compose build to the Makefile's .env rule.

Copy link
Contributor

@beckjake beckjake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! I have one suggestion to minimize any user issues, but after that I'd love to get this in for 0.18.1. Thanks for all your hard work on this @heisencoder.

Comment on lines +33 to +34
ARG USER_ID
ARG GROUP_ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we default these to the empty string somehow? It works, but locally I get this:

WARNING: The USER_ID variable is not set. Defaulting to a blank string.
WARNING: The GROUP_ID variable is not set. Defaulting to a blank string.

And while I can do make .env on my mac, I know we'll have to answer questions about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not responding sooner! I just got back from a 4-day vacation.

I've got a quick fix for this in #2775

Copy link
Contributor

@beckjake beckjake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, let's just merge this as-is. I would like this in for 0.18.1 and I can iterate on this to remove that warning later.

@beckjake beckjake merged commit 22c4d8f into dbt-labs:dev/0.18.1 Sep 16, 2020
@jtcohen6 jtcohen6 mentioned this pull request Nov 10, 2020
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't find tox config file when running make test-unit
2 participants