Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

amundsendatabuilder Tests don't pass on master #605

Closed
dorianj opened this issue Aug 10, 2020 · 7 comments
Closed

amundsendatabuilder Tests don't pass on master #605

dorianj opened this issue Aug 10, 2020 · 7 comments

Comments

@dorianj
Copy link
Contributor

dorianj commented Aug 10, 2020

When I download a fresh copy of amundsendatabuilder, the unit test suite doesn't pass. I'm actively debugging this, but posting to track and see if anyone else has seen this.

Expected Behavior

git clone git@github.com:lyft/amundsendatabuilder.git
cd amundsendatabuilder
conda create -n adb-test python=3.7
conda activate adb-test
pip install -r requirements.txt
pip install ".[all]"
pip install codecov
make test

Test suite passes.

Current Behavior

Failures:
=== 24 failed, 221 passed in 20.72 seconds ===

I can grab the full list output, but it's extremely long because it includes some function bodies.

Possible Solution

The failures don't seem to follow any clear pattern, it's not immediately clear to me that there's a shared root cause, but I'll keep looking.

Steps to Reproduce

Repro steps in Expected Behavior.

Screenshots (if appropriate)

Context

Your Environment

Some info about my system:

> uname -a
 Darwin vostok 19.5.0 Darwin Kernel Version 19.5.0: Tue May 26 20:41:44 PDT 2020; root:xnu-6153.121.2~2/RELEASE_X86_64 x86_64
> python --version
Python 3.7.7
  • Amunsen version used: master
  • Data warehouse stores: N/A
  • Deployment (k8s or native): N/A
  • Link to your fork or repository: N/A
@alevene
Copy link
Contributor

alevene commented Aug 10, 2020

If your error set is the same as mine, I found that the failures were due to google authentication errors on the BigQuery extractors. I got around this by doing the following:

Create a file named tests/google_default_credentials.json with the following content

{
  "client_id": "",
  "client_secret": "",
  "refresh_token": "",
  "type": "authorized_user"
}

Then, reference that file in an env var. I'm using a docker setup for testing that copies the content root into /app in the container.

GOOGLE_CREDS="GOOGLE_APPLICATION_CREDENTIALS=/app/tests/google_default_credentials.json"

docker build --tag=amundsenimage:1.2.3 .
docker run amundsenimage:1.2.3 --env=$GOOGLE_CREDS make test

Not sure how this is resolved in the databuilder Travis CI pipeline however.

@dorianj
Copy link
Contributor Author

dorianj commented Aug 10, 2020

@alevene HUGE help, thank you! That fixes 23 of the errors. Going to commit that fix. The other failure looks unrelated but easy to handle. Will PR both together.

@jornh
Copy link
Contributor

jornh commented Aug 10, 2020

Not sure how this is resolved in the databuilder Travis CI pipeline however.

Hmm, perhaps the CI isn't failing because the Travis CI environment for some reason already has such a credentials file?

I'm guessing without that dummy file calling BigQuery Extractor init() will end in calling out to google auth library around https://github.com/lyft/amundsendatabuilder/blob/master/databuilder/extractor/base_bigquery_extractor.py#L58

I noticed a similar thing in the Cassandra test code. There it is handled by patching out like:
https://github.com/lyft/amundsendatabuilder/blob/381a417077e11c128d549e4212e3bc08f342d21f/tests/unit/extractor/test_cassandra_extractor.py#L17-L18

  • is that perhaps a better approach?

@dorianj
Copy link
Contributor Author

dorianj commented Aug 10, 2020

I'm playing with inserting a dummy CRED_KEY in those tests, but since it's a service_account (not a authorized_user per the dummy config file mentioned), the library actually tries to parse it, so it can't be totally dummy values.

@dorianj dorianj changed the title amundsendatabuilder Tests don't pass on Mac amundsendatabuilder Tests don't pass on master Aug 10, 2020
@jornh
Copy link
Contributor

jornh commented Aug 10, 2020

Sorry I didn't convey it very clearly. If you don't put in neither a dummy file or key it will fall through to:

else:
                credentials, _ = google.auth.default(scopes=self._DEFAULT_SCOPES)

And then with something like below in the BigQuery tests - you avoid calling the real google.auth.default and you dont have to worry about constructing a parsable dummy key.

@@ -1,7 +1,7 @@
 import logging
 import unittest

-from mock import patch, Mock
+from mock import patch, Mock, MagicMock
 from pyhocon import ConfigFactory

 from databuilder import Scoped
@@ -133,7 +133,8 @@ class MockBigQueryClient():
     def tables(self):
         return self.tables_method

-
+# patch fallback auth method to avoid actually calling google API during tests
+@patch('google.auth.default', lambda scopes: ['dummy', 'dummy'])
 class TestBigQueryMetadataExtractor(unittest.TestCase):
     def setUp(self):
         # type: () -> None

@dorianj
Copy link
Contributor Author

dorianj commented Aug 10, 2020

Ah, thanks @jornh, yes I was confused. PR'd at amundsen-io/amundsendatabuilder#313 -- will close this once that lands.

Turns out the rest of the failures were from me using Python 3.7. Opened a new issue for that: #606

@dorianj
Copy link
Contributor Author

dorianj commented Aug 11, 2020

Fixed in amundsen-io/amundsendatabuilder#313

@dorianj dorianj closed this as completed Aug 11, 2020
dorianj pushed a commit to dorianj/amundsen that referenced this issue Apr 25, 2021
* Testing Codeowners file

Signed-off-by: Marcos Iglesias <miglesiasvalle@lyft.com>

* Updating Tamika's name, adding Tao

Signed-off-by: Marcos Iglesias <miglesiasvalle@lyft.com>
feng-tao pushed a commit that referenced this issue May 7, 2021
* Testing Codeowners file

Signed-off-by: Marcos Iglesias <miglesiasvalle@lyft.com>

* Updating Tamika's name, adding Tao

Signed-off-by: Marcos Iglesias <miglesiasvalle@lyft.com>
hansadriaans pushed a commit to DataChefHQ/amundsen that referenced this issue Jun 30, 2022
* Testing Codeowners file

Signed-off-by: Marcos Iglesias <miglesiasvalle@lyft.com>

* Updating Tamika's name, adding Tao

Signed-off-by: Marcos Iglesias <miglesiasvalle@lyft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants