Spin up separate database for pytest execution #235

earmenda · 2021-11-18T05:35:15Z

Closes #121

Code Changes

- Added a test mode that lets the server use the test_database_url config value to access the database
- Updated the make pytest target to set FIDESCTL_TEST_MODE to true
- Updated make api and make cli to accept a FIDESCTL_TEST_MODE parameter, e.g. make api FIDESCTL_TEST_MODE=True
- Added tests to test_config.py

Steps to Confirm

Verify that test database is created if FIDESCTL_TEST_MODE is true
Verify that test database is used if FIDESCTL_TEST_MODE is true

Pre-Merge Checklist

All CI Pipelines Succeeded
Documentation Updated

Description Of Changes

I think the original issue (Have Pytest spin up its own test database, #121) assumed that creating a second database in the same Postgres container was natively supported, but it requires running a script that comes from a GitHub repo. See here: https://stackoverflow.com/questions/46668233/multiple-databases-in-docker-and-docker-compose
~~Test database creation is now done through a fixture~~
~~Server context for tests is set through an api but I am still thinking about whether this is a good way to go about this~~
Added a test mode to the server which is configured through an environment variable FIDESCTL_TEST_MODE
If in test mode the config test_database_url will be used to access the database
Updated make commands for pytest and also launching api/cli using test mode

earmenda · 2021-11-18T05:36:32Z

In its current state the two databases are created successfully, but this requires adding a script that comes from a separate repo. It is possible to use an image that already includes this script, but I don't see why we would do that, tbh, as it would be limiting.

postgres=# \l
                                     List of databases
     Name      |  Owner   | Encoding |  Collate   |   Ctype    |     Access privileges      
---------------+----------+----------+------------+------------+----------------------------
 fidesctl      | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =Tc/postgres              +
               |          |          |            |            | postgres=CTc/postgres     +
               |          |          |            |            | fidesctl=CTc/postgres
 fidesctl_test | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =Tc/postgres              +
               |          |          |            |            | postgres=CTc/postgres     +
               |          |          |            |            | fidesctl_test=CTc/postgres

earmenda · 2021-11-18T05:37:40Z

One alternative mentioned in the Stack Overflow post I linked is to just have two different containers, each creating a different database. To me this seems almost like a better solution, just for the sake of simplicity.

services: 

  db:
    image: postgres
    environment: 
      - POSTGRES_DB
      - POSTGRES_USER
      - POSTGRES_PASSWORD
    ports:
      - ${POSTGRES_DEV_PORT}:5432
    volumes:
      - app-volume:/var/lib/postgresql/data

  db-test:
    image: postgres
    environment: 
      - POSTGRES_DB
      - POSTGRES_USER
      - POSTGRES_PASSWORD
    ports:
      - ${POSTGRES_TEST_PORT}:5432
    # Notice I don't even use a volume here since I don't care to persist test data between runs

volumes:
  app-volume: #

ThomasLaPiana · 2021-11-18T06:11:36Z

> One alternative mentioned in the Stack Overflow post I linked is to just have two different containers, each creating a different database. To me this seems almost like a better solution, just for the sake of simplicity.

My workflow is usually to run make cli and then run pytest from within that shell, so I'd want this container to run for as long as I have any Docker shells running. Does it seem a bit weird to be running two containers in parallel?

Even though it's maybe a bit more clunky, I think using the script is a little smoother and more scalable for now. There's also a fair argument to be made about what this even solves given our current setup. Hopefully people wouldn't be spinning up a database with those credentials and then running the tests in their production environment?

ThomasLaPiana · 2021-11-18T06:12:23Z

I'm OK with either solution, but I'm now wondering whether there is an actual problem to solve here. I'm probably overlooking something, though.

@NevilleS @seanpreston @ethyca/fides-control any thoughts here?

brentonmallen1 · 2021-11-18T15:16:09Z

It seems like the original concern was to avoid potential impact to production data. I think that's probably worth putting some safeguards around. The script seems like a better option than trying to use an esoteric Docker image, imho. I would suggest adding a comment stating where it came from for posterity, though.

ThomasLaPiana · 2021-11-18T18:07:45Z

> It seems like the original concern was to avoid potential impact to production data. I think that's probably worth putting some safeguards around. The script seems like a better option than trying to use an esoteric Docker image, imho. I would suggest adding a comment stating where it came from for posterity, though.

We have a separate test_config.toml file though, so unless someone runs a production application using our docker-compose file (they shouldn't!) and uses the exact same credentials (they shouldn't!), I don't think this issue would happen. When this issue was created, we were still pointing people at our own make commands/compose file, but the setup/installation guide is completely different now.

NevilleS · 2021-11-18T19:09:32Z

That script (which leverages Postgres' init scripts) isn't the only way to create a DB though: in fidesops we have code that creates a database from within Python using the alembic schema. That feels like the right layer for this kind of logic: in the application code instead of the Docker configuration.

In other words, if the pytest command is what creates & destroys the test db, you make it very easy to avoid any accidents (both in production and in local dev)
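
As a rough illustration of the "test run creates and destroys its own database" idea, here is a minimal sketch. It is not the fidesops code: temporary_database is a hypothetical helper, a SQLite file stands in for the Postgres test database, and the single CREATE TABLE stands in for applying the real schema.

```python
import contextlib
import os
import sqlite3
import tempfile


@contextlib.contextmanager
def temporary_database():
    """Create a throwaway database, hand it to the caller, then destroy it."""
    # Assumption: a SQLite file stands in for the Postgres test database.
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    connection = sqlite3.connect(path)
    try:
        # Stand-in for applying the real schema (e.g. via migrations).
        connection.execute("CREATE TABLE example (id INTEGER PRIMARY KEY)")
        yield connection, path
    finally:
        # Teardown always runs, so no test data outlives the test run.
        connection.close()
        os.remove(path)
```

In a pytest suite this would probably be wrapped in a session-scoped fixture in conftest.py, so the database exists only for the duration of the run.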

NevilleS · 2021-11-18T19:10:37Z

e.g. see https://github.com/ethyca/fidesops/blob/aec6b8cbb820c891f1b8e7eaaf16a09041efb660/tests/conftest.py#L56

ThomasLaPiana · 2021-11-18T21:36:20Z

> e.g. see https://github.com/ethyca/fidesops/blob/aec6b8cbb820c891f1b8e7eaaf16a09041efb660/tests/conftest.py#L56

nice! beautiful solution

earmenda · 2021-11-22T16:48:30Z

@NevilleS @ThomasLaPiana

Is it possible that the solution posted doesn't actually do what it is meant to? At least for our fidesctl setup it doesn't seem to work well because tests are running in a different process than the api service.

When we run make api usually we start a server in one process and then run pytest in a separate process. The usage of os.environ["TESTING"] = "True" in fidesops is odd for our use case because it would only set this variable for the test process. If fidesops works the same then I don't think setting this variable is doing anything when it is picked up by the server later on:

https://github.com/ethyca/fidesops/blob/3acd69ea1fbdc2e3ba7fabff89ae240ded577b8e/src/fidesops/db/session.py#L25

Maybe I am missing something but you shouldn't be able to modify the environment variables from a different python process.

For now I posted a working solution where we temporarily set the context server side through the api. This is working as we'd want and achieves the same as the fidesops example. This solution still feels not great to me though, but I can't think of any other way that the tests could interact with the server's context other than through the api.
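
The point about environment variables being per-process can be checked with a small sketch. This only demonstrates the general OS behavior; it does not model how fidesops or fidesctl actually start their server and test processes.

```python
import os
import subprocess
import sys

# The child process sets TESTING in its own environment, prints it, and exits.
child_code = 'import os; os.environ["TESTING"] = "True"; print(os.environ["TESTING"])'

# Make sure the parent starts without the variable.
os.environ.pop("TESTING", None)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

child_saw = result.stdout.strip()         # value as seen inside the child
parent_sees = os.environ.get("TESTING")   # the parent's environment is untouched
```

The same holds between any two unrelated processes, so a variable set inside the pytest process would not reach an API server that was started separately.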

ThomasLaPiana · 2021-11-22T22:17:57Z

I agree it feels super weird to have code logic dedicated to spinning up a test database; that feels super bad.

If the assumption here is that this ticket exists specifically so that people don't overwrite their production data, what is the simplest way to do that? I think the simplest way is to do the following:

- Add a new variable to the config called test_database_url
- In conftest.py, check whether test_database_url is set:
  - If test_database_url is not set, throw an error and exit the test suite
  - If test_database_url matches database_url, throw an error and exit the test suite
  - If test_database_url is set and does not match database_url, set the value of database_url to that of test_database_url. This ensures that, other than a new config value, no code logic needs to change outside of conftest.py
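
The three checks above could be sketched roughly like this. resolve_test_database_url and UnsafeTestDatabaseError are hypothetical names for illustration, not existing fidesctl code.

```python
from typing import Optional


class UnsafeTestDatabaseError(RuntimeError):
    """Raised when the tests would not be isolated from the main database."""


def resolve_test_database_url(database_url: str, test_database_url: Optional[str]) -> str:
    """Return the URL the test suite should use, or raise if it is unsafe."""
    if not test_database_url:
        raise UnsafeTestDatabaseError("test_database_url is not set")
    if test_database_url == database_url:
        raise UnsafeTestDatabaseError("test_database_url matches database_url")
    # Safe: the caller overrides database_url with this value for the test run.
    return test_database_url
```

In conftest.py the raised error would abort the suite before any test touches a database.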

@earmenda does this implementation make sense? does it seem reasonably elegant? Open to other suggestions, after some thought this is the cleanest I could think of

earmenda · 2021-11-22T22:25:59Z

@ThomasLaPiana

I think the new config is a good idea but that still means that the execution of pytest would have to indicate to the server that it needs to use that database. Don't we still have that problem with this solution? Were you thinking that the service has a flag which either starts it in default or test domain to determine which database to use?

earmenda · 2021-11-22T23:21:51Z

Here is what I currently see as some options

1. (current approach) Allow tests to temporarily change the context on the server through the API. I don't really like the idea of context switching within the application, and having an API for it is even worse.
2. When running tests, spin up the server with a test context that points to the test database URL, e.g. make api --domain=test. I think this is a good option: you get to determine which database to point at when you start up the service.
3. Add a flag to API calls that tells the server to use a test database URL for the current call. This could work; I am not very familiar with our API framework yet, but I imagine we could define a general behavior based on a common parameter used by all the APIs.
4. Have an alternative test server or test endpoints that are stood up alongside the main API service. Eh?

ThomasLaPiana · 2021-11-23T00:23:42Z

> Here is what I currently see as some options
>
> 1. (current approach) Allow tests to temporarily change the context on the server through the API. I don't really like the idea of context switching within the application, and having an API for it is even worse.
> 2. When running tests, spin up the server with a test context that points to the test database URL, e.g. make api --domain=test. I think this is a good option: you get to determine which database to point at when you start up the service.
> 3. Add a flag to API calls that tells the server to use a test database URL for the current call. This could work; I am not very familiar with our API framework yet, but I imagine we could define a general behavior based on a common parameter used by all the APIs.
> 4. Have an alternative test server or test endpoints that are stood up alongside the main API service. Eh?

I think the second option is the best. Maybe the API can check for a FIDESCTL_TEST_MODE=TRUE env var or something and if it is set to true, it will use the test_database_url from the config

earmenda · 2021-11-23T20:28:46Z

@ThomasLaPiana

What do you think about this?

Added a check for FIDESCTL_TEST_MODE environment variable to know which database to access
configure_db creates the database only if FIDESCTL_TEST_MODE is enabled
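
A minimal sketch of that behavior, assuming the check is a case-insensitive comparison against "true" (the exact accepted values are an assumption). The configure_db signature here is illustrative, with the database creation step injected as a callable; the real function in this PR may look different.

```python
import os
from typing import Callable, Mapping


def is_test_mode(environ: Mapping[str, str] = os.environ) -> bool:
    """True when FIDESCTL_TEST_MODE is set to some spelling of 'true'."""
    return environ.get("FIDESCTL_TEST_MODE", "").strip().lower() == "true"


def configure_db(
    database_url: str,
    create_database: Callable[[str], None],
    environ: Mapping[str, str] = os.environ,
) -> bool:
    """Create the database only in test mode; return whether it was attempted."""
    if is_test_mode(environ):
        print("Test mode is enabled, creating test database if needed")
        create_database(database_url)
        return True
    # Outside test mode the database is expected to exist already.
    return False
```

Passing the environment and the creation step as parameters keeps the sketch testable without a running Postgres instance.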

With this, tests don't really have to do much differently. I know you mentioned adding a check in the tests for whether the test database is being used, but I don't see much value in that, to be honest.

If this looks good, I just need to add the variable to our make commands. We could make sure that make pytest uses the test database and potentially add make test-cli and make test-api targets. What do you think?

fidesctl/fidesctl.toml

fidesctl/src/fidesctl/core/config.py

earmenda · 2021-11-24T01:14:12Z

With the latest change it is possible to start the server in test mode using make by passing FIDESCTL_TEST_MODE as a parameter:

make api FIDESCTL_TEST_MODE=True
make cli FIDESCTL_TEST_MODE=True

Verified that this starts the service in test mode

Test mode is enabled, creating test database if needed

earmenda · 2021-11-24T01:22:03Z

Verified that starting the server in test mode will use the expected database

fidesctl_test=# \l
                                   List of databases
     Name      |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   
---------------+----------+----------+------------+------------+-----------------------
 fidesctl      | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
 fidesctl_test | postgres | UTF8     | en_US.utf8 | en_US.utf8 |

Verified that only the test database is used even after successfully running pytest

fidesctl=# \dt
Did not find any relations.

earmenda · 2021-11-24T01:39:17Z

Added some tests in test_config.py to validate the setting of database_url through the validator.
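
For readers without the diff at hand, tests of this kind might look roughly like the sketch below. APISettings is a hypothetical dataclass standing in for the real settings class, with __post_init__ playing the role of the validator; the URLs are placeholders.

```python
import os
from dataclasses import dataclass
from unittest import mock


@dataclass
class APISettings:
    """Hypothetical stand-in for the real API settings class."""

    database_url: str
    test_database_url: str

    def __post_init__(self) -> None:
        # Plays the role of the validator: in test mode, use the test database.
        if os.environ.get("FIDESCTL_TEST_MODE", "").lower() == "true":
            self.database_url = self.test_database_url


def test_database_url_test_mode_disabled() -> None:
    with mock.patch.dict(os.environ, {"FIDESCTL_TEST_MODE": "False"}):
        settings = APISettings("postgresql://db/fidesctl", "postgresql://db/fidesctl_test")
    assert settings.database_url == "postgresql://db/fidesctl"


def test_database_url_test_mode_enabled() -> None:
    with mock.patch.dict(os.environ, {"FIDESCTL_TEST_MODE": "True"}):
        settings = APISettings("postgresql://db/fidesctl", "postgresql://db/fidesctl_test")
    assert settings.database_url == "postgresql://db/fidesctl_test"
```

mock.patch.dict restores the environment when each block exits, so the tests do not leak FIDESCTL_TEST_MODE into the rest of the suite.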

Makefile

ThomasLaPiana · 2021-11-24T05:10:48Z

@earmenda sorry for hijacking this at the end! You nailed the requirements, I just had a few nits and didn't want to keep kicking it back to you for those small things

Thank you!

env_files/fidesctl.env

earmenda · 2021-11-24T05:13:03Z

> @earmenda sorry for hijacking this at the end! You nailed the requirements, I just had a few nits and didn't want to keep kicking it back to you for those small things
>
> Thank you!

no worries, just added a comment to your change though

ThomasLaPiana

LGTM!

Create additional database on creating postgres container

fbc5251

earmenda added the enhancement label Nov 18, 2021

earmenda added this to the Fidesctl 1.1 release milestone Nov 18, 2021

earmenda requested review from ThomasLaPiana, brentonmallen1 and PSalant726 November 18, 2021 05:35

earmenda self-assigned this Nov 18, 2021

ThomasLaPiana marked this pull request as ready for review November 18, 2021 06:12

ThomasLaPiana marked this pull request as draft November 18, 2021 21:36

Create database in conftest and set context temporarily

6154d89

Eduardo Armendariz added 2 commits November 23, 2021 12:05

Add a test database mode for the server to use

800a79b

Clean up CI and other changes

fbdbbc0

ThomasLaPiana reviewed Nov 23, 2021

fidesctl/fidesctl.toml (outdated)

ThomasLaPiana reviewed Nov 23, 2021

fidesctl/src/fidesctl/core/config.py (outdated)

Eduardo Armendariz added 4 commits November 23, 2021 16:34

Change api config to automatically populate database_url

e719f2d

Clean up unused code and typings

2e96ef5

Merge main into branch

7ca3bb0

Update api and cli targets to allow for setting a test mode

8557693

Eduardo Armendariz added 2 commits November 23, 2021 17:31

Update documentation to document test_database_url config

68151c5

Add tests for database_url validator depending on FIDESCTL_TEST_MODE

5b7d2e3

earmenda marked this pull request as ready for review November 24, 2021 02:01

make some tweaks to minimize changes

29a1389

earmenda commented Nov 24, 2021

Makefile (outdated)

Thomas La Piana added 2 commits November 23, 2021 20:00

minor nits/cleanup

f40f54c

final nits, all checks passing

35d5ef4

earmenda commented Nov 24, 2021

env_files/fidesctl.env

ThomasLaPiana approved these changes Nov 24, 2021

ThomasLaPiana merged commit 5618f69 into main Nov 24, 2021

ThomasLaPiana deleted the earmenda-pytest-isolated-database branch November 24, 2021 05:31

ThomasLaPiana pushed a commit that referenced this pull request Aug 17, 2022

Create codeql-analysis.yml (#235)

7fbf861

ThomasLaPiana pushed a commit that referenced this pull request Sep 26, 2022

Create codeql-analysis.yml (#235)

fa0b06a
