Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit for postgresql provider #3653

Closed
wants to merge 8 commits into from
Closed

Initial commit for postgresql provider #3653

wants to merge 8 commits into from

Conversation

chelarua
Copy link
Contributor

Related to issue #3650.
Contains initial resources for:

  • postgresql_role
  • postgresql_database

Example usage

provider "postgresql" {
  host = "192.168.1.1"
  username = "postgres"
  password = "postgres"
}

resource "postgresql_role" "my_role" {
  name = "my_role"
  login = true
}

resource "postgresql_database" "my_db" {
   name = "my_db"
   owner = "${postgresql_role.my_role.name}"
}

@apparentlymart
Copy link
Contributor

Over in #3122 I called the database resource mysql_database, vs. your postgresql_db. We should be consistent here, but I'm not sure which way to go on this.

In other naming discussions we've leaned towards spelling things out in full to make them clearer, unless the abbreviation is part of a brand name or often used in the provider's own documentation.

The Google DB resource is named google_sql_database_instance, but yet in AWS we have aws_db_instance.

So right now it seems like there is no consistent existing precedent. In that case, I think I'd lean towards the "spelling things out is better" convention and suggest we call this thing postgresql_database. What do you think? Is that a convincing justification? 😀

Type: schema.TypeString,
Required: true,
ForceNew: false,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know Postgres too well so please forgive me if this is a strange question: in Terraform resources we usually try to keep the number of required fields down to a minimum so that the "simple case" of configuration is as small as possible. Is it acceptable to leave owner unset and have Postgres e.g. assume ownership by the user that was specified in the provider configuration?

On the other hand, if setting the owner on a database is something a Postgres user would expect to alwayss have to do then I think that's fine. 😄

@apparentlymart
Copy link
Contributor

This looks like a great start! Thanks @chelarua . I had some small feedback/questions inline but overall this looks great.

If you'd like to also take a pass at documentation, you can get set up with the docs server relatively easily by running vagrant up in the website dir at the top of this repo. In my MySQL patch in #3122 you can see an example of what was required to add docs for the MySQL provider.

I didn't get a chance to test this provider yet since I don't have a Postgres environment to hand. I expect I can get set up with postgres on RDS to test but sadly I don't have the cycles to figure that out today, so I'll have to defer that for another time.

@chelarua
Copy link
Contributor Author

chelarua commented Nov 9, 2015

@apparentlymart thanks for your review, I did some changes and added documentation too.

case err != nil:
return fmt.Errorf("Error reading info about database: %s", err)
default:
d.Set("owner", owner)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do CREATE DATABASE without the WITH OWNER clause above, and then we read here, will this get populated with a default owner?

If so, it's important to mark the owner argument as both Optional and Computed. This tells Terraform that the user can optionally override it (the Optional part) but if they don't then the value will be computed based on something the server knows.

@apparentlymart
Copy link
Contributor

Thanks for the follow-up, @chelarua! I left some more feedback on the new code for making the owner optional. Almost done, I think!

@chelarua
Copy link
Contributor Author

Hi, thanks for the suggestions again, i wasn't aware of that. I extended the test with a second database without explicit owner and indeed it was failing without the computed field set.

@ringods
Copy link
Contributor

ringods commented Nov 10, 2015

@chelarua when you create a role with login permissions, how is the password set? I suspect we are still missing this.

@ringods
Copy link
Contributor

ringods commented Nov 15, 2015

@apparentlymart could you do another review so @chelarua can resolve the last issues?

encryptedCfg := getEncryptedStr(d.Get("encrypted").(bool))

query := fmt.Sprintf("ALTER ROLE %s %s PASSWORD '%s'", pq.QuoteIdentifier(roleName), encryptedCfg, password)
_, err := conn.Query(query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In cases like this where you need to do multiple calls to update different parts of the object, and each one of these calls can separately fail, it's good to use a ResourceData mechanism called partial state, which in the event of an error allows Terraform to understand which values have been updated and which haven't, and thus reflect the current values in the state.

The pattern for partial state works like this:

  • At the top of this function, before the first HasChange test, enable partial state mode by calling d.Partial(true).
  • After each one of these conditional calls succeeds, record the partial update by calling (for example) d.SetPartial("password").
  • Just before the return statement at the very end of the function, call d.Partial(false) to disable partial mode.

@apparentlymart
Copy link
Contributor

This patch is shaping up nicely! Thanks for the continued work on it @chelarua, and for the suggestions @ringods.

Just another small bit of feedback on this one. I'm going to try to find some time soon to get set up with a Postgres instance so I can try running the tests. (I'm a total Postgres newbie!)

@ringods
Copy link
Contributor

ringods commented Nov 16, 2015

@apparentlymart your last comments have been processed by @chelarua, my developer & colleague at @ReleaseQueue. Looking forward to your own Postgres test results so this can eventually be merged.

@apparentlymart
Copy link
Contributor

Thanks for the latest round of updates. This looks great!

I created a Postgres instance in Amazon RDS to test this. I mostly just accepted the default settings for a small RDS instance that would be free tier eligible, and asked for the username to be "matkins" and set a password. I then tried the following config (I've obscured parts of it that represent credentials or identifying info):

provider "postgresql" {
  host = "matkins-postgres-test.xxxxx.us-west-2.rds.amazonaws.com"
  port = 5432
  username = "matkins"
  password = "xxxxx"
}

resource "postgresql_database" "main" {
  name = "terraform_test"
}

When I tried to apply this config I got this error:

1 error(s) occurred:

* postgresql_database.main: Error creating postgresql database: pq: database "matkins" does not exist

I got a very similar result when I ran the acceptance tests from this patch against the same RDS instance, apparently when testing the postgresql_role resource:

--- FAIL: TestAccPostgresqlRole_Basic (3.02s)
    testing.go:137: Step 0 error: Error applying: 5 error(s) occurred:

        * postgresql_role.role_with_pwd_encr: Error creating role: pq: database "matkins" does not exist
        * postgresql_role.role_simple: Error creating role: pq: database "matkins" does not exist
        * postgresql_role.myrole2: Error creating role: pq: database "matkins" does not exist
        * postgresql_role.role_with_pwd: Error creating role: pq: database "matkins" does not exist
        * postgresql_role.role_with_pwd_no_login: Error creating role: pq: database "matkins" does not exist

Do either of you see what I'm doing wrong here?

@ringods
Copy link
Contributor

ringods commented Nov 18, 2015

@apparentlymart when you created the instance on the AWS console, did you specify a Database Name in the Database Options section?

@apparentlymart
Copy link
Contributor

@ringods I didn't specify a database in the RDS UI since I was expecting that I'd be creating the database using this provider. Was I supposed to put my username in as the database name? Is every user expected to have a corresponding database of the same name?

Again I must apologize for being a total PostgreSQL newbie here. I just followed the same pattern I've used to make MySQL RDS instances in the past, where I made the actual databases using the similar MySQL provider I worked on in #3122.

@ringods
Copy link
Contributor

ringods commented Nov 18, 2015

@apparentlymart no problem being a newbie. I'm not gonna call myself an expert either. :-)

When a PostgreSQL server is set up, the default administrative database is called postgres. We have reproduced the problem and @chelarua is looking into it. The assumption for now is that we didn't specify postgres as the default database to connect to when creating additional roles and databases.

Expect an additional commit to this pull request to fix this.

@chelarua
Copy link
Contributor Author

I'm seeing another issue when running against a RDS instance, especially the micro versions, it seems that running the tests is quickly eating up all the allowed connections to the PostgreSQL server, which is 26 for the micro versions. I don't have that much experience with terraform plugins and i'm not sure exactly which is the mechanism for closing the connections, maybe @apparentlymart you can point me in the right direction on this one.
Thanks

@ringods
Copy link
Contributor

ringods commented Nov 18, 2015

@apparentlymart we figured out that for the 7 tests in this provider, each test calls 4 times into the Configure function of our provider, leading to 28 connections to the DB server. On AWS, the max_connections setting differs from the default based on the amount of free memory of the underlying instance. See this article for details:

http://dba.stackexchange.com/questions/41829/should-i-increase-max-connections-in-aws-rds-t1-micro-for-mysql

It talks about MySQL, but it's the same approach for PostgreSQL. We are now experimenting with the ResourceProviderCloser interface to see if we can close the db connection correctly in there.

ringods referenced this pull request Nov 18, 2015
Currently Terraform is leaking goroutines and with that memory. I know
strictly speaking this maybe isn’t a real concern for Terraform as it’s
mostly used as a short running command line executable.

But there are a few of us out there that are using Terraform in some
long running processes and then this starts to become a problem.

Next to that it’s of course good programming practise to clean up
resources when they're not needed anymore. So even for the standard
command line use case, this seems an improvement in resource management.

Personally I see no downsides as the primary connection to the plugin
is kept alive (the plugin is not killed) and only unused connections
that will never be used again are closed to free up any related
goroutines and memory.
@apparentlymart
Copy link
Contributor

Hmm interesting... there are some existing examples where Terraform has to abide by either a rate limit or a fixed number of concurrent connections, but in the cases I've worked with that has been handled mainly by the underlying client library.

I don't have much experience with Go's database/sql, and it's quite possible that my MySQL provider has a similar issue that I didn't spot yet because I've mostly been using it for simple cases. It looks like the DB type has a method SetMaxOpenConns which defaults to unlimited but apparently allows the number of connections to be limited on the client side.

So perhaps a reasonable solution is to extend the provider configuration schema with a max_connections attribute that allows the user to set how many connections the given PostgreSQL server supports, give it a conservative default (possibly just 1?) so that most users won't need to set it, and then pass that value into SetMaxOpenConns as part of instantiating the client.

As you guys saw there are currently some problems with Terraform not necessarily closing connections opened by providers, but since connections are managed internally by the DB type as an implementation detail I wonder if we actually need to explicitly close them or if the DB type will just manage this for us automatically; as far as I can tell, one DB object abstracts potentially many concurrent database connections, so presumably DB.Close() is for closing all of the open connections, not just one.

@chelarua
Copy link
Contributor Author

Thanks for the suggestion. I've made some tests with the SetMaxOpenConns, but it seems that limiting the connections is causing it to deadlock. It will output this continuously and never finish:

[DEBUG] vertex root, waiting for: provider.postgresql (close)

I think this is because it only will close the connections on exit.
One immediate solution/workaround would be to no longer make the db connections in the providerConfigure, but only save the configuration for the connection and manually open and close the db connections inside each resource function( create, delete ...).

@ringods
Copy link
Contributor

ringods commented Nov 23, 2015

@apparentlymart we will take the route to open/close the db connection on resource manipulation. But for me the question still remains how the test framework behaves. While the Configure function is invoked correctly once in a regular (production) run, why is the test framework invoking the Configure 4 times per test?

@apparentlymart
Copy link
Contributor

The acceptance test harness executes a few different operations in sequence. It wants to see what happens when it applies the config and when it refreshes after applying the config at least. It also wants to destroy all of the resources it makes during testing.

I must admit I've never looked closely at the details of what the acceptance test harness does, but I think the above could explain at least three complete runs through a Terraform lifecycle, each of which would re-configure the provider as if a user had run Terraform separately several times, with different options.

@ringods
Copy link
Contributor

ringods commented Nov 24, 2015

@apparentlymart seems a fair explanation. @chelarua has updated the internals to open/close the connections when manipulating the resources. There is also fix in that should resolve your initial problem of:

database "matkins" doesn't exist.

Can you retry with the latest code against your RDS PostgreSQL and report your results?

Tnx.

@ringods
Copy link
Contributor

ringods commented Dec 2, 2015

@apparentlymart do you have an update? It seems we missed the 0.6.8 release...

@apparentlymart
Copy link
Contributor

@ringods sorry I am currently a bit snowed under with other projects so I can't take a look at this right now but I'll try to get to it in the next few weeks.

apparentlymart added a commit that referenced this pull request Dec 4, 2015
@apparentlymart
Copy link
Contributor

I squashed this into a single commit and made some small organizational tweaks for consistency with other providers, and then merged this as e1eef15.

Thanks for all the amazing and patient work on this! 🚢

@ringods
Copy link
Contributor

ringods commented Dec 16, 2015 via email

@0x7061
Copy link

0x7061 commented Jan 31, 2017

Stupid question but isn't is possible to use interpolation for the provider variables? Like:

provider "postgresql" "pg" {
    alias = "pg"
    host = "${data.terraform_remote_state.remote-state-db.address}"
    username = "${data.terraform_remote_state.remote-state-db.username_root}"
    password = "${data.terraform_remote_state.remote-state-db.password_root}"
}

It's always asking me to input values in the console.

@0x7061
Copy link

0x7061 commented Jan 31, 2017

My bad! Wasn't connecting databases & roles via the provider field!

@ringods ringods deleted the provider-postgresql branch March 25, 2017 14:55
@ghost
Copy link

ghost commented Apr 15, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants