Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename db init to db create-default-connections #32420

Closed
1 task done
eladkal opened this issue Jul 7, 2023 · 8 comments
Closed
1 task done

Rename db init to db create-default-connections #32420

eladkal opened this issue Jul 7, 2023 · 8 comments
Assignees
Labels
area:CLI good first issue kind:task A task that needs to be completed as part of a larger issue

Comments

@eladkal
Copy link
Contributor

eladkal commented Jul 7, 2023

Body

As discussed in #32048 the airflow db init is very confusing to users.

The goal of this task is to rename this command so it will be more self explained.

The task:

  1. Rename airflow db init to airflow db create-default-connections
  2. Deprecate airflow db init
  3. Update docs

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@eladkal eladkal added area:CLI good first issue kind:task A task that needs to be completed as part of a larger issue labels Jul 7, 2023
@Adaverse
Copy link
Contributor

Adaverse commented Jul 7, 2023

I can take this up!

@potiuk
Copy link
Member

potiuk commented Jul 7, 2023

Just a note - we should also keep the 'db init` working but raising a deprecation warnings (but it should not be documented). This is something we've done in the past (I recall we have a mechanism for that).

Once we do this rename, we can consider also next step (this is where the whole discussion started with @eladkal) to indeed move the connection definition to providers. Basically that should be a new information exposed by get_provider_info - "default_connection" dictionary and exposed by Provider Manager. This is a little more complex though. We need think how to make sure backwards compatibility (and what it really means) with older providers which do not expose this information, so I'd say this should be done as follow up and some discussion over a PR.

@Adaverse
Copy link
Contributor

Adaverse commented Jul 7, 2023

Sure @potiuk noted!

@Adaverse
Copy link
Contributor

Adaverse commented Jul 8, 2023

Just a note - we should also keep the 'db init` working but raising a deprecation warnings (but it should not be documented). This is something we've done in the past (I recall we have a mechanism for that).

We cannot remove db init from doc since we are only removing the create connections logic into db create-default-connections but the rest of the logic -

This command init the database if it dosen't exist, upgrade it if exist, load the default pools and synchronizes log_filename_template with the db

will still be there in db init. Unless we move it somewhere else (maybe db upgrade or maybe new command), we cannot fully remove it from the doc.

@potiuk
Copy link
Member

potiuk commented Jul 8, 2023

We cannot remove db init from doc since we are only removing the create connections logic into db create-default-connections but the rest of the logic

As commented in #32048 (comment) - we could hide/deprecate "init" and create a new command ("intiialize") and doument it's behaviour (and change it in the way that it will only work on an empty database and fails (and recommends to use upgrade) if it is non-empty.

Side (but related) comment.

There is one problem with two separate init (even if we rename it to initialize) and upgrade commands.

While it is "logical" to have those two commands, in fact it is rather difficult to use in "automated" cases - for example in our Helm Chart we never use init - always upgrade. This is very clear example where we want a command that will achieve the "target" state, no matter what initial stage is (in this case upgrade works in the way that it achieves the "target" state of database in the "current" version - no matter what the initial state was.

This case is well known and it's a great property of such commands - and this is the whole reason why Terraform exists. This is the whole promise of Terraform, that all the terraform providers work this way: you describe the target state you want to achieve and Terraforms figures out how to get there.

In this sense, I think maybe we should just forget the "init" or "initialize" - and keep only one command (current upgrade) that could manage the state of the database? Maybe it should not even be named "upgrade" ? Maybe it should be airflow db sync for example?

@Adaverse
Copy link
Contributor

Adaverse commented Jul 8, 2023

Hmm, so its better to work towards the final target, i.e, to create airflow db sync that would handle db creation if doesn't exist or synchronize if does, create default connections (of installed providers in separate PR, maybe), load the default pools and synchronizes log_filename_template. And deprecate db init and db upgrade.

And keep, creation of default connections (of installed providers in separate PR, maybe) in a separate command i.e db create-default-connections

@potiuk
Copy link
Member

potiuk commented Jul 8, 2023

Hmm, so its better to work towards the final target, i.e, to create airflow db sync that would handle db creation if doesn't exist or synchronize if does, create default connections (of installed providers in separate PR, maybe), load the default pools and synchronizes log_filename_template. And deprecate db init and db upgrade.

I'd keep the "default connections" out of the "sync" scope. The "structure" of the db can be synced rather easily, but syncing DB rows is a much more complex subject and there are many edge cases, and I think this should be an explicit, separate command to create those. At least that's the guts feeling I have.

@Adaverse
Copy link
Contributor

Adaverse commented Jul 8, 2023

Let's see what others think of it.

cc - @eladkal @utkarsharma2 @hussein-awala (from #32048)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:CLI good first issue kind:task A task that needs to be completed as part of a larger issue
Projects
None yet
Development

No branches or pull requests

3 participants