Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace numeric constants with slug values #3569

Closed
jeremystretch opened this issue Oct 4, 2019 · 36 comments
Closed

Replace numeric constants with slug values #3569

jeremystretch opened this issue Oct 4, 2019 · 36 comments
Assignees
Labels
status: accepted This issue has been accepted for implementation type: housekeeping Changes to the application which do not directly impact the end user
Milestone

Comments

@jeremystretch
Copy link
Member

Proposed Changes

Replace the numeric values used for many fields with human-friendly slug values. For example:

RACK_STATUS_RESERVED = 0
RACK_STATUS_AVAILABLE = 1
RACK_STATUS_PLANNED = 2
RACK_STATUS_ACTIVE = 3
RACK_STATUS_DEPRECATED = 4
RACK_STATUS_CHOICES = [
    [RACK_STATUS_ACTIVE, 'Active'],
    [RACK_STATUS_PLANNED, 'Planned'],
    [RACK_STATUS_RESERVED, 'Reserved'],
    [RACK_STATUS_AVAILABLE, 'Available'],
    [RACK_STATUS_DEPRECATED, 'Deprecated'],
]

would become

RACK_STATUS_RESERVED = 'active'
RACK_STATUS_AVAILABLE = 'planned'
RACK_STATUS_PLANNED = 'reserved'
RACK_STATUS_ACTIVE = 'available'
RACK_STATUS_DEPRECATED = 'deprecated'
RACK_STATUS_CHOICES = [
    [RACK_STATUS_ACTIVE, 'Active'],
    [RACK_STATUS_PLANNED, 'Planned'],
    [RACK_STATUS_RESERVED, 'Reserved'],
    [RACK_STATUS_AVAILABLE, 'Available'],
    [RACK_STATUS_DEPRECATED, 'Deprecated'],
]

Justification

Employing human-friendly slug values make consuming the REST API more convenient. It also allows more human-friendly representations of the pertinent field values in other formats, such as YAML (see #451).

@jeremystretch jeremystretch added status: accepted This issue has been accepted for implementation type: housekeeping Changes to the application which do not directly impact the end user labels Oct 4, 2019
@candlerb
Copy link
Contributor

candlerb commented Oct 4, 2019

Would this also change the database column types to enum?

@jeremystretch
Copy link
Member Author

I don't think Django has built-in support for enumeration types, and adding it probably wouldn't get us much. It would also mean incurring a new migration each time a field had its set of valid choices modified.

@candlerb
Copy link
Contributor

candlerb commented Oct 7, 2019

In that case, surely the numeric values for RACK_STATUS_RESERVED, RACK_STATUS_AVAILABLE etc still need to be preserved in the Python code?

You could instead have RACK_STATUS_DISPLAY_CHOICES and RACK_STATUS_API_CHOICES ?

@jeremystretch
Copy link
Member Author

The point of the FR is to replace the numeric values with strings entirely. This will occur within a migration and be completely transparent to the user.

@candlerb
Copy link
Contributor

candlerb commented Oct 7, 2019

So the database field will be a varchar, not an enum? Got it.

@jeremystretch
Copy link
Member Author

Yep. I should point out that there will almost certainly be a transition period where the API will display and accept both the numeric and string values.

@jeremystretch
Copy link
Member Author

Something else occurs to me: By switching from integers to strings we'll lose the ability to order interfaces logically by type (ordering by type would be alphabetical using strings). I don't think this much of a concern but I wanted to put it out there.

@DanSheps
Copy link
Member

DanSheps commented Oct 7, 2019

Could we define a sort order or a sort function for those perhaps?

I don't think the UI has any place to sort by them right now though, so I don't know if it is worth it until there is plans to actually have sorting for them.

@jeremystretch
Copy link
Member Author

AFAIK we've only ever sorted interfaces by name. The order in which the constants are currently defined is largely arbitrary anyway.

@jeremystretch jeremystretch added this to the v2.7 milestone Oct 7, 2019
@candlerb
Copy link
Contributor

candlerb commented Oct 7, 2019

There is an explicit sequence in the list constants e.g. RACK_STATUS_CHOICES, so it would be possible to map back to the index within this list, and sort on that if required.

@jeremystretch
Copy link
Member Author

I should point out that there will almost certainly be a transition period where the API will display and accept both the numeric and string values.

Currently, a choice field is represented like this in the API:

"status": {
    "value": 1,
    "label": "Active"
}

And after the change to slugs, it will look like this:

"status": {
    "value": "active",
    "label": "Active"
}

While it would be possible (although perhaps difficult) to accept either integer and slug (string) values on write requests to maintain backward compatibility, obviously we cannot convey both values in the same field simultaneously.

One thought I had was to introduce a temporary setting, allowing an admin to toggle between the old integer values and the new slug values. This would let the user to designate their own "flag day" for the API change irrespective of the NetBox release cycle. However, I'm not confident that the additional work needed to support this will ultimately be worthwhile, since users are in control of when they upgrade anyway.

@tyler-8
Copy link
Contributor

tyler-8 commented Oct 17, 2019

Honestly I'm conflicted on this one. My main concern would be unintended side-effects in terms of database performance at scale. Seeing that this would be a fairly significant breaking change, if decided as the direction to go, perhaps add some sort of warnings or recurring deprecation message in multiple areas and table it for 2.8? Based on past 2.X releases that would give the community ~3-6 months to adjust their code and workflows after the official decision.

@candlerb
Copy link
Contributor

Database performance is not an issue. There is a massive python-based stack on top; the difference in time to read a string versus an int from the database will be immeasurable in comparison.

Breaking the API for reads bugs me. To avoid it, you'd have to have something like a query-string parameter which selects whether you want numeric or string values - it defaults to numeric initially, and later the default changes to string.

Is there really much value in this? An API is an API, hidden from humans. The "label" is what people see. Any sort of i18n implies that you must keep a mapping from human-readable labels to these codes.

@jeremystretch
Copy link
Member Author

Is there really much value in this?

Yes. People have expressed understandable frustration around having to maintain local integer-to-label mappings for API fields, but what prompted me to open this issue is the need for human-friendly names for interface and port types in support of a device type library (#451).

@tyler-8
Copy link
Contributor

tyler-8 commented Oct 18, 2019

Database performance is not an issue. There is a massive python-based stack on top; the difference in time to read a string versus an int from the database will be immeasurable in comparison.

My concern lies more with indexing (and searching) at scale based on strings vs integers on the database side, rather than Python reading a string or integer. FWIW I think you're probably right though. The cardinality of constants is relatively low (at least for now...)

I'm leaning on the side of keeping existing behavior though. It's mostly a non-issue as is with the choices API endpoints. This seems like a big API change for what amounts to saving maybe one API call. IMO, the amount of work required to change existing integrations with Netbox's API is greater than the amount of work to call the choices API now.

@cimnine
Copy link
Contributor

cimnine commented Oct 18, 2019

First, I'm really in favor of a change in this direction.
But I would like to raise my concern about the specific proposal of changing the value from being an id to a slug:

"status": {
-   "value": 1,
+   "value": "active",
    "label": "Active"
}

As a user of API clients and former author of one, I would not be happy to see the type of the 'value' field change (from int to string). This might have unforeseen consequences on any statically typed API clients. They have to be re-written (that's the small part) and all the code, that goes with them, as well. Most of them would have to introduce special behavior to support Netbox versions prior and after this change.

Hence I suggest to introduce a 'slug' field instead (which might or might not get removed in the future):

"status": {
    "value": 1,
    "slug": "active",
    "label": "Active"
}

This introduces the possibility for having at least a grace period, in which both - new and old - are available. Also, it does not change the type of a field to string that was – until now and by definition – clearly an int.

Further, it would open the possibility to just leave the value as it is for the foreseeable future, as it doesn't actually hurt to have both, does it?

@candlerb
Copy link
Contributor

I would like to throw in one other option: keep the values in the database as integers, but make them foreign keys into a separate enumeration table for each status type. This means that:

  • the values remain integers, and no migration is required
  • the values become user-manageable (you can add your own Status values) - this is a long-requested feature
  • validity is enforced by the database
  • the API can return both integer and slug, as suggested by @cimnine (which incidentally I agree with)

I believe this won't have any measurable impact on performance. The FK relationship creates an index on the tiny enum/slug tables only; it doesn't affect the structure of the main tables which link to them. And the small number of enum/slug rows will be cached in RAM.

@jeremystretch
Copy link
Member Author

This introduces the possibility for having at least a grace period, in which both - new and old - are available.

This would require NetBox to support both values in parallel. This is a nonstarter for me as that's simply not a reasonable burden to place on the maintainers: we have to pick one or the other.

This might have unforeseen consequences on any statically typed API clients. They have to be re-written (that's the small part) and all the code, that goes with them, as well.

I think this is a reasonable requirement. The most popular API client is pynetbox, and implementing this change should be pretty straightforward.

This introduces the possibility for having at least a grace period, in which both - new and old - are available.

This is simply delaying a breaking change. Long-term we want to keep the value field as slug is less intuitive. So, we either change value from an integer to a slug in one go, or we introduce slug first and then eventually rename it to value for the same end result. It's much easier to manage one change than two.

@jeremystretch
Copy link
Member Author

I would like to throw in one other option: keep the values in the database as integers, but make them foreign keys into a separate enumeration table for each status type.

The plan is to introduce enumerated choices for all ChoiceFields once we move to Django 3.0.

I believe this won't have any measurable impact on performance.

Introducing ForeignKey relationships incurs additional JOINs. This was the primary reason for using static values originally.

the values become user-manageable (you can add your own Status values) - this is a long-requested feature

I won't get into this as I've already provided my stance, but I will point out that we're talking about all choice fields, including those which have no reasonable use case for user-defined values, such as power phases or cable length unit.

@DanSheps
Copy link
Member

I definitely support this change, it would also allow us to take care of some "binary" fields (connected/planned).

@lampwins
Copy link
Contributor

For me, this is bringing up that we need to have a larger discussion around API versioning in general.

By definition this is a breaking change and absent either backward-compatible API versioning or a "grace period" in which the current integer values are supported, this puts certain classes of users in a tough spot. From the user's perspective, it is not as simple as saying "update your client."

That might be true if the only client in question is local ad-hoc scripts, but imagine cases where multiple other systems have been integrated with netbox. When the user wants to upgrade netbox, all of those clients have to be updated in sync, otherwise, things break with this kind of change. So if we don't allow a sort of grace period here, we force users into atomic upgrades across their automation infrastructures.

@jeremystretch
Copy link
Member Author

So if we don't allow a sort of grace period here, we force users into atomic upgrades across their automation infrastructures.

But no one is being forced to upgrade to v2.7 immediately. It's true that development stops on v2.6, but it should be reasonable to run a stable release of the v2.6 train for some time. It's a trade-off: make the necessary adaptations in order to get new features. While I believe every reasonable effort should be made to maintain backward compatibility for some period, everyone seems to have a different idea of what's reasonable.

Maybe instead of spending effort on ensuring some measure of backward compatibility between releases, we throw it out completely and instead begin supporting multiple trains. For example, keep releasing v2.6 versions with bug fixes only (no new features) in parallel with v2.7. When v2.8 comes out, kill v2.6 and change v2.7 to bug fixes only. IMO this is likely to be less of a burden for maintainers and more convenient for customers need more time to adapt.

@tyler-8
Copy link
Contributor

tyler-8 commented Oct 18, 2019

The problem still lies in the transition of dependent-services. If the solution is to "wait until everything supports the new scheme" then that means the release of N-number of new versions external scripts/apps/services at the exact same time as the Netbox upgrade. That's untenable for most environments.

to @lampwins point, if there was some sort of API versioning in place, for example v2.7.0 provided these two endpoints:

/api/dcim/interfaces/
/api/v2/dcim/interfaces/

That would allow each individual external app/service/script to migrate in its own time, on separate release windows from the Netbox upgrade. This is a far better experience all around and makes the upgrade process much easier to go through.

From a maintainer perspective, really it's just a separate API module at that point, the code for the two shall remain separate - and you only move new code to the api_v2.py module when there's actually a v2 - keeping it clear where things are.

jeremystretch added a commit that referenced this issue Nov 26, 2019
jeremystretch added a commit that referenced this issue Nov 26, 2019
jeremystretch added a commit that referenced this issue Nov 26, 2019
jeremystretch added a commit that referenced this issue Nov 26, 2019
jeremystretch added a commit that referenced this issue Nov 28, 2019
jeremystretch added a commit that referenced this issue Nov 28, 2019
jeremystretch added a commit that referenced this issue Nov 28, 2019
jeremystretch added a commit that referenced this issue Dec 10, 2019
jeremystretch added a commit that referenced this issue Jan 28, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Mar 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status: accepted This issue has been accepted for implementation type: housekeeping Changes to the application which do not directly impact the end user
Projects
None yet
Development

No branches or pull requests

9 participants