Skip to content

Conversation

@acrmp
Copy link
Contributor

@acrmp acrmp commented Dec 14, 2022

Thanks for contributing to cloud_controller_ng. To speed up the process of reviewing your pull request please provide us with:

  • A short explanation of the proposed change:

⚠️ Requires Diego changes to be completed, merged and released ⚠️

Modifies the cloud controller to send a DesiredLRPUpdate with updated metric tags to Diego when an application is renamed.

  • The diego client will be using the new /v1/desired_lrp/update.r1 BBS endpoint.
  • The protobuf DesiredLRPUpdate message now includes the metric tags.
  • If the app is started we issue a DesiredLRPUpdate message for each process on the app.
  • If the app is staging we don't issue the DesiredLRPUpdate. We hadn't thought too hard about this so let us know if this makes sense or not.
  • We don't issue the DesiredLRPUpdate when the v2 API is used. The sync should take care of this with a small delay.

We noticed that the CAPI pipeline uses protoc v3.6.1 which is quite an old version. We used that version to regenerate the protobuf client code in this PR but it may be worth considering upgrading. We also wondered if the associated concourse task script should include set -e.

  • An explanation of the use cases your change solves

Pending changes to Diego, this should allow renames to take effect without requiring an application restart.

  • Links to any other associated PRs

cloudfoundry/diego-release#662

  • I have reviewed the contributing guide

  • I have viewed, signed, and submitted the Contributor License Agreement

  • I have made this pull request to the main branch

  • I have run all the unit tests using bundle exec rake

  • I have run CF Acceptance Tests

@Gerg
Copy link
Member

Gerg commented Dec 14, 2022

Without looking at this PR in much detail, it appears y'all implemented this differently than I would have.

I would have expected the change to go through the ProcessObserver. For more (overwhelming) detail, see this wiki page.

Happy to chat more about this, if y'all want to.

@acrmp
Copy link
Contributor Author

acrmp commented Dec 15, 2022

@Gerg Thanks. We chatted about this with @sethboyles and @dalvarado and agreed to modify the action rather than changing process observer. @sethboyles liked having the logic in the action and it wasn't clear how to access the app name change from process observer.

@acrmp
Copy link
Contributor Author

acrmp commented Dec 15, 2022

@sethboyles @dalvarado Thanks for the feedback yesterday. We pushed an additional commit d953945 to move the Desired LRP Update outside of the database transaction.

@acrmp
Copy link
Contributor Author

acrmp commented Dec 15, 2022

@sethboyles @dalvarado We pushed another commit 1afdda2 to not refer to a new revision of the desired lrp update endpoint. We agreed with @winkingturtle-vmw that introducing a new revision wasn't necessary.

@ctlong ctlong force-pushed the pr/update-metric-tags-on-rename branch from 1afdda2 to b5365dc Compare January 19, 2023 20:03
@acrmp
Copy link
Contributor Author

acrmp commented Jan 30, 2023

@sethboyles We saw that the changes in this PR currently aren't sufficient to ensure that the metric tags are updated when the initial call from CAPI to the BBS fails.

The processes sync job uses the process updated_at timestamp to determine whether it needs to update the diego process, and this hasn't changed when an app is renamed.

  1. One alternative is to update the timestamp on associated processes when renaming the app so that the convergence loop will notice and synchronise with the BBS. This can be as simple as calling save on each process, though it seems strange to call it just for the side effect of updating the updated at timestamp. Looking into testing this approach and asserting that the process timestamp is updated in the database we also saw that it doesn't play nicely with Timecop because we have a sequel current_timestamp extension that will ignore the Timecop clock.

  2. Another alternative might be to change the processes sync job to also refer to the app updated_at when determining whether to update the process at the BBS. This wouldn't require the process timestamps to be touched but we're not sure if there are other app updates that we might want not to cause updates to the BBS.

Do y'all have a preferred option?

@Gerg
Copy link
Member

Gerg commented Jan 30, 2023

If you do end up changing the sync logic, I'll draw your attention to https://github.com/cloudfoundry/sync-integration-tests.

@acrmp
Copy link
Contributor Author

acrmp commented Jan 31, 2023

Another alternative might be to change the processes sync job to also refer to the app updated_at when determining whether to update the process at the BBS.

Though thinking about it that might require saying where annotation < app updated_at which means we'd be going from treating the annotation as an opaque field to requiring it to be an epoch time. I don't think I like that.

@acrmp
Copy link
Contributor Author

acrmp commented Feb 1, 2023

We pushed a commit that updates the process timestamps when renaming an app to ensure that convergence takes place.

We're not entirely happy with the sleep added to the tests. We had trouble getting Timecop to play nicely with the Sequel current_datetime_timestamp extension, if you have an alternative approach to implementing this that would be great too.

@Gerg
Copy link
Member

Gerg commented Feb 1, 2023

I think TimeCop is not working because the Sequel plugin uses database-native time functions.

You probably need to spy/mock things at a higher level (before it issues the DB query).

@acrmp
Copy link
Contributor Author

acrmp commented Feb 2, 2023

We pushed another commit with a different testing approach using doubles instead. We get to remove the sleep but the test is possibly not quite as good.

acrmp and others added 5 commits February 14, 2023 22:39
cloudfoundry/diego-release#662

Co-authored-by: Rebecca Roberts <robertsre@vmware.com>
Bumping the endpoint revision was determined not to be necessary, see
this PR for more context:
cloudfoundry/bbs#51

cloudfoundry/diego-release#662
- The process timestamp is compared against the Diego LRP annotation for
  convergence
- Save the processes associated with the app so that the timestamp is
  updated and convergence will take place
- The tests added in this commit use sleep (which we're not happy about)
  but we had difficulty setting an old timestamp due to the Sequel
  current_datetime_timestamp extension

Signed-off-by: Rebecca Roberts <robertsre@vmware.com>
- Use doubles instead of real models to avoid sleeping in the tests

Co-authored-by: Andrew Crump <crumpan@vmware.com>
@acrmp acrmp force-pushed the pr/update-metric-tags-on-rename branch from a8a31cf to aba8ec2 Compare February 15, 2023 01:10
@evanfarrar
Copy link
Member

It sounds like the strategy from earlier guidance has led y'all down a fine path.

LGTM, for now

@acrmp acrmp marked this pull request as ready for review February 22, 2023 21:31
@ctlong
Copy link
Member

ctlong commented Mar 2, 2023

Please feel free to squash merge this PR if y'all want to keep your history more clean.

That would get rid of some of the back and forth commits that undo each other lol.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants