Fix origin reactivation logic for admin interface #561

chrisgresty · 2019-12-10T09:11:05Z

(See #518 )
Reactivation of an origin should follow the existing logic as described here for the Origin Toggle: https://github.com/HotelsDotCom/styx/blob/master/docs/user-guide/admin-interface.md

i.e. if health checks are enabled then a reactivated origin goes into an unreachable state, and it is the health check service that moves it into an active state.

mikkokar · 2019-12-11T08:55:57Z

components/proxy/src/main/kotlin/com/hotels/styx/services/HealthCheckMonitoringService.kt

@@ -116,6 +116,11 @@ internal class HealthCheckMonitoringService(
        futureRef.get().cancel(false)
    }

+    fun isRunning() : Boolean {
+        val executor = futureRef.get() ?: return false
+        return !executor.isCancelled && !executor.isDone


Minor comment, but it is a ScheduledFuture that is stored in futureRef, not an executor.

A similar comment to below about taking a more functional approach can be made here, too, which removes the name of that val altogether.

mikkokar · 2019-12-11T09:13:06Z

components/proxy/src/main/kotlin/com/hotels/styx/services/OriginsAdminHandler.kt

+                    && (provider.styxService as HealthCheckMonitoringService).isRunning()
+        } ?: return false
+        return true
+    }


I would use more "functional" approach:

This eliminates three return statements

Therefore might be bit easier to follow?

Bit more idiomatic in relation to our existing Kotlin codebase?

private fun hasActiveHealthCheck2(origin: RoutingObjectRecord) = lbGroupTag .find(origin.tags) ?.let { appId -> serviceDb.entrySet() .firstOrNull { (_, provider) -> provider.type == HEALTH_CHECK_MONITOR && provider.tags.contains(targetTag(appId)) && (provider.styxService as HealthCheckMonitoringService).isRunning() } } == null

But this is just my personal preference.

Here is some more info about Kotlin scope functions: https://kotlinlang.org/docs/reference/scope-functions.html

mikkokar · 2019-12-11T09:38:56Z

components/proxy/src/test/kotlin/com/hotels/styx/services/OriginsAdminHandlerTest.kt

+            expectStateChange(STATE_INACTIVE, STATE_ACTIVE, STATE_UNREACHABLE, false, "app.origin.hc.running")
+        }
+
+        // TODO: Complete this test


3xTODOs here. Are these still work in progress? Or just old comments?

Ah, these are old. Will remove.

mikkokar · 2019-12-11T09:45:02Z

components/proxy/src/test/kotlin/com/hotels/styx/services/OriginsAdminHandlerTest.kt

+        // TODO: Complete this test
+        scenario("Activating a healthchecked (but the healthchecker is stopped), inactive origin results in an unreachable state") {
+            expectStateChange(STATE_INACTIVE, STATE_ACTIVE, STATE_ACTIVE, false, "app.origin.hc.stopped")
+        }


Please remind me, when do we end up stopping a health checker (if that is active)?

In practice, only when the origins config is updated to remove healthchecking configuration for an origin altogether. Which does mean that this case is unlikely. But I was bothered by the fact that the service does support being stopped, and if it was, then there would be nothing to move the state of the origin from unreachable to active. So, straying into YAGNI territory a bit, I guess. Happy to excise this if you reckon it's not needed.

So, straying into YAGNI territory a bit, I guess. Happy to excise this if you reckon it's not needed.

No this is an important consideration. Just that this scenario didn't immediately occur to me.

Based on your response, I understand there are two possible scenarios:

Health check configuration is removed from origins configuration.

Origins configuration remains unchanged, but somebody stops the service over the admin interface?

Case 1: Removal of health check configuration only:

Looking at code, in this case the host proxy is re-tagged with STATE_ACTIVE. Which seems correct.

Looking at updateRoutingObjects in YamlFileConfigurationService, I do see a possible bug. The update function doesn't take the tags into account. Suppose the tags alone change, but object's YAML configuration remains unchanged. This code doesn't seem to update the underlying routing object record in this case: WDYT?

objectDefs.forEach { objectDef -> routeDb.compute(objectDef.name()) { previous -> if (previous == null || changed(objectDef.config(), previous.config)) { // Possible optimisation: check if the tags alone changed, or if the configuration changed too. previous?.routingObject?.stop() converter.routingObjectRecord(objectDef) } else { previous } } }

Case 2: Looking at code, it removes the health check tag, and sets state=active.

Are there any other scenarios we'd have to take into account?

The update function doesn't take the tags into account.

One side-effect of doing so is that, even if you haven't actually changed the tags in the config file, but the tags have changed at runtime, then you will reset the tags to the configured set each time you reload the origins file. So if you've deactivated an origin via the admin interface, you will reactivate it by reloading the origins file, even if you didn't change anything about that origin in the origins file.

The current implementation also resets the tags, but only if there has been a config change for that origin. Which seems a bit more reasonable, even if not perfect.

chrisgresty added 2 commits December 10, 2019 09:01

Fix origin reactivation logic for admin interface

322185a

Trigger travis rebuild

0c35732

mikkokar reviewed Dec 11, 2019

View reviewed changes

mikkokar approved these changes Dec 11, 2019

View reviewed changes

More functional implementation

ed35a9a

chrisgresty merged commit 619e6fc into ExpediaGroup:master Dec 11, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix origin reactivation logic for admin interface #561

Fix origin reactivation logic for admin interface #561

chrisgresty commented Dec 10, 2019

mikkokar Dec 11, 2019

chrisgresty Dec 11, 2019

mikkokar Dec 11, 2019

mikkokar Dec 11, 2019

mikkokar Dec 11, 2019

chrisgresty Dec 11, 2019

mikkokar Dec 11, 2019

chrisgresty Dec 11, 2019

mikkokar Dec 11, 2019

chrisgresty Dec 11, 2019 •

edited

Loading

chrisgresty Dec 11, 2019

Fix origin reactivation logic for admin interface #561

Fix origin reactivation logic for admin interface #561

Conversation

chrisgresty commented Dec 10, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chrisgresty Dec 11, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

chrisgresty Dec 11, 2019 •

edited

Loading