
feat(clustering) retiring Serf - new cache invalidation mechanism #2561

Merged
merged 12 commits into next from feat/retiring-serf on Jun 16, 2017

Conversation

thibaultcha
Member

@thibaultcha thibaultcha commented May 25, 2017

This rewrite makes Kong independent of Serf when dealing with cache invalidation events. It opens the way to #2355 and to a simpler installation and usage experience for Kong.

Motivation

As we know, Kong caches datastore entities (configuration entities mostly, such as APIs, Consumers, Plugins, Credentials...) to prevent unnecessary overhead and database round trips on a per-request basis. Up to and including Kong 0.10, Serf has been the dependency Kong relies upon to broadcast invalidation events to the other members of the cluster. This approach is not without compromises, the main ones being:

  • A dependency on third-party software (not Nginx/OpenResty related).
  • The need for a CLI to start and stop Kong (as opposed to the Nginx binary), because the Serf agent must be started as a daemon.
  • This makes deployments and containerization somewhat awkward.
  • Invoking Serf from a running Kong node requires some blocking I/O.

New Implementation

TLDR: database polling of a time-series entity called "cluster events".

Cluster events are implemented in both PostgreSQL and Cassandra. A Kong node can broadcast() an event containing data (text) on a channel, and other Kong nodes can subscribe() to a channel and receive said events at a configurable interval.
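
As an illustration, here is a hedged usage sketch of the new module; the constructor options and exact signatures shown below are assumptions, not the final API:

-- hypothetical usage of the new cluster_events module (names are illustrative)
local kong_cluster_events = require "kong.cluster_events"

local cluster_events = kong_cluster_events.new {
  dao           = dao, -- long-lived DAO instance (PostgreSQL or Cassandra)
  poll_interval = 5,   -- seconds between two polls
  poll_offset   = 0,   -- extra "propagation delay" for eventually consistent setups
}

-- subscriber side: the callback runs once (and only once) per event, per node
assert(cluster_events:subscribe("invalidations", function(data)
  ngx.log(ngx.DEBUG, "received invalidation event: ", data)
end))

-- broadcaster side: the event payload is plain text
assert(cluster_events:broadcast("invalidations", "plugins:123"))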

Here is a high-level, non-exhaustive list of changes:

  • New cluster_events.lua module. An abstract framework for both PostgreSQL and Cassandra to broadcast messages across the cluster.
  • As per our defined constant, events stay in the database for 1 day (TTL).
  • For Cassandra, our table contains events with a partition key made of the channel they are published to. More details in the related commit, but this means R/W operations on this table are quite efficient, but also that we can only handle roughly 2 billion events a day. TTLs are natively supported.
  • For PostgreSQL, we use a trigger on insert to handle the TTL of expired events.
  • Cassandra has pagination support when reading events, PostgreSQL doesn't (yet).
  • The interval at which events are polled is configurable (5 seconds by default).
  • To accommodate distributed environments with eventual consistency behaviors, we introduce a configurable "propagation delay" that postpones event handling until the data should have been propagated. This is an initial implementation.
  • The new mlcache.lua module is a temporary module awaiting the completion of lua-resty-mlcache, and potentially the replacement of lua-resty-worker-events.
  • A new cache.lua module replaces the old database_cache.lua module. It handles inter-worker and inter-node invalidation, and relies on mlcache.lua for better performance and more powerful features.
  • All entities accessed via database_cache.get_or_set() have been replaced.
  • We try to provide node-consistency (a request made on a node's Admin API is instantly reflected in that node's proxy server) for entities. Currently, this is only done for the router (and could be improved). Other entities like the balancer will need some refactoring to easily achieve this, and smaller, single entities will need an IPC module offering more consistency.
  • To generate the cache key of an entity, we introduce a new schema property in most entities: cache_key, which can be a table of multiple fields. This makes cache key generation based on one or several variables easier and safer (see the sketch after this list).
  • Cached entities now have a TTL of 1 hour (previously "forever"), and 5 minutes for a lookup miss. This is a change of default behavior, acting as a safeguard against potentially stale data.
  • As another safeguard against potentially stale data, the /cache endpoint on the Admin API has been kept.
  • We remove mediator_lua and use lua-resty-worker-events instead. The latter can do the job we used the former for, and has more features essential to our usage; keeping both would have been a duplicate/unnecessary dependency.
  • We created the skeleton for a pub/sub API to subscribe to CRUD operations on database entities, such as apis:updated or consumers:deleted. Plugins should be able to listen on those events and act accordingly.
  • With this change, Kong becomes purely stateless. There is no nodes table anymore storing each and every node currently connected to the cluster. This is by design for now, until the need to do so eventually arises.
  • Core entities invalidation tests have been moved to a new test suite: spec/04-invalidations. They are written in a more efficient manner than the previously existing tests, avoiding too many "sleep" calls (NOP) to wait for event propagation. This is thanks to the new design, and will be improved later on with an upcoming IPC module ensuring inter-worker consistency for each request.
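
Here is the sketch referenced above for the new cache_key schema property; the schema below is a simplified, hypothetical example and the derived key format is an assumption:

-- hypothetical schema declaring a composite cache key
local keyauth_credentials_schema = {
  table = "keyauth_credentials",
  cache_key = { "key" }, -- one or several fields making up the cache key
  fields = {
    id          = { type = "id" },
    consumer_id = { type = "id", foreign = "consumers:id" },
    key         = { type = "string", unique = true },
  },
}

-- the DAO can then derive a deterministic cache key from those fields,
-- e.g. something along the lines of "keyauth_credentials:<key>"
local cache_key = dao.keyauth_credentials:cache_key("my-api-key")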

Path forward

This should be reviewed and heavily tested. There should be some more tests written, especially around the cluster_events module itself. Particular attention should be given to the invalidation of core entities; any attention given there would be greatly appreciated.

Reviews and comments are welcome, questions as well! Local testing of this would be extremely helpful.

Regarding CI, the Travis builds might still fail at this point, partly for unrelated reasons (non-deterministic tests), partly because this is still in a "polishing" state.

Here is a small list of leftover todos:

  • Run the new invalidation test suite on Travis CI
  • Find a way to test the quiet option of the DAO (those tests are currently commented out for linting reasons)
  • Write more tests for the cluster_events module with both databases
  • Audit the lua-resty-lock options of the mlcache module.

@thibaultcha thibaultcha added this to the 0.11.0 milestone May 25, 2017
@thibaultcha thibaultcha self-assigned this May 25, 2017
@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch from deee788 to 689ae8c on May 25, 2017 22:52
end

if not ngx_debug and _init then
return error("kong.cluster_events was already instanciated")
Contributor

Reading through and caught this:

instanciated -> instantiated

Contributor

@p0pr0ck5 p0pr0ck5 left a comment

First round of comments, not enough time to walk deeply through everything today but wanted to get some thoughts on paper. Also, I notice several integration tests are failing consistently in a local env based on commit 689ae8c: https://gist.github.com/p0pr0ck5/6b31ddbcdf3a941aaccd3bb9314758e5

local DEBUG = ngx.DEBUG


local SHM_CACHE = "cache"
Contributor

let's use a more narrowly namespaced name

Member Author

@thibaultcha thibaultcha Jun 2, 2017

This is not a namespace prefix, but the name of the shm entry, which is defined currently already in the Nginx config.

Contributor

yes, "namespace" is colloquial, referring to usage within the shm. it may be possible that other users want to use this shm? we should be a good citizen imo and be clear about what our keys are doing in the shm.

Member Author

@thibaultcha thibaultcha Jun 16, 2017

Alright, let's take advantage of the fact that 0.11 will have many updates to the Nginx config template already to choose a better name, and group the breaking changes in one.

I suggest we prefix all of our shms with kong_*. How about: kong_db_cache for this one?

Contributor

sgtm!

Member Author

Cool. Let's do that right after the merge, to have one single commit taking care of them all at once, to preserve atomicity then.

@@ -10,6 +10,7 @@ local function execute(args)
end

pcall(stop.execute, args)
ngx.sleep(0.1)
Contributor

this change seems.... tangentially related? a code comment about what's going on here (if it's necessary) would be welcome.

Member Author

I have to do a small update here, ignore.

Member Author

Now properly implemented with a kill -0 loop to make sure Nginx is properly terminated before trying to start it again.

kong/kong.lua Outdated
@@ -145,23 +132,18 @@ function Kong.init()
local conf_path = pl_path.join(ngx.config.prefix(), ".kong_env")
local config = assert(conf_loader(conf_path))

local events = Events() -- retrieve node plugins
local dao = assert(DAOFactory.new(config, events)) -- instanciate long-lived DAO
local dao = assert(DAOFactory.new(config)) -- instanciate long-lived DAO
Contributor

instanciate -> instantiate

dao_factory.db_type)
end

-- instanciation
Contributor

instantiation

local _, err = cache.sh_incr(cache_key, value)
if err then
ngx_log("[rate-limiting] could not increment counter for period '"..period.."': "..tostring(err))
local ok, err = shm:add(cache_key, 0, EXPIRATIONS[period])
Contributor

so, fun story: when you call shm:incr on a value, regardless of initial state, the node's expires member is unconditionally set to 0. for this current case, this call to add is meaningless, and we can stick with a single call of incr(cache_key, value, 0) below. this leaves us with the problem of keys that are only removed via LRU. :|

See openresty/lua-nginx-module#942 for more context
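
For reference, a minimal sketch of the two patterns being discussed, assuming lua-nginx-module >= 0.10.6 (where incr() accepts an init argument); the shm name is hypothetical:

local shm = ngx.shared.kong_rate_limiting -- hypothetical shm name

-- current pattern: seed the key with a TTL, then increment it. Per
-- openresty/lua-nginx-module#942, incr() unconditionally resets the key's
-- expiration, so the TTL passed to add() is effectively lost.
local ok, err = shm:add(cache_key, 0, 60)
if not ok and err ~= "exists" then
  ngx.log(ngx.ERR, "could not seed counter: ", err)
end
local newval = shm:incr(cache_key, 1)

-- suggested pattern: a single incr() with an init value. The add() above
-- becomes unnecessary, but the key is then only ever removed via LRU eviction.
newval, err = shm:incr(cache_key, 1, 0)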

Member Author

I remember this issue being raised already. We could update this in this PR, or in another one :)

local remaining_ttl = ttl - (now() - at)

-- value_type of 0 is a nil entry
if value_type == 0 then
Contributor

i wonder if we should add an unmarshaller for nil types, so we avoid a branch here? such a dispatch could simply just return nil itself, so we would return the return signature of remaining_ttl, nil, nil. my guess is we'll have quite a lot of negative caches, so this branch becomes fairly unbiased (though this is a fairly subjective assumption).

Member Author

I would not address this here, but in the actual lua-resty-mlcache repo

local CURRENT_AT_KEY = "cluster_events:at"


local POLL_LOCK_EXPTIME = 60 * 5 -- 5 mins
Contributor

should we derive this from something configurable, such as db_update_frequency, rather than a hardcoded value?

local setmetatable = setmetatable


local LOCK_KEY_PREFIX = "lua-resty-mlcache:lock:"
Contributor

pedantic: do we call this key lua-resty-mlcache when this lib is not technically lua-resty-mlcache?

Member Author

Yeah, I think that's fine, maybe we'll get to replacing this file with the upstream mlcache lib


-- still not in shm, we are responsible for running the callback

local ok, err = pcall(cb, ...)
Contributor

let's say something super crappy happens here, like the callback causes the worker to crash. would we end up with a stale lock in the shm? is this a case for making the resty.lock timeout configurable, or at least thinner than the default (30 secs)?

Member Author

This is why this PR says we need to audit those settings, another small adjustment to make.

Member Author

This is why there was a timeout value in the original implementation. In the current one, with a double lock mechanism, there is a timeout for both locks as well, in case such a scenario happens.

cache:invalidate("parsed_ssl_certificates:" .. sni.name)
end
end, "crud", "ssl_certificates")

Contributor

style: missing second whitespace line here

Member

@Tieske Tieske left a comment

another incomplete review. But it's a start.

Unless I'm mistaken we're using a single cache, right? Will we be making entity-specific caches (in Lua land) later? E.g. grabbing an api from an lru cache that also holds 10000 consumers takes longer than it would if we split them up.
Using multiple caches might require a different cache-key setup, as it would also need to carry the info about which entity (which cache) it is about. If so, we must make sure that the new cache-key supports that way of working.

Similar for wrapping entity-specific code. E.g. consumer fetching is duplicated all over the codebase, whereas we could implement a single module for consumers that would have get_consumer_by_uuid as a method and would hide all underlying invalidation, events, caches, etc. I guess that's another refactor after this one, but maybe good to agree on it here.

#db_cache_miss_ttl = 300 # Time-to-live (in seconds) of a miss from the
# datastore. Such misses can be cached for a
# different period of time than hits
# (configurable via `cache_ttl`).
Member

this is not necessary, as long as we consistently implement the "add" hook/event for each entity. Invalidation upon adding will remove the cache-miss-sentinel.

Member Author

Both of those TTL values are a safeguard against eventually stale data. It's a very bad idea imo to not have a TTL in our cache, and to solely rely on the invalidation events being called. It could go wrong, somehow...

Member

yes, but we don't need another one for the cache-miss, just use the regular one, as it is only a safeguard.

Member Author

I'd be fine with not having it. It's just that they offer better granularity and are more future-proof, with minimum cognitive overhead from a user friendliness perspective...

Member

thinking about this again: shouldn't there be a setting that determines how far back events are being dealt with? Then the whole expiry thing is not necessary, right? If a case occurs where events are missed, then increasing that setting should suffice. If not, there is a serious bug somewhere. Correct?

Member Author

Removed this setting to rely on db_cache_ttl in the latest version of this branch.

# of an entity.
# Single-datacenter setups or PostgreSQL
# servers should suffer no such delays, and
# this value can be safely set to 0.
Member

I do not understand what this setting does?

Member Author

Updated the description to clarify the meaning

return
end

local ok, err = timer_at(self.poll_interval, poll_handler, self)
Member

a probable non-problem here is that the self passed here will never be GC'ed. So if this self object might be renewed and disposed of during the lifetime of the worker process, then there is a problem; otherwise not.

Member Author

Yeah it should be fine, because we ensure that instances of this module are singletons.

-- unlock

self.shm:delete(POLL_LOCK_KEY)
end
Member

if you delete the lock, then another worker might do a poll in between, hence shortening the poll interval. Or am I missing something?

t in seconds;
t1 - worker 1 does a poll
t1.5 - worker 1 completes the poll, deletes the lock
t3 - worker 2 tries the lock, gets it and starts a poll
t3.5 - worker 2 completes the poll
t6 - worker 1 tries again (every 5 seconds)
etc.

the above will have done 2 polls in 5 seconds. Correct?

Possible way of working:

  • all workers start polling at 0.1 second intervals
  • worker starting a poll acquires lock, with a timeout of 5 seconds
  • does the poll thing, but never unlocks.
  • all other workers will not get the lock and hence not poll
  • first worker to try after the lock expired will do the next poll
    etc.
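
For illustration, a rough sketch of that scheme, assuming polling happens in a recurring ngx.timer kicked off from init_worker, and that an shm zone is available for the lock (key names and intervals are illustrative):

local POLL_INTERVAL  = 5   -- seconds between effective polls
local RETRY_INTERVAL = 0.1 -- how often each worker attempts the lock

local function try_poll(premature, self)
  if premature then
    return
  end

  -- add() is atomic across workers: only one worker gets the lock, and it
  -- expires on its own after POLL_INTERVAL (it is never unlocked manually)
  local ok, err = self.shm:add("cluster_events:poll_lock", true, POLL_INTERVAL)
  if ok then
    self:poll() -- this worker runs the actual poll
  elseif err ~= "exists" then
    ngx.log(ngx.ERR, "could not acquire poll lock: ", err)
  end

  -- every worker keeps retrying at a short interval
  assert(ngx.timer.at(RETRY_INTERVAL, try_poll, self))
end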

Contributor

An alternative approach could be to leverage the flags portion with a number representing ngx.now() + poll_interval, and the lock can only be acquired when the poller's ngx.now() is greater than such a value. But this becomes problematic with poll intervals < 1s, and feels kinda fishy.

Alt-alternatively, we can solve the problem of misaligned workers as @Tieske described by forcing each execution to a floor period of time modulo poll interval, e.g., with an interval of 5s, each worker is forced to run at t1, t6, t11, etc., by calculating the timer_at as a diff from now() and the next floor up. Of course, both of these scenarios imply that we care about explicitly and consistently running polls exactly at every poll_interval seconds. Do we?

Member Author

@Tieske The behavior you described was the one in place before a short review by @p0pr0ck5 a few weeks ago, in a private fork. Off the top of my head, I forget what the other, original issue was, but I will try to find it; if there is none, then yes, we can come back to this behavior, which also has the benefit of a smaller cognitive load.

Member Author

Ok, I asked @p0pr0ck5 to remind me: it was that if a poll takes longer to execute than the lock exptime, we'd have 2 timers running a poll at the same time.

Here is a 3rd alternative: use 2 locks, one with an exptime, and one removed by the current worker when the poll is complete, and only allow a worker to run the poll if both locks are available.

Member

it was that if a poll takes longer to execute than the lock exptime, we'd have 2 running timers with poll at the same time.

Yes, that is the drawback indeed.

use 2 locks, one with an exptime, and one removed by the current worker when the poll is complete, and only allow a worker to run the poll if both locks are available.

I like this one best, as that would be the more elegant solution.

Now what are sensible defaults then? Lock 1 has a 5 second timeout (causing a 5 second interval). Lock 2 has a 30 second (??) timeout as a safety feature? And only being able to acquire 1 lock issues a warning in the logs, right?

How often will each worker try to acquire the locks to do a poll? 1/10th of the poll interval, e.g. 5 seconds => 0.5 seconds. Then the poll interval will be between 5 and 5.5 seconds. With 4 workers, the average interval time would then be 0.5/4 + 5 = 5.125 seconds.
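
A hedged sketch of what that double-lock mechanism could look like (shm key names, timeouts, and the surrounding structure are assumptions):

local INTERVAL_LOCK = "cluster_events:poll:interval" -- expires on its own
local RUNNING_LOCK  = "cluster_events:poll:running"  -- deleted when the poll ends

local function poll_once(self)
  -- lock 1 enforces the poll interval; it expires on its own and is never
  -- deleted manually
  local ok, err = self.shm:add(INTERVAL_LOCK, true, self.poll_interval)
  if not ok then
    if err ~= "exists" then
      ngx.log(ngx.ERR, "could not acquire interval lock: ", err)
    end
    return
  end

  -- lock 2 guards against a poll outliving lock 1; a generous exptime
  -- (30s here, as a placeholder) acts as a safety net if the worker crashes
  ok, err = self.shm:add(RUNNING_LOCK, true, 30)
  if not ok then
    if err == "exists" then
      ngx.log(ngx.WARN, "a previous poll is still running; skipping this one")
    else
      ngx.log(ngx.ERR, "could not acquire running lock: ", err)
    end
    return
  end

  self:poll()
  self.shm:delete(RUNNING_LOCK) -- only released once the poll completed
end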

Member Author

Switched to use a double lock mechanism now

@subnetmarco
Member

Turns out that we may want to keep Kong nodes registering to the database, and having node health-checks, to show this information in our other products.

@thibaultcha
Member Author

thibaultcha commented Jun 2, 2017

I know, this can come in another PR though, with more meaningful data and a better and more modularized implementation.

@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch 2 times, most recently from 993a0e5 to 5939253 on June 7, 2017 00:45

local entity_channel = fmt("%s", data.schema.table)
local entity_operation_channel = fmt("%s:%s", data.schema.table,
data.schema.operation)
Member Author

Big mistake. This should be data.operation. This is one of the things we'll need tests for once the design is ok for everybody.

@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch from 5939253 to 09ac0c7 on June 7, 2017 01:01
@mwaaas

mwaaas commented Jun 9, 2017

Is there an ETA for this ("With this change, Kong becomes purely stateless.")? It is currently blocking us from using it in eb docker.

@mwaaas mwaaas left a comment

Not seeing documentation for how to use the new cache strategy.

@p0pr0ck5
Contributor

p0pr0ck5 commented Jun 9, 2017

@Tieske can you help me understand your meaning behind this comment:

e.g. grabbing an api from an lru cache that also holds 10000 consumers takes longer than it would if we split them up.

LRU cache operations are O(1). There is no performance gain by using multiple LRU caches (if anything, it's just more complexity in management).

@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch 11 times, most recently from 62fbe3f to 2cd2b6c on June 15, 2017 17:26
@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch 2 times, most recently from 02b7ef4 to 0c67b92 on June 16, 2017 21:24
This module is an abstract framework that allows for broadcasting and
subscribing (pub/sub) to events across all Kong nodes connected to the
same database. It works for both PostgreSQL and Cassandra.

It consists of time-series entities stored as rows inside the Kong
database. This module allows for several "channels" (although we will
most likely only use one as of now, the "invalidations" channel).

For Cassandra, each channel is a separate row in the database. The user
should have appropriate replication factor settings to ensure
availability. This makes insertion and retrieval of this time series
data particularly efficient. For Postgres, each event is its own row. We
create the necessary indexes for performance reasons. The low
cardinality of the "channels" column is helpful for such indexes.

An "event" is pure text. How said text is serialized (if complex schemas
are required) is the sole responsibility of the user.

An "event" is run once and only once per node. There is an shm tracking
the already ran event (via a v4 UUID key) for a given Nginx node.

"events" are periodically polled from a single worker via a recurring
`ngx.timer`. There is no "special worker" responsible for such polling,
to prevent a worker crash from corrupting a running node (if it stops
polling for events).

To accommodate distributed systems (such as a multi-datacenter Cassandra
cluster), the "events" are polled with a "backwards offset" (the last
60s, for example) to ensure none were missed because they were
replicated too late. This means we fetch the same events several times,
but they will only run once anyway, as previously mentioned.
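
To make the "backwards offset" concrete, here is a hedged sketch of what a poll could look like; every name, field, and the bookkeeping below are illustrative, not the actual module code:

local EVENT_IDS_TTL = 2 * 60 -- remember run events longer than the offset

local function poll(self)
  local now = ngx.now()

  -- look slightly further back than the last poll, in case some events
  -- were replicated (and hence became visible) late
  local min_at = self.last_poll_at - self.poll_offset -- e.g. 60 seconds

  for _, event in ipairs(self:fetch_events("invalidations", min_at, now)) do
    -- the shm remembers already-run event ids (v4 UUIDs), so an event
    -- fetched more than once still runs only once on this node
    if self.shm:add("cluster_events:ids:" .. event.id, true, EVENT_IDS_TTL) then
      self.callback(event.channel, event.data)
    end
  end

  self.last_poll_at = now
end
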
@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch 2 times, most recently from 4c39ac6 to 944bad5 on June 16, 2017 21:30
This module is to replace the current database_cache.lua module with a
nicer and more powerful interface. This module is currently being
developed as a separate, generic lua-resty library and hence no tests
from this library were inserted into Kong's test suite.

This module is to disappear in favor of a dependency on the said
lua-resty-mlcache library (once completed).
This module takes advantage of the new `kong.cluster_events` and
`lua-resty-mlcache` modules to build a more powerful caching mechanism
for Kong configuration entities (datastore entities).

We support inter-worker and inter-node invalidations (via
`lua-resty-worker-events` and `kong.cluster_events`), TTL, infinite TTL,
negative TTL (cache misses), better error handling, and better overall
performance. It also supports a more intuitive cache "probing"
mechanism.
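
As an illustration of the intended call pattern, here is a hedged sketch; the option names and the exact cache:get() signature are assumptions based on this description:

-- `dao`, `cache` and `responses` would come from the surrounding plugin
-- context (e.g. kong.singletons / kong.tools.responses)
local function load_credential(key)
  -- only runs on a cache miss; the returned value is cached for `ttl`,
  -- and a nil return is negatively cached for `neg_ttl`
  local rows, err = dao.keyauth_credentials:find_all { key = key }
  if err then
    return nil, err
  end
  return rows[1]
end

local credential, err = cache:get("keyauth_credentials:" .. key,
                                  { ttl = 3600, neg_ttl = 300 },
                                  load_credential, key)
if err then
  return responses.send_HTTP_INTERNAL_SERVER_ERROR(err)
end

-- invalidation propagates across workers (lua-resty-worker-events) and
-- across nodes (kong.cluster_events)
cache:invalidate("keyauth_credentials:" .. key)
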
@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch 3 times, most recently from 05fb389 to 8740757 on June 16, 2017 21:56
We take advantage of the new `kong.cache` module to build a better
configuration entities caching mechanism, with invalidation events that
do not use the locally running Serf agent.

This refactors all caching branches of the core entities (APIs,
Consumers, Plugins, Credentials...) and provides more reliable and
performant integration tests.

We purposefully do not remove any Serf-related code. At this stage, 2
invalidation mechanisms are running side-by-side.

Additionally, this takes advantage of `lua-resty-worker-events`
supporting inter-worker and intra-worker communication to build the
foundation of a "event-driven" handlers (such as being able to
subscribe to CRUD operations in the configuration entities, for future
plugins and such). `lua-resty-worker-events` is - unfortunately -
*eventually consistent* across workers, and will *eventually* either be
improved, or replaced with a consistent - and more performant -
implementation (one is currently being developed as we speak, but future
OpenResty work may also open the way to proper IPC via cosockets).
We simply replace the underlying `database_cache.lua` dependency with
our new `kong.cache` module. There is no distinction between Lua-land
and shm cache anymore. The shm cache *is considered* the source of truth
for whether a value is cached or not because:

1) the LRU, Lua-land cache is a convenience cache (emphasized by its
   size unit not being defined in memory, and potentially being smaller
   than the shm).
2) shm and LRU lookups are of the same order of magnitude, whereas a DB
   lookup is significantly more expensive.
Like our previous work to use the new `kong.cache` module for core
entities instead of `database_cache.lua`, this takes care of it for all
entities used in plugins (theoretically not part of Kong core). It is
also considered a separate unit of work, for the sake of limiting the
patch size of a single commit and of easing future archaeology work.
Remove *all* reference to Serf from our various components: datastore,
CLI (commands), core, configuration, and Admin API. We also stop
spawning Serf as a daemon from the CLI, and remove commands that don't
make sense anymore (such as checking for daemon processes running).

From now on, only our new cache invalidation events mechanism is
running.

A follow-up commit will take care of introducing a migration removing
our `nodes` table in the datastore.

From now on, the need for a CLI to start and stop Kong is drastically
reduced, almost nullified: assuming the Nginx configuration is written
and exists in the prefix, Kong can start via the Nginx binary directly,
and be the foreground process in container deployments.

This opens the way to better Nginx integration, such as described in

:fireworks:
3 new properties now define the behavior of the new invalidation events
framework and of configuration entity caching. Their role and behavior
are detailed in this commit, as part of the default configuration file.
@thibaultcha thibaultcha force-pushed the feat/retiring-serf branch 2 times, most recently from 68a9bb2 to dd6f5f8 on June 16, 2017 22:56
When using Cassandra, one must not leave `db_update_propagation` at
its default of 0, at the risk of missing invalidation events.

This logs a warning message in the CLI. The "special format" that
warrants a timestamp, a log level, and increased visibility of a log in
the CLI was downgraded to "warn".

In the future, we will need to make this logging facility log in ngx_lua
with the proper level.
Contributor

@p0pr0ck5 p0pr0ck5 left a comment

YOLO

@thibaultcha thibaultcha merged commit 055e3af into next Jun 16, 2017
@thibaultcha thibaultcha deleted the feat/retiring-serf branch June 16, 2017 23:45
@olderwei

Since version 0.11 has removed the kong cluster command, how do I know which nodes make up a cluster?

@thibaultcha
Member Author

@olderwei In 0.11 you cannot: all Kong nodes are purely stateless. The kong cluster command was only useful for debugging purposes when a Kong node's Serf agent seemed to not be receiving events. You should use orchestration tools to manage your cluster and have visibility into its current state.

@olderwei

@thibaultcha so do you have any recommendations for orchestration tools?

Labels
pr/ready (but hold merge) No more concerns, but do not merge yet (probably a conflict of interest with another PR or release)