Explore storage plugins #1681

gouthamve · 2019-09-18T15:02:18Z

So there are 4 issues open to support new storage types:

As usage grows, I can only imagine more requests coming in. Further, we're reusing Cortex in Loki a fair bit, and we're finding that the storage trade-offs in Loki and Cortex are different. I'm seeing some Loki implementation details leak into Cortex.

I think we need a plugin system that lets new storage code live outside the Cortex tree. The first obvious candidate is the gRPC plugin system from HashiCorp, which has been used for similar storage plugins in Jaeger.

bboreham · 2019-09-27T17:05:23Z

I was looking for pros and cons of the Hashicorp approach, and found this: jaegertracing/jaeger#422 (comment), linking to some Go issues including golang/go#20481

So I won't be advocating to use Go plugins.

sharkymcdongles · 2019-10-31T13:43:39Z

"While the plugin system is over RPC, it is currently only designed to work over a local [reliable] network. Plugins over a real network are not supported and will lead to unexpected behavior."

jtlisi · 2019-10-31T15:35:38Z

I don’t think we should use plugins at all. Instead, I propose we write a gRPC service definition for the chunk/series store and pair it with a generic client. That way anyone can implement any backend by writing a gRPC-based service.

I should clarify: my main issue with plugins is that they blur the boundaries of the application and make performance monitoring and tracing harder. Better to just break the storage off into a separate service.

pracucci · 2019-10-31T15:37:44Z

👍 on @jtlisi's feedback. I would also suggest keeping a well-defined boundary between the two, and using gRPC for the communication between Cortex (client) and the service provider.

yurishkuro · 2019-11-15T20:00:54Z

@jtlisi A gRPC service is exactly what the HashiCorp plugin framework provides. It just adds a bit of extra management so that the "plugin" binary can run as a child process using gRPC over a Unix socket. But the gRPC interface itself can also be used with remote processes.

KalanaDananjaya · 2019-12-17T11:50:33Z

Hi, I saw that this project is available under the Community Bridge mentorship. I would like to work on it if it is still available. Can you tell me its status, @gouthamve?

gouthamve · 2020-02-05T06:52:59Z

In the last community call, we discussed this and @tomwilkie raised a few objections to the proposal. I am trying to consolidate the discussion here and want to see how others feel about it:

See design doc here: https://docs.google.com/document/d/1avgFHm4bOlP_h692tlHudBaTFCXHSo4Gt1SE66as1aw/edit

Now, the idea is that people want a myriad of storage plugins, from MySQL to TiKV to Elasticsearch. But none of the maintainers will be using those stores, and it is in general not viable to expect the community to support them. A good example of this is the Prometheus Service Discovery solutions. Hence we thought that allowing them via gRPC-based plugins is a good idea.

But Tom made some good counterpoints against plugins:

The current interfaces are not the right ones and having plugins stops us from breaking them
We should not be a closed ecosystem but rather be more welcoming to everyone and just mark the components as experimental. If something is broken, we just remove it. Because it's marked experimental, removing it is not a breaking change that warrants a version bump.

While these points have merit, I don't think we should be super open. If we want to add a new feature, we need to add it to every backend. Doing that against a KV store would be easy, but it would also mean understanding and testing against Elastic. Likewise, a breaking interface change would mean changing potentially 5-6 storage backends.

"The current interfaces are not the right ones and having plugins stops us from breaking them"

Actually it doesn't. We mark the whole plugin interface as experimental and the plugin authors would need to change things in the plugins as the interfaces change. We need to be careful to call the changes out in the changelog and probably also reach out to the plugin authors. For example, I recently found out that Jaeger simply changed the interface.

Also, such changes are extremely rare, and I don't think they would happen frequently enough to be a problem. For example, the last change to the interfaces was over a year ago.

All in all, I think we should proceed with the idea of plugins. Further, I would even argue that a gRPC plugin is just another kind of storage, like Elasticsearch, MySQL, TiKV, etc. Some folks may want to use Cortex with an internal custom DB, and plugins would let them do that.

/cc @tomwilkie @VineethReddy02

pracucci · 2020-02-05T11:47:57Z

I'm in favour of the plugin system.

The plugin system lets external contributors iterate faster on support for new storage backends without waiting for Cortex maintainers to review it (and in some cases no Cortex maintainer may know the specific storage well enough to review it correctly).

The plugin system doesn't prevent us from merging a plugin upstream once it has stabilized and we see significant usage from the community.

Having a plugin system doesn't mean we're a closed ecosystem. To me, it's the exact opposite. We're an open one, and we're willing to put in the effort to make Cortex extensible to use cases beyond the ones we've seen so far.

I've personally worked with plugin-based systems in the past (e.g. Logstash for log collection) and successfully used niche features (built into third-party plugins) that would otherwise have had difficulty making it into the upstream version.

"The current interfaces are not the right ones and having plugins stops us from breaking them"

If so, then we should refactor them before introducing the plugin system.

stale · 2020-04-05T11:59:07Z

This issue has been automatically marked as stale because it has not had any activity in the past 60 days. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

gouthamve mentioned this issue Sep 18, 2019

Add support for Azure Blob Storage #1234

Closed

chancez mentioned this issue Sep 27, 2019

SQL index store for cortex #1419

Closed

VineethReddy02 mentioned this issue Mar 5, 2020

gRPC Storage Service #2220

Merged

3 tasks

stale bot added the stale label Apr 5, 2020

pracucci added the keepalive Skipped by stale bot label Apr 6, 2020

stale bot removed the stale label Apr 6, 2020

pracucci closed this as completed in #2220 May 29, 2020
