
Typed Bucket Replication Support #486

Closed
wants to merge 3 commits

Conversation

bowrocker
Contributor

Overview

Riak 2.0 brings with it typed buckets -- the ability to create a bucket type and associate a bucket with that type:

basho/riak#362

In order to successfully replicate a typed object from one cluster to another, the type definition must exist and be equal on both clusters. MDC replication must handle the possibility that the type of a given object exists on the replication source cluster, but not on the sink cluster (repl 2 terminology).

At this time, there is no facility to automatically create types across clusters -- for example, to create a type on the sink cluster where it does not yet exist -- so that replication can happen seamlessly. This could be an avenue to explore.

Additionally, replication must be extended to properly handle typed buckets. This work is assumed to come along with support for typed-bucket replication. "Default" type buckets (legacy buckets in Riak) continue to be automatically replicated as before.

This document discusses two possible options for implementing this in replication:

https://gist.github.com/bowrocker/7f0d9d6879493f1ac0e9

This PR implements the second option.

Type Checking on the Sink

This option allows typed buckets to be replicated to the sink cluster without any checking on the source. During do_repl_put on the sink, the bucket of the incoming object is checked. If the bucket has a type, that type is checked for existence on the sink; if it does not exist, the object is dropped, an error message is logged, and preferably an alarm of some sort is raised.

This option assumes that types should exist on both source and sink clusters, and that the non-existence of a type is an operational error that the user or CSE should take care of. Once the type is created, objects of that type will replicate correctly.
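The sink-side check described above can be sketched as follows. This is illustrative, not the actual riak_repl code: `check_type/2` is a hypothetical helper, and the `GetTypeProps` fun stands in for `riak_core_bucket_type:get/1`, which returns `undefined` for an unknown type.

```erlang
%% Sketch of the sink-side type check (illustrative, not actual riak_repl
%% code). GetTypeProps stands in for riak_core_bucket_type:get/1, which
%% returns 'undefined' when the bucket type does not exist on this cluster.
check_type({Type, _Bucket}, GetTypeProps) ->
    case GetTypeProps(Type) of
        undefined ->
            %% drop the object; the caller logs an error / raises an alarm
            {drop, {bucket_type_not_found, Type}};
        _Props ->
            ok
    end;
check_type(_DefaultBucket, _GetTypeProps) ->
    %% "default"-type (legacy) buckets replicate as before
    ok.
```

In this sketch, do_repl_put would call `check_type/2` first and only proceed with the put on an `ok` result.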

Pros
  • This is the simplest solution. Very few moving parts, very low impact on operations and performance.
Cons
  • Bandwidth, RTQ space, and processing are wasted when an object whose type does not exist on the sink cluster is replicated.
  • It rests on the assumption that all types should exist on both sides.

Configuration

No new user configuration is introduced. If the sink is not configured with the type of an object being replicated, an error message is printed. It is the responsibility of the user to provision this type on the sink.

Mixed version replication

A new replication protocol version is introduced for bucket type support: {2,1}. The following is supported for each version:

Source   Sink             Result
{2,1}    {2,1}            "default" and user-defined typed buckets replicated
{2,1}    {2,0} or lower   only "default" buckets replicated
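Under the version table above, the source can decide per object whether to forward it. A minimal sketch, assuming a hypothetical `should_send/2` helper, with typed buckets represented as `{Type, Name}` tuples, "default" buckets as plain binaries, and protocol versions as `{Major, Minor}` tuples compared with Erlang term order:

```erlang
%% Sketch: source-side filtering by negotiated protocol version
%% (illustrative; should_send/2 is not actual riak_repl code).
%% Typed buckets are {Type, Name} tuples; "default" buckets are binaries.
should_send({_Type, _Name}, ProtoVersion) when ProtoVersion >= {2, 1} ->
    true;   %% typed bucket and the peer speaks {2,1}: replicate
should_send({_Type, _Name}, _OlderVersion) ->
    false;  %% typed bucket but a {2,0}-or-older peer: skip
should_send(_DefaultBucket, _AnyVersion) ->
    true.   %% "default" buckets replicate to any version
```

Erlang compares tuples element-by-element, so `{2,1} >= {2,0}` holds and no custom version comparison is needed in this sketch.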

Dependencies

This PR depends on code in the following riak_kv branch being merged; otherwise the conditional post-commit hook will not work:

https://github.com/basho/riak_repl/tree/jdb-conditional-postcommit

Original PR

#477

Commits (for 2.0):
  • changed bucket type checking to use metadata to send bucket type properties and compare
  • fixed compile error
  • added backwardly-compatible metadata handling
  • took out commented line
case riak_object:bucket(Obj) of
{Type, _B} ->
    %% fetch the full property list for this bucket type
    AllProps = riak_core_bucket_type:get(Type),
    %% hash all properties except the claimant (a "blacklist" approach)
    PropsHash = erlang:phash2(proplists:delete(claimant, AllProps)),

this is probably why long term its better to have a "whitelist" of properties we include in the hash rather than a "blacklist" of ones we don't.

Contributor Author

Agreed -- we should discuss once we have an idea of what those properties are; then we can do it that way, and I'd greatly prefer that.


we sort of need to discuss soon don't we :)

Contributor Author

Yeah, but we can probably use what's here for now, then open an issue to fix it...


well i guess it depends. It's getting kind of late for the open-an-issue thing (that basically means it's a bug for 2.0 and a fix in 2.0.1 or 2.1). One immediate problem I see is if a user has properties that are only meaningful to them -- bucket (type) properties support arbitrary key/value pairs. By not limiting this list we open ourselves to all sorts of weirdness there.

I don't think it would be too bad to gather up a list of things that we should check. Previously repl did no checking, iirc, so it's kind of hard for us to get it wrong, isn't it? Off the top of my head, the ones that probably matter are:

  • consistent
  • datatype
  • n_val
  • allow_mult
  • last_write_wins

Those are the ones I can think of that affect how data is stored. We shouldn't require that users have the same default w or r values on both sides, etc.
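A whitelist hash along the lines proposed here could be sketched as follows. The property names are the ones listed above; `props_hash/1` itself is hypothetical, not actual riak_repl code:

```erlang
%% Sketch of a "whitelist" property hash (illustrative, not actual
%% riak_repl code). Only properties that affect how data is stored
%% contribute to the hash, so user-defined or tuning properties
%% (e.g. default w/r values) can differ between clusters.
-define(REPLICATED_PROPS, [allow_mult, consistent, datatype,
                           last_write_wins, n_val]).

props_hash(AllProps) ->
    %% sort for a stable ordering, then keep only whitelisted properties
    Relevant = [{K, V} || {K, V} <- lists:sort(AllProps),
                          lists:member(K, ?REPLICATED_PROPS)],
    erlang:phash2(Relevant).
```

With this shape, two clusters agree on the hash as long as the whitelisted properties match, regardless of any extra key/value pairs a user has set.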

@jrwest

jrwest commented Dec 13, 2013

I also wonder if we should expose the property hashes to the user, so that when things aren't replicating they can check whether the properties in fact match on both sides.

@bowrocker
Contributor Author

Closing -- new PR here:

#490

@bowrocker bowrocker closed this Dec 16, 2013
@seancribbs seancribbs deleted the feature/jra/repl_bucket_types_1 branch April 1, 2015 23:47