Basho/merge forward 2.0to2.1 from gb8cfd20 #707

Merged: 42 commits into 2.1 from basho/merge-forward-2.0to2.1-from-gb8cfd20 on Aug 7, 2015

Conversation

jonmeredith
Contributor

Pick up AAE fullsync bucket type fixes.

macintux and others added 30 commits August 19, 2014 10:30
…t were unintentionally attributed to the first function
Various fixes for edoc

Reviewed-by: reiddraper
Problem: transient failures of aae, such as trees not yet built or locks not
being acquired, would cause an aae fullsync process to exit abnormally. This
could happen several times in a row, creating log spam.

Resolution: the concept of soft_exit. A soft_exit is a message sent from a
soon-to-be-exiting process to a soft_linked process. The exiting process then
exits normally, while any soft_linked process can handle the soft_exit
message in much the same way as an exit message. This indicates an exit
reason that should be handled, but that is not serious enough for the system
logger to know about it.

The soft_exit message sent from the aae worker to the fscoordinator is
as simple as `{soft_exit, pid(), term()}'.
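
The soft_exit handshake described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the module name, `worker/1`, `demo/0`, and the `{transient, tree_not_built}` reason are all assumptions made for the example; only the `{soft_exit, pid(), term()}` message shape comes from the commit message.

```erlang
-module(soft_exit_sketch).
-export([demo/0]).

%% Hypothetical worker: reports a transient failure as a plain message,
%% then exits with reason `normal' so no crash report is logged.
worker(Coordinator) ->
    Coordinator ! {soft_exit, self(), {transient, tree_not_built}},
    exit(normal).

%% Hypothetical coordinator side: traps exits and treats a soft_exit
%% message much like an 'EXIT' message, but as a retryable condition.
demo() ->
    Old = process_flag(trap_exit, true),
    Self = self(),
    Pid = spawn_link(fun() -> worker(Self) end),
    Result =
        receive
            {soft_exit, Pid, Reason} ->
                %% The worker exits normally right after sending;
                %% consume that 'EXIT' so it does not linger in the mailbox.
                receive {'EXIT', Pid, normal} -> ok end,
                {retry, Reason};
            {'EXIT', Pid, Reason} ->
                %% A real exit arrived without a preceding soft_exit.
                {failed, Reason}
        end,
    process_flag(trap_exit, Old),
    Result.
```

Because signals between one pair of Erlang processes arrive in the order they were sent, the soft_exit message reaches the coordinator before the worker's normal exit signal does.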

The current implementation is not generic. There can only be one soft_link to
the aae, and there's no general mechanism to use soft_links or soft_exits
elsewhere in the code base. Sorry.

Another change rolled into this is consistent use of a #partition_info record
in the fscoordinator, and error tracking in the fscoordinator's state. Swapping
to a single data structure in the partition queue, whereis waiting list,
and purgatory queues makes the fscoordinator easier to understand (as
there is less code to modify structures).
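
To illustrate why one record shared by every queue reduces conversion code, here is a small sketch. The field names (`index`, `node`, `retry_exits`, `error_exits`) are assumptions for the example and are not taken from the actual `#partition_info` definition.

```erlang
-module(partition_info_sketch).
-export([demo/0]).

%% Hypothetical fields; the real record in riak_repl2_fscoordinator may differ.
-record(partition_info, {
    index,              %% partition index
    node,               %% node currently believed to own the partition
    retry_exits = 0,    %% times this partition was retried
    error_exits = 0     %% times this partition failed outright
}).

demo() ->
    P = #partition_info{index = 0, node = node()},
    %% The partition queue holds records...
    Queue0 = queue:in(P, queue:new()),
    {{value, Next}, Queue1} = queue:out(Queue0),
    %% ...and the waiting list holds the same record type, so moving an
    %% entry between the two needs a field update, not a reshaping step.
    Retries = Next#partition_info.retry_exits + 1,
    Waiting = [Next#partition_info{retry_exits = Retries}],
    {queue:is_empty(Queue1), Waiting}.
```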

This is a forward port of the fix done for 1.4. Conflicts favor existing code
where it does not directly affect the fix.

Conflicts:
	Makefile
	rebar.config
	src/riak_repl2_fssource.erl
	src/riak_repl2_rtq_proxy.erl
	src/riak_repl_aae_source.erl
	test/riak_core_cluster_mgr_tests.erl
Increment_error_dict expects the partition, elementN of error dict, and the
state. It pulls the dict out of the state so it can put it back in place, thus
just returning the state. So this call that passed the dict in was wrong.
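
The calling convention described above might look like the sketch below. The `#state{}` record, the `{RetryExits, ErrorExits}` tuple layout of the dict values, and the default of `{0, 0}` are assumptions for illustration; only the argument order (partition, element number, state) and the "returns the state" behavior come from the commit message.

```erlang
-module(error_dict_sketch).
-export([demo/0]).

%% Hypothetical coordinator state holding only the error dict.
-record(state, {error_dict = dict:new()}).

%% Takes the whole state, not the dict: the dict is pulled out,
%% updated, and stored back, and the new state is returned.
increment_error_dict(Partition, ElementN, #state{error_dict = Dict0} = State) ->
    Counts0 = case dict:find(Partition, Dict0) of
                  {ok, Found} -> Found;
                  error -> {0, 0}   %% assumed shape: {RetryExits, ErrorExits}
              end,
    Counts = setelement(ElementN, Counts0, element(ElementN, Counts0) + 1),
    State#state{error_dict = dict:store(Partition, Counts, Dict0)}.

demo() ->
    S1 = increment_error_dict(42, 1, #state{}),
    S2 = increment_error_dict(42, 1, S1),
    dict:fetch(42, S2#state.error_dict).
```

Passing the bare dict as the third argument would fail the `#state{}` pattern match, which is the kind of mistake the commit corrects.
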
When a partition is not available, perhaps after a number of retries,
the error exits stat should be incremented. Also, the retry exits stat
should be incremented on each retry.  This was discovered when
backporting the repl_location_failures riak_test.
The one in riak_repl2_fssource is a legit bug in the code
…nsient-aae-fs-failures

Implement soft_exit, primarily for aae_fullsync.

Reviewed-by: engelsanchez
Conflicts:
	dialyzer.ignore-warnings
	rebar.config
Conflicts:
	src/riak_repl2_fscoordinator.erl
Develop 2.0 merge.

Reviewed-by: seancribbs
Add support for Erlang 17.

Reviewed-by: andrewjstone
…y. Fixes function clause issue found in riak_test testing (BTA-202).
…y handling DOWN/normal messages in receive
- riak_repl2_fscoordinator tried to cache the owners of partitions, but if a vnode was handed off during the fullsync run, that partition would never be transferred because the coordinator kept trying the old node.
…andoff_occurs

Fix deadlocks when handoff occurs during fullsync, and function clause mismatch in wait_keylist

Reviewed-by: seancribbs
Add msg_q length of rtsink_helper to status

Reviewed-by: bsparrow435
kuenishi and others added 12 commits May 13, 2015 12:26
Improve CS replication on blocks
Update the bucket properties hash to take the write_once bucket property
into consideration; additionally, remove the allow_mult and
last_write_wins restrictions.
Update bucket properties hash for write_once.

Reviewed-by: jonmeredith
Revert "Update bucket properties hash for write_once."

Reviewed-by: jonmeredith
Forward merge 2.0.6

Reviewed-by: JeetKunDoug
Remove extraneous noreply from riak_repl2_rtsource_conn:reconnect

Reviewed-by: lordnull
@jonmeredith
Contributor Author

+1 c388b2d

considering it part of same review as merge forward.

borshop added a commit that referenced this pull request Aug 7, 2015
…gb8cfd20

Basho/merge forward 2.0to2.1 from gb8cfd20

Reviewed-by: jonmeredith
@jonmeredith
Contributor Author

@borshop merge

@borshop borshop merged commit c388b2d into 2.1 Aug 7, 2015
@jonmeredith jonmeredith deleted the basho/merge-forward-2.0to2.1-from-gb8cfd20 branch August 7, 2015 16:35
10 participants