This repository was archived by the owner on Jun 20, 2024. It is now read-only.

Operational guide #1978

Merged
merged 1 commit into from
Jun 3, 2016

Conversation

awh
Contributor

@awh awh commented Feb 17, 2016

An operational guide that describes likely deployment scenarios and administrative tasks.

Fixes #726, fixes #1102

@awh awh added this to the 1.5.0 milestone Feb 17, 2016
@rade
Member

rade commented Feb 17, 2016

cross-DC deployments may exhibit some additional patterns

@awh awh force-pushed the weave-1.5-doc-updates branch from e4d70ed to 03d5177 Compare February 17, 2016 18:18
@awh awh changed the title Documentation updates for 1.5 release [WIP] Documentation updates for 1.5 release Feb 18, 2016
@awh awh self-assigned this Feb 18, 2016
@awh awh force-pushed the weave-1.5-doc-updates branch 2 times, most recently from 99053de to f4ff5a2 Compare March 9, 2016 17:44
@rade
Member

rade commented Mar 10, 2016

  • Ideally we'd try to avoid mentioning the notion of 'leader', since it can easily leave readers with the wrong impression that weave funnels some of its operations through a single elected node
  • we may want to mention weave stop earlier, in the Removing a peer section, to explain that it is not the same as weave reset (and serves a different purpose)
  • in the 'Removing a Peer' sections, it may be worth mentioning that weave forget isn't necessary for correctness.
  • s/Removing a Node/Removing a Peer/
  • 'Recovering Lost IPAM Space' - we should explain how to discover that space may have been lost
  • 'Recovering Lost IPAM Space' - it would be better to rename, and perhaps split, this section so the title refers to something that has happened / some state, e.g. "peer failure", "removing a peer while partitioned".
  • weave status during bootstrapping a uniform fixed cluster - that's rather awkward to automate.

It does rather strike me that we'd benefit from connect and forget propagating through the cluster.

@awh
Contributor Author

awh commented Mar 10, 2016

Discuss clock skew

@awh awh force-pushed the weave-1.5-doc-updates branch from 986f0c1 to 2ff9269 Compare March 16, 2016 14:32
> Author's Note: the rationale for `weave status` is to introduce an
> opportunity for the user to resolve any initial connectivity
> problems before consensus - this ensures the ring is divided as
> evenly as possible, delaying the need for donations
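An illustrative sketch of the point made in the note above: the more peers that are connected at the moment the address range is first divided, the more evenly it is shared out. This is not Weave's allocator; the function name `split_evenly`, the peer names and the example range are all hypothetical and chosen only for illustration.

```python
import ipaddress


def split_evenly(cidr, peers):
    """Divide the addresses in `cidr` into contiguous, near-equal slices.

    Hypothetical helper for illustration only; it is not how Weave
    divides its ring. Returns {peer: (first_address, last_address)}.
    """
    if not peers:
        raise ValueError("at least one peer is required")
    net = ipaddress.ip_network(cidr)
    # Each peer gets `base` addresses; the first `extra` peers get one more.
    base, extra = divmod(net.num_addresses, len(peers))
    shares = {}
    start = int(net.network_address)
    for i, peer in enumerate(sorted(peers)):
        size = base + (1 if i < extra else 0)
        first = ipaddress.ip_address(start)
        last = ipaddress.ip_address(start + size - 1)
        shares[peer] = (str(first), str(last))
        start += size
    return shares


# All four peers connected when the range is divided: each owns a quarter.
print(split_evenly("10.0.0.0/24", ["a", "b", "c", "d"]))
# Only three connected: the fourth owns nothing and must ask for space later.
print(split_evenly("10.0.0.0/24", ["a", "b", "c"]))
```

In the second call the missing peer starts with no space of its own, which is the situation the connectivity check before consensus is meant to make less likely.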

@awh
Contributor Author

awh commented Apr 11, 2016

> Ideally we'd try to avoid mentioning the notion of 'leader', since it can easily leave readers with the wrong impression that weave funnels some of its operations through a single elected node

This is hard!

> we may want to mention weave stop earlier, in the Removing a peer section, to explain that it is not the same as weave reset (and serves a different purpose)

I have tackled this by adding a 'Stopping a Peer' section before 'Removing a Peer' - see what you think.

> in the 'Removing a Peer' sections, it may be worth mentioning that weave forget isn't necessary for correctness.

Amended.

> s/Removing a Node/Removing a Peer/

Amended.

> 'Recovering Lost IPAM Space' - we should explain how to discover that space may have been lost
> 'Recovering Lost IPAM Space' - it would be better to rename, and perhaps split, this section so the title refers to something that has happened / some state, e.g. "peer failure", "removing a peer while partitioned".

I've had a stab at this - see what you think.

> weave status during bootstrapping a uniform fixed cluster - that's rather awkward to automate.

Removed.

> It does rather strike me that we'd benefit from connect and forget propagating through the cluster.

@awh awh force-pushed the weave-1.5-doc-updates branch from 2ff9269 to 1733fbc Compare April 11, 2016 16:06
@awh
Contributor Author

awh commented Apr 11, 2016

@bboreham unfortunately they have been collapsed due to out-of-date diffs, but I have responded to all your comments!


A network partition is a transient condition whereby some arbitrary
subsets of peers are unable to communicate with each other for the
duration - perhaps because a router has failed, or a fibre optic line

@awh awh mentioned this pull request Apr 12, 2016
@awh awh changed the title [WIP] Documentation updates for 1.5 release Documentation updates for 1.5 release Apr 12, 2016
@awh awh changed the title Documentation updates for 1.5 release Operational guide for 1.5 release Apr 12, 2016
@awh
Contributor Author

awh commented Apr 12, 2016

I think this is approaching MVP status! There are a number of author's notes left in at this stage, but some of them contain useful explanatory material that I am loath to remove; I am minded to keep some of them in as hidden inline comments somehow, perhaps polishing others for external consumption as quoted blocks. Thoughts?

based bootstrapping requires each peer to be told the total number of
expected peers (the 'initial peer count') in order to prevent small
independent groups of peers from electing different leaders under
conditions of partition.
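A minimal sketch of why knowing the expected total helps, assuming a simple majority rule (the rule itself is an assumption made for illustration, as are the names `quorum` and `can_bootstrap`; neither is Weave's code or API): a group of peers may proceed only if it can see more than half of the expected peers, so two disjoint groups can never both proceed.

```python
def quorum(initial_peer_count):
    """Smallest group size that is a strict majority of the expected peers.

    Assumption for illustration: a simple majority rule.
    """
    return initial_peer_count // 2 + 1


def can_bootstrap(reachable_peers, initial_peer_count):
    """True if a group of `reachable_peers` (counting the local peer) may proceed."""
    return reachable_peers >= quorum(initial_peer_count)


# Five expected peers, partitioned into groups of three and two:
print(can_bootstrap(3, 5))  # the larger group holds a majority
print(can_bootstrap(2, 5))  # the smaller group must wait
```

Because two disjoint groups cannot both contain more than half of the expected peers, at most one side of a partition can go ahead. Without the expected total, each side would have no way of knowing that it is only a fragment of the cluster.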

@awh awh changed the title Operational guide for 1.5 release Operational guide Apr 18, 2016

## Removing a Peer

On peer to be removed:

@awh awh force-pushed the weave-1.5-doc-updates branch from c23bab8 to 55095b3 Compare May 31, 2016 16:32
@awh awh removed their assignment May 31, 2016
partition, you can reclaim their space by following the advice in the
next section.

## Reclaiming Lost IP Address Space

@abuehrle
Contributor

Would you like me to edit these files before you release them?

@awh
Contributor Author

awh commented Jun 1, 2016

> Would you like me to edit these files before you release them?

@abuehrle absolutely yes - the technical aspects of the content are nearly finalised now - will ping you again shortly.

restarts, it is essential that it is unique - if two or more peers
share the same name, chaos will ensue, including but not limited to
double allocation of addresses and inability to route packets on the
overlay network. Consequently, when the router is launched on a host it

@awh awh force-pushed the weave-1.5-doc-updates branch from 61fef19 to 7533754 Compare June 3, 2016 12:57
@bboreham
Contributor

bboreham commented Jun 3, 2016

Is this ready for @abuehrle to edit now?

@rade
Member

rade commented Jun 3, 2016

> Is this ready for @abuehrle to edit now?

The plan is:

  1. merge Peer list persistence #2305 (this PR assumes that is in place)
  2. merge this
  3. @abuehrle can edit


## Bootstrap

On each initial peer, at boot, via

@awh awh force-pushed the weave-1.5-doc-updates branch from 7533754 to e092a51 Compare June 3, 2016 16:14
@awh
Contributor Author

awh commented Jun 3, 2016

@bboreham addressed your most recent comments - PTAL

@bboreham bboreham merged commit 5dd7590 into master Jun 3, 2016
@awh awh deleted the weave-1.5-doc-updates branch June 6, 2016 15:55
Development

Successfully merging this pull request may close these issues.

  • Provide upgrade guidance
  • [docs] Deployment / reboot docs could be improved
5 participants