Support joining nodes to an existing cluster #3372

Merged: 32 commits, Jul 23, 2015
abfd438
Implement remote meta data fetch
jwilder Jul 14, 2015
e06f6f4
Add ability to join an existing cluster
jwilder Jul 15, 2015
3f90891
Proxy raft and rpc connections to leader transparently
jwilder Jul 15, 2015
9db3a77
Assign and store a node ID when joining an existing cluster
jwilder Jul 15, 2015
cb718b9
Only allow adding non-raft nodes for now
jwilder Jul 16, 2015
43f0407
Ensure a node can join an existing single node raft cluster
jwilder Jul 16, 2015
5486d3e
Move invalidate to raft state
jwilder Jul 16, 2015
ad8948b
Fix up rpc error handling
jwilder Jul 16, 2015
a7fa5eb
Propagate metadata changes from raft nodes to non-raft nodes
jwilder Jul 16, 2015
f3fcfeb
Make raftState interface private
jwilder Jul 17, 2015
5ea8342
Move raft state to separate file
jwilder Jul 17, 2015
fb8a4db
Move raft closing to localRaft state
jwilder Jul 17, 2015
33730da
Move isLeader to raft state
jwilder Jul 17, 2015
9e43397
Move leaderCh() to raft state
jwilder Jul 17, 2015
b86fecf
Move setPeers to raft state
jwilder Jul 17, 2015
80248f9
Remote leaderCh
jwilder Jul 17, 2015
72e2e1a
Move addPeer to raft state
jwilder Jul 17, 2015
17a9bb0
Remove raftEnabled func
jwilder Jul 17, 2015
a9314d6
Move raft index to raft state
jwilder Jul 17, 2015
54e1165
Move apply to raft state
jwilder Jul 17, 2015
790733d
Move snapshot to raft state
jwilder Jul 17, 2015
85db9c4
Move remaining raft impl details to local raft state
jwilder Jul 17, 2015
29b11a2
Support multiple comma-separated join addresses
jwilder Jul 17, 2015
c1fc83e
Make join private so it does not show up in config command
jwilder Jul 17, 2015
84a8d7d
Add cluster-tracing option to meta config
jwilder Jul 17, 2015
b78ac4b
Add RPC tests
jwilder Jul 20, 2015
29011c5
Code review fixes
jwilder Jul 20, 2015
eb7d181
Fix race in test code
jwilder Jul 20, 2015
47b8de7
Hide Meta.Join from config command using toml skip annotation
jwilder Jul 22, 2015
e904416
Invalidate raft member by fetching from leader
jwilder Jul 22, 2015
9dd66fa
Make meta RPC private
jwilder Jul 23, 2015
3c308e3
Update changelog
jwilder Jul 23, 2015
Add cluster-tracing option to meta config
Useful for troubleshooting but too verbose for regular use.
jwilder committed Jul 23, 2015

commit 84a8d7d24be548c818becd0263bf64faa48eb89d
4 changes: 2 additions & 2 deletions cmd/influxd/restore/restore.go
@@ -229,8 +229,8 @@ restore uses a snapshot of a data node to rebuild a cluster.

// Config represents a partial config for rebuilding the server.
type Config struct {
- Meta meta.Config `toml:"meta"`
- Data tsdb.Config `toml:"data"`
+ Meta *meta.Config `toml:"meta"`

Contributor:
Perhaps it will become apparent later, but why this change?

Contributor Author:
Needed to set a private join value on the config if it's specified via the command-line.

+ Data tsdb.Config `toml:"data"`
}

type nopListener struct {
2 changes: 1 addition & 1 deletion cmd/influxd/run/config.go
@@ -24,7 +24,7 @@ import (

// Config represents the configuration format for the influxd binary.
type Config struct {
- Meta meta.Config `toml:"meta"`
+ Meta *meta.Config `toml:"meta"`
Data tsdb.Config `toml:"data"`
Cluster cluster.Config `toml:"cluster"`
Retention retention.Config `toml:"retention"`
7 changes: 4 additions & 3 deletions meta/config.go
@@ -37,17 +37,18 @@ type Config struct {
HeartbeatTimeout toml.Duration `toml:"heartbeat-timeout"`
LeaderLeaseTimeout toml.Duration `toml:"leader-lease-timeout"`
CommitTimeout toml.Duration `toml:"commit-timeout"`
+ ClusterTracing bool `toml:"cluster-tracing"`

// The join command-line argument
join string

Contributor:
Why is this unexported?

Contributor Author:
It's set as a command-line arg. If it's public, influxd config lists it as a config option, which is not valid.

Contributor:
Can you do toml:"-"?

Contributor Author:
I'll try. But the Config still needs to be mutable.

}

- func (c Config) SetJoin(join string) {
+ func (c *Config) SetJoin(join string) {
c.join = join
}

- func NewConfig() Config {
- return Config{
+ func NewConfig() *Config {
+ return &Config{

Contributor:
Why return a pointer?

Contributor Author:
The config needs to be modified if -join is specified now.

Hostname: DefaultHostname,
BindAddress: DefaultBindAddress,
RetentionAutoCreate: true,
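
The `SetJoin` change above swaps a value receiver for a pointer receiver. The sketch below shows why that matters, using a hypothetical trimmed-down `Config` with only the `join` field; `setJoinByValue` is an illustrative name for the old behavior, not a function from this PR.

```go
package main

import "fmt"

// Config is a hypothetical, trimmed-down stand-in for meta.Config.
type Config struct {
	join string
}

// setJoinByValue has a value receiver: it assigns to a copy of the Config,
// so the caller's struct is left unchanged.
func (c Config) setJoinByValue(join string) { c.join = join }

// SetJoin has a pointer receiver: it assigns through the pointer,
// so the caller's struct is updated.
func (c *Config) SetJoin(join string) { c.join = join }

func main() {
	var c Config

	c.setJoinByValue("hostA:8088")
	fmt.Printf("after value receiver:   %q\n", c.join)

	c.SetJoin("hostA:8088")
	fmt.Printf("after pointer receiver: %q\n", c.join)
}
```

With the value receiver the assignment is lost when the method returns, which is consistent with the author's remark that the config "needs to be mutable".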
17 changes: 13 additions & 4 deletions meta/rpc.go
@@ -16,7 +16,8 @@ import (

// RPC handles request/response style messaging between cluster nodes
type RPC struct {
- Logger *log.Logger
+ Logger *log.Logger
+ tracingEnabled bool

store interface {
cachedData() *Data
@@ -66,7 +67,8 @@ func (r *RPC) handleRPCConn(conn net.Conn) {
// RPC connections should execute on the leader. If we are not the leader,
// proxy the connection to the leader so that clients can connect to any node
// in the cluster.
- r.Logger.Printf("rpc connection from: %v", conn.RemoteAddr())
+ r.traceCluster("rpc connection from: %v", conn.RemoteAddr())
+
if !r.store.IsLeader() {
r.proxyLeader(conn.(*net.TCPConn))
return
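
The comment in this hunk describes proxying a connection to the leader so that clients can connect to any node. The body of `proxyLeader` is not part of this diff, so the following is only a sketch of the general technique: copy bytes in both directions between the client connection and a new connection to the leader. `proxyTo`, `serve`, and `roundTrip` are hypothetical helpers, and the "leader" here is a stand-in that replies with a fixed string.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// proxyTo forwards bytes between conn and a new connection to leaderAddr.
// It returns once either direction finishes; the deferred Closes then
// unblock the other copy. Hypothetical helper, not the PR's proxyLeader.
func proxyTo(conn net.Conn, leaderAddr string) error {
	defer conn.Close()
	leader, err := net.Dial("tcp", leaderAddr)
	if err != nil {
		return err
	}
	defer leader.Close()

	done := make(chan struct{}, 2)
	go func() { io.Copy(leader, conn); done <- struct{}{} }()
	go func() { io.Copy(conn, leader); done <- struct{}{} }()
	<-done
	return nil
}

// serve accepts connections on ln and handles each in its own goroutine.
func serve(ln net.Listener, handle func(net.Conn)) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go handle(conn)
	}
}

// roundTrip starts a stand-in leader and a follower that proxies to it,
// sends msg to the follower, and returns the leader's reply.
func roundTrip(msg string) (string, error) {
	leader, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer leader.Close()
	go serve(leader, func(c net.Conn) {
		defer c.Close()
		buf := make([]byte, len(msg))
		if _, err := io.ReadFull(c, buf); err == nil {
			fmt.Fprintf(c, "leader got %s", buf)
		}
	})

	follower, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer follower.Close()
	go serve(follower, func(c net.Conn) { proxyTo(c, leader.Addr().String()) })

	client, err := net.Dial("tcp", follower.Addr().String())
	if err != nil {
		return "", err
	}
	defer client.Close()
	if _, err := io.WriteString(client, msg); err != nil {
		return "", err
	}
	reply, err := io.ReadAll(client)
	return string(reply), err
}

func main() {
	reply, err := roundTrip("ping")
	fmt.Println(reply, err)
}
```

The client only ever talks to the follower's address, yet the reply comes from the leader. A production proxy would also need timeouts and handling for clients that half-close the connection, which this sketch leaves out.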
@@ -90,7 +92,7 @@ func (r *RPC) handleRPCConn(conn net.Conn) {
rpcType := internal.RPCType(btou64(buf[0:8]))
buf = buf[8:]

- r.Logger.Printf("recv %v request from: %v", rpcType, conn.RemoteAddr())
+ r.traceCluster("recv %v request on: %v", rpcType, conn.RemoteAddr())
switch rpcType {
case internal.RPCType_FetchData:
var req internal.FetchDataRequest
@@ -147,6 +149,7 @@ func (r *RPC) sendResponse(conn net.Conn, typ internal.RPCType, resp proto.Messa
}

func (r *RPC) sendError(conn net.Conn, msg string) {
+ r.traceCluster(msg)
resp := &internal.ErrorResponse{
Header: &internal.ResponseHeader{
OK: proto.Bool(false),
@@ -195,7 +198,7 @@ func (r *RPC) handleFetchData(req *internal.FetchDataRequest) (*internal.FetchDa

// handleJoinRequest handles a request to join the cluster
func (r *RPC) handleJoinRequest(req *internal.JoinRequest) (*internal.JoinResponse, error) {
- r.Logger.Printf("recv join request from: %v", *req.Addr)
+ r.traceCluster("join request from: %v", *req.Addr)

node, err := func() (*NodeInfo, error) {
// attempt to create the node
@@ -401,6 +404,12 @@ func (r *RPC) call(dest string, req proto.Message) (proto.Message, error) {
return resp, nil
}

+ func (r *RPC) traceCluster(msg string, args ...interface{}) {
+ if r.tracingEnabled {
+ r.Logger.Printf("rpc error: "+msg, args...)
+ }
+ }
+
func u64tob(v uint64) []byte {
b := make([]byte, 8)
binary.BigEndian.PutUint64(b, v)
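
`handleRPCConn` reads the RPC type from the first 8 bytes of the buffer with `btou64(buf[0:8])` and treats the rest as the message body. The sketch below shows that framing end to end. The return statement of `u64tob` and the body of `btou64` are not visible in this diff and are assumed here; `frame` and `unframe` are hypothetical helpers added for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// u64tob encodes v as 8 big-endian bytes.
func u64tob(v uint64) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, v)
	return b
}

// btou64 decodes the first 8 bytes of b as a big-endian uint64.
// Assumed implementation: the diff only shows the call site.
func btou64(b []byte) uint64 {
	return binary.BigEndian.Uint64(b)
}

// frame prepends the 8-byte type header to payload (hypothetical helper).
func frame(typ uint64, payload []byte) []byte {
	return append(u64tob(typ), payload...)
}

// unframe splits a message into its type and payload (hypothetical helper).
func unframe(buf []byte) (uint64, []byte, error) {
	if len(buf) < 8 {
		return 0, nil, errors.New("message shorter than 8-byte type header")
	}
	return btou64(buf[0:8]), buf[8:], nil
}

func main() {
	msg := frame(2, []byte("payload"))
	typ, body, err := unframe(msg)
	fmt.Println(typ, string(body), err)
}
```

The length check in `unframe` matters because `binary.BigEndian.Uint64` panics on a slice shorter than 8 bytes.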
14 changes: 10 additions & 4 deletions meta/store.go
@@ -82,6 +82,10 @@ type Store struct {
wg sync.WaitGroup
changed chan struct{}

+ // clusterTracingEnabled controls whether low-level cluster communication is logged.
+ // Useful for troubleshooting
+ clusterTracingEnabled bool
+
retentionAutoCreate bool

// The listeners to accept raft and remote exec connections from.
@@ -123,7 +127,7 @@ type authUser struct {
}

// NewStore returns a new instance of Store.
- func NewStore(c Config) *Store {
+ func NewStore(c *Config) *Store {
s := &Store{
path: c.Dir,
peers: c.Peers,
@@ -135,7 +139,8 @@ func NewStore(c Config) *Store {
closing: make(chan struct{}),
changed: make(chan struct{}),

- retentionAutoCreate: c.RetentionAutoCreate,
+ clusterTracingEnabled: c.ClusterTracing,
+ retentionAutoCreate: c.RetentionAutoCreate,

HeartbeatTimeout: time.Duration(c.HeartbeatTimeout),
ElectionTimeout: time.Duration(c.ElectionTimeout),
@@ -150,8 +155,9 @@ func NewStore(c Config) *Store {

s.raftState = &localRaft{store: s}
s.rpc = &RPC{
- store: s,
- Logger: s.Logger,
+ store: s,
+ tracingEnabled: c.ClusterTracing,
+ Logger: s.Logger,
}
return s
}
6 changes: 3 additions & 3 deletions meta/store_test.go
@@ -817,7 +817,7 @@ type Store struct {
}

// NewStore returns a new test wrapper for Store.
- func NewStore(c meta.Config) *Store {
+ func NewStore(c *meta.Config) *Store {
s := &Store{
Store: meta.NewStore(c),
}
@@ -881,8 +881,8 @@ func (s *Store) Close() error {
}

// NewConfig returns the default test configuration.
- func NewConfig(path string) meta.Config {
- return meta.Config{
+ func NewConfig(path string) *meta.Config {
+ return &meta.Config{
Dir: path,
Hostname: "localhost",
BindAddress: "127.0.0.1:0",