Backup & Restore #2053

Merged
merged 8 commits into from
Mar 25, 2015

benbjohnson
Contributor

Overview

This pull request adds the ability to snapshot a single data node at a point in time and restore it.

Usage

While a data node is running, you can create a hot backup to a snapshot file (mysnapshot):

$ influxd backup mysnapshot

By default, this can only be run from the data node itself. See configuration options below to snapshot from another machine.

Once you have your snapshot file, you can copy it to another machine and restore it:

$ influxd restore -config influxdb.conf mysnapshot

This command will remove the broker and data directories listed in the configuration file provided and replace them with the data in the snapshot. Once the restore is complete, you can start the influxd server normally.

Configuration Options

A configuration section has been added for the snapshot handler with the following defaults:

[snapshot]
bind-address = "127.0.0.1"
port = 8087

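The default bind address restricts snapshots so that they can only be requested from the local machine. To snapshot from another machine, change bind-address to an address that machine can reach.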
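
As a rough illustration of what these defaults imply, the sketch below joins the two settings into a listen address and checks whether the bind address is a loopback address. The helper names `snapshotAddr` and `isLocalOnly` are hypothetical and are not part of this pull request.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// snapshotAddr joins a bind address and port into a host:port string.
// Hypothetical helper, not taken from the PR.
func snapshotAddr(bindAddress string, port int) string {
	return net.JoinHostPort(bindAddress, strconv.Itoa(port))
}

// isLocalOnly reports whether bindAddress is a loopback IP, meaning a
// listener bound to it only accepts connections from the same machine.
func isLocalOnly(bindAddress string) bool {
	ip := net.ParseIP(bindAddress)
	return ip != nil && ip.IsLoopback()
}

func main() {
	// The defaults from the [snapshot] section.
	fmt.Println(snapshotAddr("127.0.0.1", 8087), isLocalOnly("127.0.0.1"))
	// A wildcard address would expose the handler to other machines.
	fmt.Println(snapshotAddr("0.0.0.0", 8087), isLocalOnly("0.0.0.0"))
}
```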

API

The following are the primary API additions:

Snapshots

// Snapshot represents the state of the Server at a given time.
type Snapshot struct {
    Files []SnapshotFile `json:"files"`
}

// SnapshotFile represents a single file in a Snapshot.
type SnapshotFile struct {
    Name  string `json:"name"`  // filename
    Size  int64  `json:"size"`  // file size
    Index uint64 `json:"index"` // highest index applied
}
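
The JSON tags on these types suggest the manifest is serialized as JSON. The following is a minimal sketch of that encoding, assuming plain `encoding/json` marshalling. The types are copied locally so the example compiles on its own, and the helper `encodeManifest` and the file names `meta` and `shards/1` are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Snapshot and SnapshotFile are local copies of the types shown above.
type Snapshot struct {
	Files []SnapshotFile `json:"files"`
}

type SnapshotFile struct {
	Name  string `json:"name"`  // filename
	Size  int64  `json:"size"`  // file size
	Index uint64 `json:"index"` // highest index applied
}

// encodeManifest marshals a snapshot to JSON.
// Hypothetical helper, not part of the PR's API.
func encodeManifest(s *Snapshot) ([]byte, error) {
	return json.Marshal(s)
}

func main() {
	s := &Snapshot{Files: []SnapshotFile{
		{Name: "meta", Size: 4096, Index: 12},
		{Name: "shards/1", Size: 8192, Index: 10},
	}}
	b, err := encodeManifest(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```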

Snapshot Reader

// SnapshotReader reads a snapshot from a Reader.
type SnapshotReader struct

// Snapshot returns the snapshot metadata.
func (sr *SnapshotReader) Snapshot() (*Snapshot, error)

// Next returns the next file in the snapshot.
func (sr *SnapshotReader) Next() (SnapshotFile, error)

// Read reads the current entry in the snapshot.
func (sr *SnapshotReader) Read(b []byte) (n int, err error)

Snapshot Writer

// SnapshotWriter writes a snapshot and the underlying files to disk as a tar archive.
type SnapshotWriter struct {
    // The snapshot to write from.
    // Removing files from the snapshot after creation will cause those files to be ignored.
    Snapshot *Snapshot

    // Writers for each file by filename.
    // Writers will be closed as they're processed and will close by the end of WriteTo().
    FileWriters map[string]SnapshotFileWriter
}

// WriteTo writes the snapshot to the writer.
// File writers are closed as they are written.
// This function will always return n == 0.
func (sw *SnapshotWriter) WriteTo(w io.Writer) (n int64, err error)

// SnapshotFileWriter is the interface used for writing a file to a snapshot.
type SnapshotFileWriter interface {
    io.WriterTo
    io.Closer
}

Server Snapshot Writer Creation

// CreateSnapshotWriter returns a writer for the current snapshot.
func (s *Server) CreateSnapshotWriter() (*SnapshotWriter, error)

Implementation

The snapshot file is a single tar archive that contains, in order, a manifest file, the data node's meta file, and then all shard files. The metastore and shards all use Bolt, so each file is a point-in-time copy of the database as of when the backup was initiated.

The broker node is not backed up because its state can be materialized from the data in the data node. The restore command generates a broker metastore based on the highest index in the data node and generates a raft configuration based on the InfluxDB config passed in.

Caveats

This approach currently works only in clusters where the replication factor equals the number of nodes in the cluster. A cluster-wide backup and restore will be done in the future.

TODO

  • Incremental backup

Fixes: #1468 #1947

This commit adds the backup command to the influxd binary as well as
implements a SnapshotWriter in the influxdb package.

By default the snapshot handler binds to 127.0.0.1 so it cannot be
accessed outside of the local machine.

This commit adds the "influxd restore" command to the CLI. This allows
a snapshot that has been produced by "influxd backup" to be restored
to a config location and the broker and raft directories will be
bootstrapped based on the state of the snapshot.
…ckup-restore

Conflicts:
	cmd/influxd/main.go
	cmd/influxd/run.go

} else if err != nil {
	return fmt.Errorf("next: entry=%s, err=%s", sf.Name, err)
}

Member

Should it log output here to let the user know that the file is done? That way they get incremental output as the restore progresses.

Contributor Author

Yeah, that's a good idea.

Contributor

Perhaps the unpack() call should take a progress callback. I've seen this pattern in other places, but can't remember where.

Contributor Author

Added progress: 4bc92c3

Member

pauldix commented Mar 23, 2015

Is it possible to specify the path for the snapshot so you can save it on another volume?

@benbjohnson
Contributor Author

@pauldix Yep, that's supported. Sorry that wasn't clearer in the PR description:

$ influxd backup /path/to/my/snapshot


backup downloads a snapshot of a data node and saves it to disk.

-host <url>

Contributor

Minor: would this help output be more accurate if it read -host <host:port>?

Contributor

The fact that it is used as part of a URL could be considered an implementation detail, no?

Contributor Author

We parse it as a URL, so it needs to specify the scheme as well (http or https).

Contributor

Then the default shown should include the scheme, no?

Contributor Author

Fixed: 4bc92c3


// Ensure the backup returns an error if it cannot connect to the server.
func TestBackupCommand_ErrConnectionRefused(t *testing.T) {
// Start and immediate stop a server so we have a dead port.

Contributor

Super nit-pick: "immediately".

Contributor Author

Fixed: 4bc92c3

Contributor

otoolep commented Mar 23, 2015

Generally makes sense, happy to play with this once it's merged. +1

Did you consider compressing as well as tarring? A tar.gz file would be handy. We might get decent compression.

This commit adds incremental backup support. Snapshotting from the server
now creates a full backup if one does not exist and creates numbered
incremental backups after that.

For example, if you ran:

  $ influxd backup /tmp/snapshot

Then you'll see a full snapshot in /tmp/snapshot. If you run the same
command again then an incremental snapshot will be created at
/tmp/snapshot.0. Running it again will create /tmp/snapshot.1.
pauldix added a commit that referenced this pull request Mar 25, 2015
pauldix merged commit 5e47ed1 into master Mar 25, 2015
toddboom deleted the backup-restore branch May 4, 2015 21:58
Successfully merging this pull request may close these issues.

Wire up database backups
3 participants