Abstract object storage interaction #35
Could we use the NFS container that you have in autopilotpattern/wordpress instead? There's an existing Manta-NFS that would be the production-ready drop-in replacement.
autopilotpattern/nfsserver depends on
It may be possible that we could replace this with an RFD26 volume, but we should have a solid answer for how to use it on non-Triton environments.
In autopilotpattern/nginx#42 we've moved the deployment of the application into an examples repo that can target different environments. When we implement that for this repo we'll definitely want this storage abstraction as well.
This ticket includes mention of WebDAV and NFS. Between the two, NFS is easier in some ways, but WebDAV might be the better choice.
Thoughts? In terms of goals, I realize there are two things we might want to achieve here, and they might go well together.
(Hope you don't mind me quoting this publicly so we can discuss it openly @misterbisson ❤️)
We've been looking at refactoring the current Manta integration. I think adding WebDAV support makes a ton of sense, and should be similarly trivial to include. We were planning on a direct-filesystem implementation as well, so that folks running Docker proper can use volumes, NFS, or bind-mounts to store/manage their backups however makes sense for their environment. If we then implement an S3 backend, I think we'll have pretty well covered the majority of the big "content storage" platforms (especially since most of them have either a direct filesystem interface or an S3 knock-off interface).

So, to respond directly to your question about creating a separate WebDAV server container which interacts with the storage platforms: I think that's an interesting way to try to force our implementations to stay DRY (and help us not duplicate effort). On the other hand, it will add an extra level of complexity and one more place that things could break down (not to mention that we'll be streaming all our backups over the network twice). It will also mean that each autopilot pattern implements the Consul updating/locking in its own unique way, which we've found even in MySQL isn't currently done 100% consistently (there are small differences in the way different parts of the MySQL implementation update Consul that could lead to races in certain edge cases, which a standardized implementation would help avoid).

The main downside we see with what we've come up with so far (putting the logic in MySQL itself) is that synchronizing the implementations themselves then becomes an irritating problem. One solution we've considered is making a PyPI module for some of this "common" autopilot pattern code (or at least having a common repository that they're all synced to/from, which implements a simple dummy application for testing/verifying).
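To make the pluggable-backend idea concrete, here's a rough sketch of the kind of interface that could sit behind it; every name below (BackupStorage and its subclasses) is hypothetical, not code that exists in any of these repos today:

    class BackupStorage(object):
        """Stores and retrieves backup files, whatever the backend."""

        def put(self, infile, backup_id):
            """Upload the local file `infile` under the key `backup_id`."""
            raise NotImplementedError

        def get(self, backup_id, workspace):
            """Download `backup_id` into `workspace` and return the local path."""
            raise NotImplementedError


    class MantaStorage(BackupStorage):
        """Wraps the Manta client calls that live in manage.py today."""


    class WebDAVStorage(BackupStorage):
        """PUTs/GETs against a WebDAV endpoint (a sidekick or any other server)."""


    class S3Storage(BackupStorage):
        """Talks to S3 or any of the S3 knock-off interfaces."""


    class FilesystemStorage(BackupStorage):
        """Copies into a volume, NFS mount, or bind-mount for Docker-proper setups."""

The DB-specific code would only ever see the base interface, so adding a backend never has to touch the MySQL logic itself.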
@tianon thanks for moving the conversation here. I think you raise a number of important questions. One of the first ones I spotted was the suggestion of a direct-filesystem implementation (volumes, NFS, or bind-mounts).

This is definitely easier from an implementation perspective, but it's also something I'm increasingly identifying as an antipattern that we need to avoid in our cloud applications. Assuming the filesystem is reliable is the cause of a lot of breakage, and once you make that assumption it's really hard to work around it to recover your apps in times of failure.
As for streaming all the backups over the network twice: I agree, that's definitely a cost. The recent S3 failure has also reminded me how important it is to design for failures. Doubling the network transfers also doubles the opportunity for recovery. That is, systems that depended on S3 failed, but systems that included some diversity didn't. Having a local copy of that data has huge value in those failure situations. (Yes, we are depending on network access to the local cache, but if that network is down we're also assuming the cloud availability zone is down.) Two network transfers give us three copies of the data: one with the database itself, one in the WebDAV sidekick's local storage, and one in the remote object store.

So, all-in-all, perhaps doubling the network interaction adds a lot of value?
You make a really good point about the challenges for locking and coordination, and that might be a fair criticism. I get the idea you've just spent more time in the messy bits of code than I have recently, so I definitely don't want to argue with you. That feels like a problem that needs a solution, but I'm not sure how directly connected that solution is to making backups in the DB container pluggable. In fact, it feels as though pluggability there increases the complexity and surface area. Your suggestion to create a library is a good one, but is that enough?
Good points -- yeah, I can see the value of having a separate sidekick WebDAV container as an additional point of backup. Honestly, I see a lot of value in implementing both the pluggability in MySQL and the WebDAV sidekick, since I see valuable use cases for each (and either way, we need some way for MySQL to dump into WebDAV).

When deploying to Triton specifically, dumping backups directly into Manta is going to be more stable/reliable than using a proxy WebDAV container, isn't it? It seems to me like it'd be a regression (from the perspective of a user of this MySQL implementation) to stray from that by essentially re-implementing a smaller version of Manta itself which does automatic backups to Manta as well.

Would this WebDAV sidekick be responsible for pulling down the backups at launch too? Would it support multi-instance clustering similar to other autopilot implementations? Or would the backups on Manta at that point simply be a backup of the backup, needing to be restored into WebDAV manually to "reseed" a cluster, with WebDAV itself as the "source of truth" for which backups are available for cluster restoration/seeding?
The backup files appear to be left behind in the container. Here is part of the idea we had so far; I can push it up to a branch if you'd like a full diff of other changes for more concrete talking points. Part of the idea was to ensure that the download of a backup (and its extraction) or the creation of a backup gets cleaned up after it is used. This could probably be extended to keep X number of backups locally if that is desired.

    def put_backup(self, backup_func):
        # TODO: take the lock
        # self.consul...

        # get any previous backup info from consul
        previous_data = self.consul.get_snapshot_data() or {}

        # grab the time now so it is consistent throughout the backup process
        backup_time = datetime.utcnow()

        # backup_id generation from a format string using that time
        backup_id = backup_time.strftime(self.backup_id_fmt)

        # make a working space that the db can use; we'll clean it up at the end
        workspace = self.__make_workspace()

        # have the db make a backup
        infile, extra_data = backup_func(workspace, backup_time,
                                         previous_data.get('extra'))

        # the database function can return None for the file if it doesn't need to
        # do a backup right now (i.e., it will compare timestamps or binlog position
        # or whatever the db uses to determine whether a backup is needed); it can
        # store anything like the binlog name in the "extra_data" dict/map return
        # value and it gets saved to consul; extra_data should probably be kept to
        # a minimum
        if infile:
            # this is one of the functions that would then proxy to whichever
            # backup solution is configured
            self._put_backup(infile, backup_id)

            # store the successful backup info in consul
            extra_data = extra_data or {}
            self.consul.record_snapshot({'id': backup_id, 'extra': extra_data})

        # TODO: release the lock
        # self.consul...

        # clean up
        self.__clean_workspace(workspace)

    def get_backup(self, restore_func):
        # TODO: is locking needed for restore?
        current_data = self.consul.get_snapshot_data() or {}
        backup_id = current_data.get('id')
        if backup_id:
            # make a space for the backup storage to download the backup into
            workspace = self.__make_workspace()

            # download the backup file; this is one of the functions that would
            # then proxy to whichever backup solution is configured
            datafile = self._get_backup(backup_id, workspace)

            # make a new space for the db so that it can extract if needed
            db_workspace = self.__make_workspace()

            # have the db restore from the given file
            restore_func(datafile, db_workspace)

            # clean up the workspaces
            self.__clean_workspace(db_workspace)
            self.__clean_workspace(workspace)
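For what it's worth, the callbacks that sketch expects would look roughly like this; the binlog and tar details below are only stand-ins to show the shape of the contract, not the real MySQL implementation:

    import os
    import subprocess

    def mysql_backup_func(workspace, backup_time, previous_extra):
        """Return (path_to_backup_file, extra_data), or (None, None) to skip this cycle."""
        previous_extra = previous_extra or {}
        # stand-in: the real code would ask mysqld for its current binlog position
        current_binlog = 'mysql-bin.000042'
        if previous_extra.get('binlog') == current_binlog:
            return None, None  # nothing new since the last snapshot
        outfile = os.path.join(workspace,
                               backup_time.strftime('backup-%Y%m%d%H%M%S.tar.gz'))
        # stand-in for the real dump (innobackupex/mysqldump piped into a tarball)
        subprocess.check_call(['tar', 'czf', outfile, '-C', '/var/lib', 'mysql'])
        return outfile, {'binlog': current_binlog}

    def mysql_restore_func(datafile, db_workspace):
        """Unpack a downloaded backup so the database can be re-seeded from it."""
        subprocess.check_call(['tar', 'xzf', datafile, '-C', db_workspace])

    # the storage layer then drives the whole cycle:
    #   storage.put_backup(mysql_backup_func)
    #   storage.get_backup(mysql_restore_func)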
@tianon wrote: "dumping backups directly into Manta is going to be more stable/reliable than using a proxy WebDAV container, isn't it?"

I agree that's the case, but it's a question of portability across deployment platforms and across blueprints.

@yosifkit wrote, about the backup files left behind in the container:

They should be removed from the container after uploading. If not, yes, that's a bug.
@tianon asked a number of questions above about how the WebDAV sidekick would behave (pulling backups down at launch, multi-instance clustering, and what the source of truth is).
These are good questions, and I think you're slyly drawing out the complexity of attempting to run your own object store. Stale reads of the backup files are clearly a problem we'll need to avoid. My "simple" answer for that is to make the WebDAV container the source of truth, and that also means we'll probably need to run just one sidekick instance per thing-being-backed-up.

For those WebDAV sidekicks that back up to Manta, I'd want to use https://github.com/bahamas10/node-manta-sync and create startup options that would support ingesting the contents of a directory as a preStart action. I might even want the ability to ingest the contents of one directory but back up to a different directory, or maybe do no backups at all after the download, as that would give me significant flexibility in bringing up dev/test environments. The same is obviously also possible with S3.

For production use on AWS, I'd probably back the sidekick container with an EBS volume and sync the contents off to S3 in a different region. On Triton I'd do much the same, but use an RFD26 NFS volume (not yet actually available) and Manta. I can imagine myself choosing to back up to object storage solutions from multiple providers, honestly, since storage is cheap and downtime is so expensive. /end uptime paranoia
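To be a little less hand-wavey, here's the kind of preStart I'm imagining for that sidekick, shelling out to manta-sync; every variable name here is invented, and the exact flag for the Manta-to-local direction should be checked against the manta-sync docs:

    #!/usr/bin/env python
    # Hypothetical preStart for the WebDAV sidekick; assumes the usual
    # MANTA_URL/MANTA_USER/MANTA_KEY_ID environment is already set for manta-sync.
    import os
    import subprocess

    # all of these names are invented for illustration
    LOCAL_DIR = os.environ.get('WEBDAV_DATA_DIR', '/srv/webdav')
    SEED_FROM = os.environ.get('BACKUP_SEED_PREFIX', '')   # e.g. a ~~/stor/... path
    PUSH_TO = os.environ.get('BACKUP_PUSH_PREFIX', '')     # empty means "never push"

    def prestart():
        """Ingest existing backups before the WebDAV server starts serving."""
        if SEED_FROM:
            # Manta-to-local sync; double-check manta-sync's docs for the reverse flag
            subprocess.check_call(['manta-sync', '-r', LOCAL_DIR, SEED_FROM])

    def push_backups():
        """Called whenever a new backup lands locally (or on a timer)."""
        if PUSH_TO:
            # local-to-Manta sync is manta-sync's default direction
            subprocess.check_call(['manta-sync', LOCAL_DIR, PUSH_TO])

    if __name__ == '__main__':
        prestart()

Leaving BACKUP_SEED_PREFIX and BACKUP_PUSH_PREFIX independent is what gives the "ingest from one place, back up to another, or not at all" flexibility described above.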
Ok, that's fair (big +1 to letting the WebDAV container be the source of truth for simplicity's sake). I can definitely see the appeal of a backup-storing sidekick, and agree that it's probably a good plan to move forward on that (so MySQL needs a way to get/put backups to a WebDAV endpoint for sure). I think the question that still remains is whether we want MySQL to still be able to back up directly into Manta (especially since it currently works that way). What's the backwards-compatibility story for users of this image? 😄

@yosifkit has a sane (IMO) design for being able to trivially support both Manta and WebDAV directly, and I think it'll help with clarity to abstract some of the "interact with a backup service" code from the "interact with MySQL" code (and more importantly, it makes sure that existing users are still covered as-is), but we're happy to go either way.

(For my own personal deployments, especially smaller ones, I'd really rather avoid the extra complexity of having a WebDAV container too, so I'd honestly like this whole backup bit to be completely optional as well, but I digress. 😇)
I'm ready and willing to break backwards compatibility, but...
The complexity you're hoping to avoid is having the WebDAV container at all, not whether or not the WebDAV container then backs up elsewhere? Is this in MySQL or Mongo? The difference is that MySQL won't work without a way to bootstrap replicas with a backup from somewhere, but if your interest is primarily in Mongo, then perhaps we can make the backup behavior optional?

You can probably see that I'm trying to avoid having extra code paths, but I also need to check my assumptions about complexity here. Well, I need to avoid being too dogmatic, anyway.
One thing that I've been really hand-wavey about is the behaviors of the WebDAV container.
There are probably other questions as well.
Let me make sure I understand the two approaches I think we are discussing:

Option 1 (mysql -> webdav -> manta/S3/etc.): the database container only ever talks to a WebDAV sidekick, and the sidekick owns syncing its local storage to an object store.

Option 2 (mysql -> manta/S3/WebDAV/etc.): the database container gets a pluggable storage backend and talks to whichever service is configured directly.

While writing this out, I can see that the two are not mutually exclusive; we can use option 2 to abstract away the tight Manta integration while adding WebDAV support, and also create a WebDAV container that can sync its local storage to another service.
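A sketch of how the two could coexist through configuration alone: the BACKUP_STORAGE variable and backup_storage module are made up, and the classes are the hypothetical ones sketched earlier in the thread. Pointing the setting at WebDAV effectively gives you option 1, while pointing it straight at Manta or S3 gives you option 2.

    import os

    # reuses the hypothetical classes sketched earlier in this thread
    from backup_storage import (FilesystemStorage, MantaStorage,
                                S3Storage, WebDAVStorage)

    def storage_from_env():
        """Pick a backend from a (made-up) BACKUP_STORAGE environment variable."""
        kind = os.environ.get('BACKUP_STORAGE', 'manta')
        if kind == 'webdav':
            return WebDAVStorage()      # option 1: talk to the WebDAV sidekick
        if kind == 's3':
            return S3Storage()          # option 2: straight to S3
        if kind == 'local':
            return FilesystemStorage()  # local volume, NFS, or bind-mount
        return MantaStorage()           # option 2, and today's default behavior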
One of the most salient concerns I have is being able to test the MySQL (or other DB) container thoroughly. We're doing a lot of work on that point now, and it's really highlighted how much more complex it would be if we add new code paths inside the DB container. Option 1 definitely satisfies that while also making it possible to have a separately tested container that can back up to many different locations. Option 2 is what I'm afraid of.

I'll acknowledge that some of the complexity is just traded between the two options, but I believe that defining the interface between the DB and the backup narrowly as WebDAV does eliminate some complexity. It eliminates it to the degree that we can trust the DB will behave the same way regardless of what happens on the other side of the WebDAV container.
Current work-in-progress is over in https://github.com/infosiftr/autopilotpattern-mysql/tree/generalize-backup. Hopefully we can have it running soon (@moghedrin has started helping as well), and we should be able to reuse much of the additions for a WebDAV service container that backs up to a configured service like Manta or even just local disk.

The flow will be something like this: the MySQL container (using the new backup lib) sends a tarball to the WebDAV container, which saves the file locally and then sends the file to Manta or another service. The plan is to also have the ability to use Manta (with the possibility to add other services) directly from the MySQL container, but that will be provided without full integration testing.

@misterbisson and @tgross, once we have a complete solution, I think it would make sense to move the new libbackup stuff to its own repo (something like lib-autopilot-common-py). Does that sound amenable? Anything you want to change at this beginning stage? Any deadlines that you want/need to achieve?

Note: lib-autopilot-common-py is just a random suggestion, bike-shedding welcome 😉
This seems like it doesn't get us much; the primary reason to abstract the storage component was to isolate the testing. If we keep the ability to use Manta directly from the MySQL container we still have to have all the full integration testing -- having untested components isn't going to fly.
The WebDAV server should certainly be in its own repo. I'm not so sure about making the backup component a library though; once you've pushed it into the WebDAV server it's a pretty simple client, and having it as a library makes it harder to iterate on the design across multiple blueprints.
I find it slightly terrifying that we've been copying these components into databases that have completely different semantics for clustering and replication. 😀 Maybe we get the individual blueprints into a mature and production-ready shape before we try factoring out abstractions that might just prove to be the wrong abstractions?
Given our goal to enable local development and portability, we might consider abstracting away the object storage interaction from manage.py inside the MySQL container. In an offline conversation previously, I'd proposed doing the MySQL backups to a container serving WebDAV and accessed via https://github.com/amnong/easywebdav (or some other non-filesystem client library). The WebDAV container could then own the responsibility of interacting with the object store.
This would work on a laptop without any internet connection, and in private clouds where there's no intention of sending the backups off-site nor of setting up a local object store.
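For reference, the easywebdav side of that stays very small; here's a rough sketch of the upload/download helpers manage.py would need, with the host, credentials, and remote paths all invented for illustration:

    import os
    import easywebdav   # https://github.com/amnong/easywebdav

    # all of these values are invented for illustration
    client = easywebdav.connect(
        os.environ.get('WEBDAV_HOST', 'webdav'),
        username=os.environ.get('WEBDAV_USER', 'backup'),
        password=os.environ.get('WEBDAV_PASSWORD', ''),
        protocol='http',
    )

    def put_backup(infile, backup_id):
        """Upload a finished backup to the WebDAV sidekick."""
        client.mkdirs('backups')
        client.upload(infile, 'backups/{}'.format(backup_id))

    def get_backup(backup_id, workspace):
        """Download a backup from the sidekick into a local workspace."""
        local_path = os.path.join(workspace, backup_id)
        client.download('backups/{}'.format(backup_id), local_path)
        return local_path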