Thou shalt not run a prod database inside a container

2019-01-13

For stateless application services, containers are a near-perfect development and operations solution. However, for services with persistent state, that is, databases, things are not that simple. As a developer, I really like Docker and believe that Docker and Kubernetes will be the standard way to develop and deploy software in the future. But as a DBA, I think putting a database in a container is an operational nightmare. **Whether production databases should be placed in containers is still a controversial issue, but the truth only gets clearer with debate.** Today I will show you why it is a bad idea to put a production database into a container.

What problems does Docker solve?

Let's start by looking at Docker's self-description:

Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure, so you can deliver software quickly. With Docker, you can manage your infrastructure in the same ways you manage your applications. By taking advantage of Docker’s methodologies for shipping, testing, and deploying code quickly, you can significantly reduce the delay between writing code and running it in production.

The words Docker uses to describe itself include: lightweight, standardized, portable, cost saving, increased efficiency, automatic, integrated, and efficient operations. These benefits are real, and there is no problem with the claim: Docker does make development and operations much easier in most cases. That is why so many companies are so eager to Dockerize their software and services.

But sometimes this enthusiasm goes to the other extreme: containerize all software, EVEN THE PRODUCTION DATABASE. Containers were originally designed for stateless applications, and the temporary data generated by an application inside a container is considered part of that container. Create a service with a container and destroy it when you are done: that is the typical use case.

The applications themselves have no state; the state is usually stored in a database outside the container. This is the classic architecture and usage, and the philosophy of Docker. But when we put the database itself inside a container, things are different. A database is a stateful service, and in order to keep that state no matter how containers come and go, the database container has to punch a hole through to the underlying operating system: the data volume.
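
For illustration, this is roughly what that "hole" looks like; a minimal sketch where the image tag, container name, host path, and password are placeholders, not a recommended production setup:

```bash
# The database state lives on the host, outside the container's lifecycle:
docker run -d --name pg \
    -e POSTGRES_PASSWORD=secret \
    -v /data/postgres:/var/lib/postgresql/data \
    postgres:11
```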

Such a container is no longer an object that can be created, destroyed, shipped, or moved at will; it is bound to the underlying environment. Therefore, many advantages of the typical container use case no longer apply to database containers.

Reliability

Getting software up and running and making it run reliably are two different things.

The database is the core of almost every information system. It is a CRITICAL service in the whole IT system, and here CRITICAL means exactly what it says: dead when failure happens. If an application goes down, you can pull it back up later. But if your database gets wiped and you have no backup, it is dead for good.

This is similar to our everyday software: if Word/Excel/PowerPoint crashes, it is not a big deal to start it again. But if a critical document gets corrupted or lost, that is a real mess. For many companies the database is the same: if it is deleted and there is no backup, the company may go down with it.

Reliability is the most important attribute of a database. Reliability is the ability of a system to keep working correctly in adversity (hardware failure, software failure, human error), that is, to keep functioning correctly at the desired level of performance. Note that reliability is not the same as availability. Reliability means fault tolerance and resilience. Availability can usually be measured in nines, a percentage representing the expected ratio of system uptime. Reliability is hard to measure: it can only be proven by running continuously, or falsified by a failure. Therefore reliability is a safety property, and it is not as intuitive or measurable as performance or maintainability.

Safety matters, yet many people tend to ignore it, the most important attribute of all. They only become aware of it when someone gets killed, robbed, sick, hit by a car accident, or when the database gets dropped. Only then do people feel regret.

Notice that Docker's self-description contains no words like "reliable", which is the attribute that matters most for a database. Docker knows what it is capable of.

Additional failure points

Compared with bare metal, putting a database inside Docker does not reduce the probability of hardware failures, software errors, or human errors. On the contrary, **the overall reliability of the system decreases** because of the additional components, additional complexity, and additional failure points. The database container has to be bound to a specific machine via a data volume, so the exposure to hardware failure stays exactly the same.

Standardized deployment may look great at first glance, but for software bugs there is no essential difference between a script and a Dockerfile. Software bugs are mostly the result of poor application design, which Docker cannot help with. The same goes for human errors.

Additional components introduce additional failure points, and introducing Docker brings us not only the problems of Docker itself, but also the problems at the junctions between Docker and the existing components. So when a failure occurs, it may be a problem with the database, a problem with Docker, a problem in the interaction between the database and Docker, or a problem in the interaction between Docker and the OS.

Take a look at the issue list of the official PostgreSQL Docker image: https://github.com/docker-library/postgres/issues. You will find a long list there: there are tons of additional problems when you put a database into Docker. Let me give you the simplest example: what happens to the database if the dockerd daemon crashes? It will definitely break and go out of service. A more subtle example is running two PostgreSQL instances on the same data directory (two containers on the same volume, or one inside and one outside the container). Without proper fencing your data will be toast, and these are problems that simply never happen on bare metal.
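
As an illustration of that second case, here is a hedged sketch of how easily two containers can end up on the same data directory (names and tags are placeholders); depending on how PostgreSQL's lock-file check plays out across PID namespaces, both postmasters may come up and corrupt the cluster:

```bash
# Two containers mounting the same named volume as their data directory:
docker run -d --name pg1 -v pgdata:/var/lib/postgresql/data postgres:11
docker run -d --name pg2 -v pgdata:/var/lib/postgresql/data postgres:11
```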

Reliability Proof and community knowledge

As mentioned earlier, there is no good way to measure reliability. The only proof is running correctly and continuously for a long time. Deploying databases on bare metal is the traditional way of doing things, and it has been proven by decades of continuous operation. Docker is a revolution for DevOps, but it is still too young: five years is far too short for something as critical as a production database. There are not enough lab rats yet.

In addition to long-term operation, there is another way to "gain" reliability: failure. Failure is extremely valuable experience; it turns uncertainties into certainties and unknown problems into known ones. Failure experience is the most valuable asset of operators: it is the form in which operational knowledge exists, and the way the community accumulates knowledge. For PostgreSQL, most of the community's experience is based on the assumption of bare-metal deployment, and all kinds of failures have been explored by many people over decades. If you run into a database problem, you are very likely to find that other community members have already been through a similar situation, and to find a corresponding solution. But if you search for the same problem with the additional keyword "Docker", you will find far less useful information. That means when something nasty happens, the probability of successfully rescuing the data is much lower, and the time needed to recover is much longer.

Another subtle point is that companies and individuals are reluctant to share failure experience unless there is a special reason. For a company, a failure report may undermine its reputation, expose sensitive information, or reveal how shoddy its infrastructure is. For individuals, failure experience is almost the most valuable part of their skill set; once shared, its value diminishes. Ops/DBA folks are simply not as open as Dev folks. That is the very reason why Docker and Kubernetes operators exist: they try to codify operational experience so it can accumulate. But for now that is rather naive, since few people are willing to share. You can find rubbish everywhere: the official PostgreSQL Docker image, for instance, lacks tons of the tooling, tuning, and settings needed to run efficiently as a real-world database.

Tooling

A database requires lots of tools to maintain: operation scripts, deployment, backup, archiving, failover, major/minor version upgrades, plugin installation, connection pooling, performance analysis, monitoring, tuning, inspection, repair, and so on. Most of this tooling is also designed around bare-metal deployment. This tooling is critical too; without adequate testing it cannot really be put to use. Getting a thing up and running, and keeping it running steadily for a long time, are completely different levels of reliability.

Plugins are a typical example. PostgreSQL has lots of useful plugins, such as PostGIS. On bare metal, installing such a plugin takes nothing more than a `yum install` followed by `CREATE EXTENSION postgis`. The Docker way, however, means modifying the Dockerfile, building a new image, pushing it to the server, and then restarting the database container. No doubt that is much more complicated.
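
For comparison, this is roughly all it takes on bare metal; a sketch assuming a yum-based host with the PGDG repository configured, where the exact package and database names depend on your PostgreSQL/PostGIS versions:

```bash
# Install the PostGIS packages, then enable the extension in a database:
yum install -y postgis25_11
psql -d mydb -c 'CREATE EXTENSION postgis;'
```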

Similar problems exist for CLI tools and system commands. In theory they can be run on the host, but you cannot be sure that the execution and its result mean exactly the same thing. And when an emergency happens and you need some tool that is not included in the container, and you have no Internet access or yum repository, you have to go through Dockerfile → build image → restart container. That is really insane.

When it comes to monitoring, Docker makes things harder. There are many subtle differences between monitoring in containers and monitoring on bare metal. For example, on bare metal the different modes of CPU time always sum to 100%, but such assumptions do not always hold inside a container. On traditional bare metal, node-level metrics are an important part of the database metrics, and monitoring becomes a lot messier when a database container is deployed alongside application containers. Of course, if you use Docker in a VM-like manner, many things will probably still work, but then you lose the real value of using Docker.

Performance

Performance is another point that people care a lot about. From the performance perspective, the basic principle of database deployment is: the closer to the hardware, the better. Every additional layer of isolation and abstraction is bad for database performance. More isolation means more overhead, even if it is just one extra memcpy in the kernel.

In performance-seeking scenarios, some databases bypass the operating system's page cache and operate on the disk directly, and some even use FPGAs or GPUs to speed up query processing. Docker is a lightweight container, so the performance penalty is not large, and the impact may be negligible for performance-insensitive scenarios, but an extra abstraction layer will never make performance better.

Isolation

Docker provides process-level isolation. A database does value isolation, but not this kind of isolation. Database performance is critical, so the typical deployment takes a whole physical machine exclusively: apart from a few necessary tools, no other applications run on it. Even when using Docker, we would still give the database a whole physical machine.

Therefore, the isolation provided by Docker is useful for multi-tenant overselling by cloud database vendors; in other cases it does not add much when deploying a database.

Scaling out

For stateless applications, containers make scaling out incredibly simple: the nodes do not matter, you can schedule them at will. This does not apply to databases or other stateful applications. You cannot create and destroy database instances as freely as app servers: for example, to create a new replica you have to copy the data from the primary whether you use Docker or not. Copying several terabytes of production data may take hours, and it still requires manual intervention, inspection, and verification. So what is the essential difference between running a ready-made make-replica script and running `docker run`? The time is spent on copying the data for the new replica.
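
Either way, the bulk of the work is the same. Here is a hedged sketch of the bare-metal path with pg_basebackup, where the host name, user, and paths are placeholders:

```bash
# Pull a full copy of the primary and configure it as a standby;
# with terabytes of data this copy dominates the total time, Docker or not:
pg_basebackup -h primary.example.com -p 5432 -U replicator \
    -D /var/lib/pgsql/11/data -X stream -P -R
```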

Maintainability

Most of the cost of software is incurred in the operation phase rather than the development phase: fixing vulnerabilities, keeping the system up and running, handling failures, upgrading versions, migrating, repaying tech debt, and so on. Maintainability is therefore crucial to the quality of operators' work and life. And this is the most pleasing part of Docker: Infrastructure as Code. We can say that Docker's greatest value lies in its ability to deposit the operational experience of software into reusable code that accumulates easily, rather than leaving a bunch of install/setup documents and scripts scattered everywhere. From this perspective, I think Docker has done a great job, especially for stateless applications whose logic changes constantly. Docker and Kubernetes let us deploy, scale out and in, release, and perform rolling upgrades with ease, so that Dev can also do the work of Ops, and Ops can (supposedly) live the life of a DBA.

Can these conclusions be applied to databases? Once initialized, a database does not require frequent environment changes; it may run continuously for years without big changes. DBAs typically accumulate a lot of maintenance scripts, one-click environment setup is not much slower than the Docker way, and the number of environments that need to be configured and initialized is relatively small, so the convenience containers bring to environment configuration is not a significant advantage here. For daily operations, a database container cannot be destroyed, recreated, restarted, or migrated as freely as an application container. Many operations have to be performed through `docker exec`: in fact you may end up running exactly the same scripts, just through much more cumbersome steps.

Docker advocates like to say that upgrading software is easy with Docker. That is true for minor versions: simply modify the version in the Dockerfile, rebuild the image, and restart the database container. However, when a major version upgrade is needed, this is how a binary upgrade looks in Docker: https://github.com/tianon/docker-postgres-upgrade, while I can achieve the same thing in a few lines of bash.
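
For reference, a minimal bare-metal sketch of such a major upgrade with pg_upgrade, assuming a PGDG-style layout and a 10 → 11 upgrade; paths and service names are placeholders, and you would of course take a backup first:

```bash
# Stop the old cluster, initialize the new one, and migrate the data in place:
systemctl stop postgresql-10
sudo -u postgres /usr/pgsql-11/bin/initdb -D /var/lib/pgsql/11/data
sudo -u postgres /usr/pgsql-11/bin/pg_upgrade \
    -b /usr/pgsql-10/bin -B /usr/pgsql-11/bin \
    -d /var/lib/pgsql/10/data -D /var/lib/pgsql/11/data
systemctl start postgresql-11
```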

It also takes more effort to use some existing tools through `docker exec`. For example, `docker exec -it` mixes stdout and stderr, which breaks many tools that rely on pipes. Say you want to perform an ETL task that transfers a table to another instance; the traditional way is:

psql <src-url> -c 'COPY tbl TO STDOUT' |\
psql <dst-url> -c 'COPY tbl FROM STDIN'

With Docker, things get more complicated:

docker exec -it srcpg gosu postgres bash -c "psql -c \"COPY tbl TO STDOUT\" 2>/dev/null" |\
docker exec -i dstpg gosu postgres psql -c 'COPY tbl FROM STDIN;'

And if you want to take a base backup from the PostgreSQL instance inside the container, without PostgreSQL installed on the host machine, you have to wrap the command in a lot of extra layers:

docker exec -i postgres_pg_1 gosu postgres bash -c 'pg_basebackup -Xf -Ft -c fast -D - 2>/dev/null' | tar -xC /tmp/backup/basebackup

In fact, it is not Docker that improves the daily operations experience, but tools such as Ansible. Containers may be faster at building a database environment, but such tasks are rare. So if a database container cannot be scheduled as freely as an app server, cannot scale quickly, and does not make initial setup, daily operations, or emergency troubleshooting any more convenient than ordinary scripts, why should we put the production database into Docker?

I think maybe it is because a rough image-based solution is still better than setting things up blindly without a DBA. Container and orchestration technology is very valuable for operations: it actually fills the gap between software and services, and its vision is to modularize operational experience and capability. Docker and Kubernetes may well become the standard way of package management and orchestration in the future, evolving into something like a "datacenter distributed-cluster operating system" that serves as the underlying infrastructure of all software, the universal runtime. Once those major uncertainties have been eliminated, we can put our applications, and our valuable databases, inside it. But for now, at least for production databases, this is only a nice vision.

Summary

Finally, I must emphasize that the discussion above applies only to production databases. For databases in development environments, or for applications in production, I fully support using Docker. But when it comes to production databases, if the data really matters, we should ask ourselves these questions before diving in:

  • Is this complexity worth it?
  • Can I handle the problems it may bring?

Any technical decision is a trade-off, and when putting a production database into a container, the critical trade-off is **sacrificing reliability in exchange for maintainability**.

There are scenarios where data reliability is not so important, or where other considerations dominate: for cloud service vendors, for example, putting databases inside containers is a great fit. Container isolation, high resource utilization, and management convenience meet all the requirements of that scenario. But in most cases reliability has the highest priority, and sacrificing reliability in exchange for maintainability is not advisable.