Built-in data, intermediate container and permission problems #319

Closed
kachkaev opened this issue Jul 26, 2017 · 5 comments
Labels
question Usability question, not directly related to an error with the image

Comments

kachkaev commented Jul 26, 2017

I'm trying to build a lightweight docker container that starts as quickly as possible with all the data built-in. The goal is to form a scalable kubernetes service that provides access to some immutable data.

Looks like the solution is to use a multi-stage build that was introduced by Docker recently. The data dump is being read from a backup inside an intermediate container and then all the PGDATA directories are moved to the actual container. The resulting image does not have docker-entrypoint-initdb.d, which makes it two times smaller and also speeds up the start, because all the required folders are already in place.

Here is my Dockerfile:

FROM postgres:9.6-alpine AS data-donor
COPY /path/to/my/db/dump/on/host/ /docker-entrypoint-initdb.d/
ENV PGDATA=/pgdata
RUN docker-entrypoint.sh --help

FROM postgres:9.6-alpine
ENV PGDATA=/pgdata
COPY --from=data-donor /pgdata /pgdata
  • I'm adding --help to RUN docker-entrypoint.sh --help in the first stage just to trick the script – it otherwise does not start at all or launches a foreground process, which never exits.

  • PGDATA can't be equal to its default value (/var/lib/postgresql/data), because that's a VOLUME and so things don't copy between the build stages.

The resulting lightweight image does start and does serve the data, but fails after some time. The symptoms are similar to what's described here: https://forums.docker.com/t/beta-9-postgres-stat-files-corrupted-when-data-stored-on-host-mapped-volume/10819

I tried fixing things by running chown -R postgres:postgres /pgdata and chmod -R 777 /pgdata after everything else, but this did not help. The logs are still full of messages like these:

 2017-07-26T07:56:04.004328315Z LOG:  using stale statistics instead of current ones because stats collector is not responding
2017-07-26T07:56:04.00448246Z WARNING:  could not open statistics file "pg_stat_tmp/global.stat": Permission denied
2017-07-26T07:56:07.24726432Z LOG:  incomplete startup packet
2017-07-26T07:56:12.247888609Z LOG:  incomplete startup packet
2017-07-26T07:56:17.247440723Z LOG:  incomplete startup packet
2017-07-26T07:56:17.248236006Z LOG:  incomplete startup packet
2017-07-26T07:56:22.24818477Z LOG:  incomplete startup packet
2017-07-26T07:56:27.247806065Z LOG:  incomplete startup packet
2017-07-26T07:56:27.891670873Z WARNING:  could not create relation-cache initialization file "global/pg_internal.init.9282": Permission denied
2017-07-26T07:56:27.89173002Z DETAIL:  Continuing anyway, but there's something wrong.
2017-07-26T07:56:27.892495964Z WARNING:  could not create relation-cache initialization file "base/12404/pg_internal.init.9282": Permission denied
2017-07-26T07:56:27.892530176Z DETAIL:  Continuing anyway, but there's something wrong.
2017-07-26T07:56:27.893971225Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.894134143Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.894337278Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.894752415Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.925004824Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.925026913Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.925224732Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.925444104Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.925568544Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.926003844Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.926257893Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:27.926475139Z LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": Permission denied
2017-07-26T07:56:32.247021854Z LOG:  incomplete startup packet
2017-07-26T07:56:37.251664911Z LOG:  incomplete startup packet
2017-07-26T07:56:37.253645873Z LOG:  incomplete startup packet
2017-07-26T07:56:42.247354375Z LOG:  incomplete startup packet 

What could this odd behaviour be caused by?

kachkaev commented Sep 3, 2017

@yosifkit @tianon could you please give a hint on what's going on?

yosifkit commented Sep 5, 2017

My guess is that the graph driver is too slow (since that is where the database files are stored in your image).

From https://stackoverflow.com/a/32193782:

To keep this file up to date, the stats collector process periodically re-writes it with updated information. It typically does this several times a second. The process is as follows:

  1. Create a new file global.tmp
  2. Write data to this file
  3. Rename global.tmp as global.stat, overwriting the previous global.stat

If it is the graph driver, then it might be that it is getting the "create new file" for global.tmp and does so, but when PostgreSQL asks to write to it, it does not exist yet and so gets permission denied. This seems similar to the problem that ruby was having in docker-library/ruby#55.
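
For readers unfamiliar with the pattern, the three quoted steps are the usual write-then-rename sequence. Below is a minimal shell sketch of it. It is an illustration only, not PostgreSQL's actual code; the function name `write_stats` and its arguments are hypothetical.

```shell
#!/bin/sh
# Illustration only: mimics the create/write/rename sequence quoted above.
# write_stats and its arguments are hypothetical, not PostgreSQL internals.
write_stats() {
  dir=$1
  payload=$2
  tmp="$dir/global.tmp"
  final="$dir/global.stat"

  # Steps 1 and 2: create the temporary file and write the new data to it.
  # If the file cannot be created or opened, stop here.
  printf '%s\n' "$payload" > "$tmp" || return 1

  # Step 3: rename the temporary file over the previous global.stat.
  # Within one filesystem, readers see the old or the new file, not a mix.
  mv -f "$tmp" "$final"
}
```

If the first step fails, as in the "Permission denied" lines for pg_stat_tmp/global.tmp in the log, the rename never happens and global.stat is not refreshed, which would be consistent with the "using stale statistics" message.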

kachkaev commented Sep 8, 2017

Thanks for sharing your thoughts, @yosifkit. This could explain the behaviour I'm seeing, although I'm not sure how to fix it permanently. I'm not an expert in either Docker or Postgres, just a user of the magic of both :–)

The buggy container runs in a k8s environment, and it looks like I have found a workaround: setting a livenessProbe in the deployment:

livenessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - psql -U postgres -c 'SELECT version()' || false
  initialDelaySeconds: 20
  periodSeconds: 20

When the container gets stuck, SELECT version() stops working and k8s re-creates everything from scratch. This trick only works because the data I serve is immutable and is embedded into the container.

Hope this helps at least one person in a similar situation. I also hope to see some suggestions on a healthcheck, e.g. as discussed in #282. A permanent solution to the global.tmp issue would be great too!

wglambert added the "question" label on Apr 25, 2018

tianon commented Jun 8, 2018

This one's a bit old (so I wouldn't be surprised if you found a different solution in the meantime), but I'd recommend still using a VOLUME, and simply adjusting your derivative image to do a full copy of the initial seed data into the volume at container startup before passing off to postgres itself -- that way you get the preseeded data without going through the full initdb process, and you avoid issues related to the Docker graph drivers.
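
One way this suggestion might look in practice is a small wrapper entrypoint that copies the seed data into the volume on first start. This is a rough sketch under several assumptions: the seed directory `/pgdata-seed` and the function name `seed_pgdata` are hypothetical, and it assumes the image's docker-entrypoint.sh skips database initialization when PGDATA already contains a PG_VERSION file.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: copy pre-built seed data into the (volume-backed)
# data directory, but only if that directory has not been populated yet.
seed_pgdata() {
  seed_dir=$1
  data_dir=$2

  # A non-empty PG_VERSION file is taken as "already initialised".
  if [ -s "$data_dir/PG_VERSION" ]; then
    return 0
  fi

  mkdir -p "$data_dir"
  # -a keeps ownership, modes and timestamps; "/." also copies dotfiles.
  cp -a "$seed_dir/." "$data_dir/"
}

# In a real wrapper entrypoint, the script would then hand off, e.g.:
#   seed_pgdata /pgdata-seed "$PGDATA"
#   exec docker-entrypoint.sh "$@"
```

With this approach the final build stage would copy the donor data to the seed path instead of PGDATA (for example COPY --from=data-donor /pgdata /pgdata-seed), leave PGDATA on the default VOLUME, and set the wrapper as the ENTRYPOINT. The copy adds some startup time proportional to the size of the data.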

Since this is a bit of an esoteric use case (and the issues appear to be more use-related than image-related), I'm going to close. 👍

lferro9000 commented

I was getting similar errors to this in docker-compose:

db_1    | 2021-02-05 04:54:57.957 UTC [259] WARNING:  could not open statistics file "pg_stat_tmp/global.stat": Operation not permitted

I tried an experiment:

docker ps
docker exec -it <container name/ID> /bin/bash
rm -rf /var/lib/postgresql/data/pg_stat_tmp/global.stat

After this, the file started to get refreshed every 30 seconds without any more errors.
