How to update from a circleci/postgres ram image? #2
Comments
This image does not use a RAM disk by default. When the image goes GA, this will be considered. One of the tasks right now though is to see what the performance difference is. On more modern hardware, some people have reported not much of a difference. I'd like to run a benchmark on CircleCI. ref: https://schinckel.net/2019/09/25/speeding-up-postgres-using-a-ram-disk/ |
Couldn't we specify a mount, something like this?
docker:
  - image: cimg/postgres:13.5-postgis
    options: >-
      --mount type=tmpfs,destination=/var/lib/postgresql/data |
@bf4 It is not possible to set docker-level options like this in CircleCI. What you can do is set the On the topic of whether it is worth using a ramdisk, from some testing I have done today using pgbench, I have found very little difference in the latency when using the ramdisk for this image. Most of the results came back within a margin of error and in some instances using the ramdisk ended up marginally slower. If anyone has any other benchmarks to share, it would be helpful in determining whether it is worth producing a |
Hypothesis: When the database has a small amount of data, or if only a certain section of data gets called often, then the ramdisk would provide no benefit because the data would get cached in ram after it's pulled from storage. Where the ramdisk could be helpful is if you have a larger-size database and you're pulling more varied data. I'm guessing a lot of CI users (including us) fall into the first category. |
Well, our CI seemed to slow down when switching off the ram image... been curious to see if I could compare. Will try the env trick and update this comment.
|
@caleb15 You raise a valid point. I went back and re-ran my pgbench tests with a higher scaling factor so it had a much larger database to deal with (around 1.6GB) and ran a 5 minute test. With ramdisk:
Without ramdisk:
Re-running these several times showed a range of 1.8-2.4ms latency average regardless of whether the ramdisk was used or not. Example config for this test:
@bf4 It seems the env var trick doesn't work without updating the Docker entrypoint, due to a hard-coded path. I spun up an example image here which you can test if you want. As above though, I cannot see any gains from using a ramdisk in this instance. |
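For anyone who wants to run a comparable benchmark themselves, a job along the lines described above (larger scale factor, 5-minute run) might look roughly like the fragment below. This is a sketch, not the config used in this thread: the job name, the primary image, the credentials, the scale factor of 100, and the client/thread counts are all assumptions, and it assumes `pgbench` and `pg_isready` are already on the primary container's PATH.

```yaml
# Hypothetical CircleCI job fragment for comparing pgbench latency.
# All names and values here are illustrative assumptions.
jobs:
  pgbench-compare:                                # hypothetical job name
    docker:
      - image: example/ci-with-pgbench:latest     # hypothetical primary image with pgbench installed
      - image: cimg/postgres:13.5-postgis         # database under test (tag mentioned in this thread)
        environment:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: bench
          POSTGRES_DB: bench
    steps:
      - run:
          name: Wait for PostgreSQL to accept connections
          command: until pg_isready -h localhost; do sleep 1; done
      - run:
          name: Initialize pgbench tables (scale factor is an assumption)
          environment:
            PGPASSWORD: bench
          command: pgbench -i -s 100 -h localhost -U postgres bench
      - run:
          name: Run a 5-minute benchmark
          environment:
            PGPASSWORD: bench
          command: pgbench -T 300 -c 4 -j 2 -h localhost -U postgres bench
```

Running the same job once against the stock image and once against an image with the data directory on a RAM disk, several times each, gives a like-for-like comparison of the latency average that pgbench reports.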
So, we used to run
- image: postgis/postgis:13-3.1-alpine
  environment:
    PGDATA: /dev/shm/pgdata/data
shrug |
@BytesGuy Any thoughts here on using the PGDATA env var? If we think this is useful, perhaps we can close out this issue by having Jeff add it to the readme? |
We've found no noticeable difference between the versions that would justify the extra variant right now. |
We switched to cimg/postgres. We haven't done any performance tests, but no-one's complained about a slowdown. For others wondering about performance, I would recommend upgrading to cimg first; if that's too slow for you, you could then use a custom Docker image. |
@FelicianoTech I've been running builds with and without PGDATA on shm. The builds on shm seem to run in 8-10 minutes, and builds without shm run in 9-12 minutes. 10-20% is not insignificant. Our builds run a Django test suite that uses fixtures pretty extensively, so large datasets get reloaded with each test case, and individual tests are wrapped in transactions and rolled back. Have you done benchmarks with similar load profiles, where you may be pumping 300K-3M of data into the DB between each test, with hundreds of tests? |
@FelicianoTech With our custom PostgreSQL 15.3 + PostGIS image with PGDATA placed in In our case, it's 50% longer, so we keep using our own custom PostgreSQL + PostGIS image instead of We would like to switch to |
I noticed there are no ram tags like in circleci/postgres. Does cimg-postgres use RAM by default? If not, please consider this a feature request for images with ram tags.