Set appropriate cache-related headers for database dumps. #1916

Closed
@smarnach

In #1800 we introduced database dumps that can be downloaded from https://static.crates.io/db-dump.tar.gz. The dumps are updated every 24 hours. However, CloudFront may cache them for up to 24 hours, so in the worst case users will see a new dump only shortly before the next dump is generated.

We can fix this by setting appropriate caching headers for the dump. Here are some ideas:

  • We could set an "expires" header to, say, 24.5 hours after the dump was created. This would leave some wiggle room for varying dump creation times while ensuring that a new dump becomes available roughly half an hour after it was created. However, the dump frequency is configured in the Heroku scheduler, so if we change the frequency, we would need to remember to update the code as well. If we go with this option, we should at least introduce a command line parameter to enqueue-job for it. Another downside is that if a dump job fails, the existing dump will have an expiry in the past, so it won't be cached anymore.

  • It's probably possible to set the "etag" header together with a low TTL in the "cache-control" header. I believe this will result in CloudFront frequently asking S3 whether a version with a different etag is available, while only retransferring the dump if it has actually changed. This option has the advantage of being independent of the dump frequency, but it needs further investigation to confirm that things really work the way I remember.
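
To make the two ideas above a bit more concrete, here is a rough sketch of the header values involved. It is written in Python purely for illustration and is not taken from the crates.io code: the function names, the 30-minute slack, and the 300-second TTL are all assumptions. For the second idea, the sketch relies on S3 generating an ETag for each object on its own, so only the "cache-control" value has to be supplied at upload time. Whether CloudFront then revalidates as hoped still needs to be checked against its TTL settings.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(
    created_at: datetime,
    dump_interval: timedelta,
    slack: timedelta = timedelta(minutes=30),  # assumed wiggle room
) -> str:
    """Idea 1: expire the cached dump shortly after the next dump is due.

    `created_at` must be timezone-aware. `dump_interval` would have to be
    kept in sync with the scheduler, e.g. via a command line parameter.
    """
    expiry = created_at.astimezone(timezone.utc) + dump_interval + slack
    # usegmt=True produces the "GMT" suffix used by HTTP dates.
    return format_datetime(expiry, usegmt=True)


def revalidation_headers(ttl_seconds: int = 300) -> dict:
    """Idea 2: a short TTL so the cache revalidates against the ETag.

    No ETag is set here on the assumption that S3 provides one itself.
    """
    return {"Cache-Control": f"public, max-age={ttl_seconds}"}


if __name__ == "__main__":
    created = datetime(2019, 11, 20, tzinfo=timezone.utc)
    print(expires_header(created, timedelta(hours=24)))
    print(revalidation_headers())
```

The first function shows why the dump frequency leaks into the code (it is a required argument), while the second one does not depend on it at all.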

There may be other options as well – we can discuss this here on the issue.

Related: #1871, #1826, #1915

Labels: A-backend ⚙️, C-enhancement ✨ (Category: Adding new behavior or a change to the way an existing feature works)
