-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Store file with a Time To Live
SeaweedFS is a key~file store, and files can optionally expire with a Time To Live (TTL).
Assume we want to store a file with TTL of 3 minutes.
First, ask the master to assign a file id to a volume with a 3-minute TTL:
curl http://localhost:9333/dir/assign?ttl=3m
{"count":1,"fid":"5,01637037d6","url":"127.0.0.1:8080","publicUrl":"localhost:8080"}
Secondly, use the file id to store on the volume server
curl -F "file=@x.go" http://127.0.0.1:8080/5,01637037d6?ttl=3m
After writing, the file content will be returned as usual if read before the TTL expiry. But if read after the TTL expiry, the file will be reported as missing and return the http response status as not found.
For next writes with ttl=3m, the same set of volumes with ttl=3m will be used until:
- the ttl=3m volumes are full. If so, new volumes will be created.
- there are no write activities for 3 minutes. If so, these volumes will be stopped and deleted.
As you may have noticed, the "ttl=3m" is used twice! One for assigning file id, and one for uploading the actual file. The first one is for master to pick a matching volume, while the second one is written together with the file.
These two TTL values are not required to be the same. As long as the volume TTL is larger than file TTL, it should be OK.
This gives some flexibility to fine-tune the file TTL, while reducing the number of volume TTL variations, which simplifies managing the TTL volumes.
The TTL is in the format of one integer number followed by one unit. The unit can be 'm', 'h', 'd', 'w', 'M', 'y'.
Supported TTL format examples:
- 3m: 3 minutes
- 4h: 4 hours
- 5d: 5 days
- 6w: 6 weeks
- 7M: 7 months
- 8y: 8 years
TTL seems easy to implement since we just need to report the file as missing if the time is over the TTL. However, the real difficulty is to efficiently reclaim disk space from expired files, similar to JVM memory garbage collection, which is a sophisticated piece of work with many man-years of effort.
Memcached also supports TTL. It gets around this problem by putting entries into fix-sized slabs. If one slab is expired, no work is required and the slab can be overwritten right away. However, this fix-sized slab approach is not applicable to files since the file contents rarely fit in slabs exactly.
SeaweedFS efficiently resolves this disk space garbage collection problem with great simplicity. One of key differences from "normal" implementation is that the TTL is associated with the volume, together with each specific file.
During the file id assigning step, the file id will be assigned to a volume with matching TTL. The volumes are checked periodically (every 5~10 seconds by default). If the latest expiration time has been reached, all the files in the whole volume will be all expired, and the volume can be safely deleted.
- When assigning file key, the master would pick one TTL volume with matching TTL. If such volumes do not exist, create a few.
- Volume servers will write the file with expiration time. When serving file, if the file is expired, the file will be reported as not found.
- Volume servers will track each volume's largest expiration time, and stop reporting the expired volumes to the master server.
- Master server will think the previously existed volumes are dead, and stop assigning write requests to them.
- After about 10% of the TTL time, or at most 10 minutes, the volume servers will delete the expired volume.
For deploying to production, the TTL volume maximum size should be taken into consideration. If the writes are frequent, the TTL volume will grow to the max volume size. So when the disk space is not ample enough, it's better to reduce the maximum volume size.
It's recommended not to mix the TTL volumes and non TTL volumes in the same cluster. This is because the volume maximum size, default to 30GB, is configured on the volume master at the cluster level.
We could implement the configuration for max volume size for each TTL. However, it could get fairly verbose. Maybe later if it is strongly desired.
- Replication
- Store file with a Time To Live
- Failover Master Server
- Erasure coding for warm storage
- Server Startup Setup
- Environment Variables
- Filer Setup
- Directories and Files
- Data Structure for Large Files
- Filer Data Encryption
- Filer Commands and Operations
- Filer JWT Use
- Filer Cassandra Setup
- Filer Redis Setup
- Super Large Directories
- Path-Specific Filer Store
- Choosing a Filer Store
- Customize Filer Store
- Migrate to Filer Store
- Add New Filer Store
- Filer Store Replication
- Filer Active Active cross cluster continuous synchronization
- Filer as a Key-Large-Value Store
- Path Specific Configuration
- Filer Change Data Capture
- Cloud Drive Benefits
- Cloud Drive Architecture
- Configure Remote Storage
- Mount Remote Storage
- Cache Remote Storage
- Cloud Drive Quick Setup
- Gateway to Remote Object Storage
- Amazon S3 API
- AWS CLI with SeaweedFS
- s3cmd with SeaweedFS
- rclone with SeaweedFS
- restic with SeaweedFS
- nodejs with Seaweed S3
- S3 API Benchmark
- S3 API FAQ
- S3 Bucket Quota
- S3 API Audit log
- S3 Nginx Proxy
- Docker Compose for S3
- Hadoop Compatible File System
- run Spark on SeaweedFS
- run HBase on SeaweedFS
- run Presto on SeaweedFS
- Hadoop Benchmark
- HDFS via S3 connector
- Async Replication to another Filer [Deprecated]
- Async Backup
- Async Filer Metadata Backup
- Async Replication to Cloud [Deprecated]
- Kubernetes Backups and Recovery with K8up