Skip to content

EasyPost/s3-log-downloader

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This project implements a script to download S3 access logs quickly.

AWS S3 writes access logs into many, many small chunks (several per second), all of which are placed in a single S3 prefix in a different bucket than the bucket being logged. Most S3 utilities have a lot of trouble handling millions of small files.

This utility takes as arguments the bucket in which logs are stored and a prefix, and downloads all files matching the S3 log naming convention under that prefix. It then merges all segments on a per-day basis and writes out files. No intra-day sorting is performed.

For example, if invoked as s3-log-downloader --bucket our-logs --prefix s3-logs/otherbucket/2020-01 --output-dir output/, you should expect to see files named output/2020-01-01, output/2020-01-02, and so on, each of which will contain that days' logs.

At this time, no retries are performed and error handling is... not great.

This project is made available under the terms of the ISC License, a copy of which can be found at LICENSE.txt

About

Rapidly download AWS S3 access logs

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages