If you want to identify the top AWS usage in your AWS accounts the AWS Cost and Usage reports (CUR) have much of the data required. You can query the CUR data through Amazon Athena.
This repository is an AWS Cloud Development Kit (CDK) template that bootstraps:
- AWS Glue catalog metadata in your account to query the CUR
- additional reference data you can use to enrich your queries - such as instance type information
- common queries as AWS Glue views, e.g. EC2 instance vCPU hours by account and month
You can modify the template to add your own data and queries.
This application is available in the AWS Serverless Application Repository. You can deploy it to your account from there:
The CDK template is written in TypeScript. TypeScript sources must be compiled to JavaScript initially and after each modification. Open a new terminal and keep this terminal open in the background if you like to change the source files. Change the directory to the one where cdk.json
is and execute:
npm install
npm run watch
Read the CDK developer guide for more information.
Create or update the application with cdk deploy
. You must provide parameters used when creating cost and usage reports. The CUR data must be stored in Apache Parquet format.
cdk deploy \
--parameters CurBucketName=<bucket name> \
--parameters ReportPathPrefix=<path without leading or trailing slash> \
--parameters ReportName=<report name> \
--databaseName=<optional databasename override, default: aws_usage_queries_database>
You can find the queries deployed via the Amazon Athena console. You can choose Preview next to each view or run a query like:
SELECT * FROM <see databaseName parameter above>."monthly_s3_storage_by_account" limit 10;
To avoid incurring charges, delete the AWS CloudFormation stack when you are finished experimenting via cdk destroy
in the directory where cdk.json
is:
cdk destroy
Or delete the AWS CloudFormation stack manually:
- Sign in to the AWS CloudFormation console and choose your stack.
- Choose Delete to delete all resources, including the bucket that stores the reference data.
Yes. You can use a bucket policy for cross-account access in Athena to Amazon S3 buckets or you setup cross-account replication of the report bucket.
Make sure your Amazon S3 bucket is granting the permissions for Cost and Usage Reports to PUT the CUR data into the bucket.
Standard AWS charges apply to the resources you deploy with this template. For the query cost read the Amazon Athena pricing and AWS Glue pricing for details. For the storage and requests to CUR data in your Amazon S3 bucket read the Amazon S3 Pricing. When creating the CUR report select Enable report data integration for Athena. That way the report data is written with the Apache Parquet columnar format which usually results in better performance and cost optimization for both Amazon S3 and Amazon Athena.
The instance types come from the EC2 API. Query and write it to a file:
echo "instance_family,instance_type,clock_speed_in_ghz,vcpu_count" \
> referenceData/instanceTypes/data.csv
aws ec2 describe-instance-types \
--output text \
--query 'InstanceTypes[*].[InstanceType, ProcessorInfo.SustainedClockSpeedInGhz, VCpuInfo.DefaultVCpus]' \
| sed -E 's/[[:blank:]]+/,/g' \
| sed -E 's/^(([^.]+).*)/\2,\1/' \
| sort >> referenceData/instanceTypes/data.csv
If you found yourself wishing this set of frequently asked questions had an answer for a particular problem, please submit a pull request. The chances are good that others will also benefit from having the answer listed here.
See the Contributing Guidelines for details.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.