A powerful and easy-to-use GitHub Action that executes AWS Athena queries and manages the results seamlessly within your CI/CD workflows. Perfect for data validation, ETL processes, and automated reporting.
- π Execute Athena Queries: Run SQL queries on AWS Athena using query strings or pre-saved queries
- π Result Management: Automatically handle query results with customizable file naming
- π Size Validation: Ensure query results meet minimum size requirements before proceeding
- π₯ Download Results: Optionally download query results to the GitHub runner for further processing
- βοΈ Workgroup Support: Execute queries in specific Athena workgroups to separate applications
- π§Ή Cleanup: Automatic cleanup when result file size is smaller than the specified minimum size
- Automate Data Workflows: Integrate data queries into your CI/CD pipelines
- Validate Data Quality: Run validation queries as part of your deployment process
- Generate Reports: Create automated reports from your data lake
- ETL Process Integration: Seamlessly incorporate Athena queries into your data processing workflows
- Cost Optimization: Leverage Athena's serverless architecture for cost-effective data processing
All dependencies are pre-installed in the ubuntu-* runner images.
We recommend using GitHub's OIDC provider to authenticate with AWS before running this action.
name: Run Athena Query
on:
workflow_dispatch:
schedule:
- cron: '0 6 * * *'
jobs:
athena:
runs-on: ubuntu-latest
permissions:
id-token: write
contents: read
steps:
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v5
with:
role-to-assume: arn:aws:iam::1234567890:role/GitHubActions
aws-region: eu-central-1
- name: Run Athena Query
uses: idealo/aws-athena-query-action@v1
with:
query-context: Catalog=AwsDataCatalog,Database=myDatabase
query-string: |
SELECT * FROM my_table
WHERE created = CURRENT_DATE
output-location: s3://my-bucket/query-results
output-filename: results.csv- name: Run Simple Query
uses: idealo/aws-athena-query-action@v1
with:
query-context: Catalog=AwsDataCatalog,Database=analytics
query-string: SELECT * FROM user_events LIMIT 100
output-location: s3://analytics-results/user-events
output-filename: user_events_sample.csv- name: Execute Saved Query
uses: idealo/aws-athena-query-action@v1
with:
query-id: 1c345f78-1c34-1a34-1234-123ddd89012
query-context: Catalog=AwsDataCatalog,Database=reporting
output-location: s3://reports-bucket/monthly
output-filename: monthly_report.csv
output-min-size: 10M- name: Run Query in Custom Workgroup
uses: idealo/aws-athena-query-action@v1
with:
query-context: Catalog=AwsDataCatalog,Database=analytics
query-workgroup: custom-workgroup
query-string: |
SELECT customer_id, SUM(order_amount) as total_spent
FROM orders
WHERE order_date >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY)
GROUP BY customer_id
ORDER BY total_spent DESC
output-location: s3://analytics-results/customer-analysis
output-filename: top_customers_30d.csv
output-min-size: 5MSee action.yml for complete details.
| Name | Description | Required | Default |
|---|---|---|---|
query-id |
The unique identifier of the saved query. | No | - |
query-string |
The SQL query statements to be executed. | No | - |
query-context |
The context within which the query executes. | Yes | - |
query-workgroup |
The workgroup to use for the query. | No | primary |
output-location |
The location in Amazon S3 where your query results are stored. | Yes | - |
output-filename |
The desired name of the file where the query results are stored. | Yes | - |
output-min-size |
The minimum size of the output file (e.g., 1K, 1M, 1G). |
No | - |
download-location |
The location on the runner machine where the query results are downloaded. | No | - |
download-filename |
The desired name of the file where the query results are downloaded. | No | - |
Note: Either query-id or query-string must be provided. If query-id is provided, the query-string will be overridden with the SQL statements from the saved query.
| Name | Description |
|---|---|
query-id |
The unique identifier of the query execution. |
output-location |
The location in Amazon S3 where your query results are stored. |
The query-context parameter should follow this format:
Catalog=<catalog-name>,Database=<database-name>
Example:
query-context: Catalog=AwsDataCatalog,Database=my_analytics_dbThe action validates that query results meet the minimum size requirement specified in output-min-size.
Supported formats:
1024(bytes)1K(1 kilobyte)1M(1 megabyte)1G(1 gigabyte)
- Results are automatically moved to your specified filename in S3
- Metadata files are cleaned up automatically
- Failed queries with insufficient output size are cleaned up to prevent storage costs
- Use OIDC Authentication: Prefer GitHub's OIDC provider over long-lived access keys
- Least Privilege: Grant only necessary permissions to your IAM role
- Secure S3 Buckets: Ensure your output S3 buckets have appropriate access controls
- Query Validation: Review SQL queries for potential security issues before execution
Your IAM role needs the following permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"athena:StartQueryExecution",
"athena:GetQueryExecution",
"athena:GetNamedQuery",
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:GetBucketLocation"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"glue:GetDatabase",
"glue:GetTable",
"glue:GetPartitions"
],
"Resource": "*"
}
]
}Query Execution Timeout
- Consider optimizing your SQL query for better performance
- Partition your data to improve query performance
Insufficient Output Size
- The action fails if query results are smaller than
output-min-size(human-readable format) - Adjust the minimum size requirement or remove it from your configuration to skip this check
S3 Access Issues
- Verify IAM permissions for S3 bucket access
- Ensure the S3 bucket exists and is in the correct region
Authentication Errors
- Confirm AWS credentials are properly configured
- Check that your IAM role has the required Athena and S3 permissions
We welcome contributions! Please feel free to submit issues and enhancement requests.
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add some amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
- π Documentation
- π Issue Tracker
- π¬ Discussions
This action is developed and maintained by idealo, one of Europe's leading price comparison platforms. We're committed to open source and building tools that help developers work more efficiently with cloud technologies.
This project is licensed under the MIT License, see the LICENSE file for more information.
β Found this action helpful? Give it a star and share it with your team!