Make hfile_s3 refresh AWS credentials on expiry #1462

Merged
merged 1 commit into from
Jul 7, 2022

daviesrob
Member

This is to make HTSlib work better with AWS IAM credentials, which have a limited lifespan and so may need to be refreshed. To allow this, hfile_s3 now looks for an unofficial expiry_time entry in the AWS_SHARED_CREDENTIALS_FILE. If present, the file will be re-read if the current time is within one minute of the given expiry (new credentials are available five minutes before expiry, according to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html, and in practice much earlier).
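
As a rough illustration of the refresh rule described above (not the code in this PR), the check amounts to comparing the current time with the stored expiry less a one-minute margin. The function name, the `margin_secs` parameter and the convention that an expiry of 0 means "no expiry_time entry" are assumptions made for this sketch.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper, not part of HTSlib.
 * Returns 1 if the credentials file should be re-read, 0 otherwise.
 * Assumption: expiry == 0 means no expiry_time entry was found, in which
 * case the credentials are treated as never expiring. */
int credentials_need_refresh(time_t now, time_t expiry, int margin_secs)
{
    assert(margin_secs >= 0);
    if (expiry == 0)
        return 0;
    /* Refresh once "now" is within margin_secs of the expiry, or past it. */
    return difftime(expiry, now) <= (double) margin_secs;
}
```

A caller would pass `time(NULL)` as `now` and 60 as `margin_secs` to get the one-minute window mentioned above.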

Currently no effort is made to understand the JSON format emitted by the AWS security-credentials endpoint - mainly because there are several ways to get credentials, which all have subtle differences. Rather than try to support them all, it's left up to the end user to reformat the credentials into the style of the normal '.aws/credentials' file. An example of how this can be done for one source of credentials on AWS is added to the manual page.
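
For illustration only, this is roughly what "the style of the normal '.aws/credentials' file" looks like once the expiry_time entry is added. The sketch formats one profile section into a buffer; the function name is hypothetical, the `aws_*` key names are the usual ones from the AWS shared credentials file, and the exact format expected for the expiry_time value is an assumption here (the manual page is the authority on that).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of HTSlib.
 * Writes one credentials-file section into out[0..size-1].
 * Returns the number of characters written, or -1 if a value contains a
 * newline (which would corrupt the file) or the buffer is too small. */
int format_credentials(char *out, size_t size, const char *profile,
                       const char *key_id, const char *secret,
                       const char *token, const char *expiry)
{
    const char *values[] = { profile, key_id, secret, token, expiry };
    size_t i;
    int n;

    assert(out != NULL && size > 0);
    for (i = 0; i < sizeof(values) / sizeof(values[0]); i++) {
        if (strchr(values[i], '\n') != NULL)
            return -1;
    }

    n = snprintf(out, size,
                 "[%s]\n"
                 "aws_access_key_id = %s\n"
                 "aws_secret_access_key = %s\n"
                 "aws_session_token = %s\n"
                 "expiry_time = %s\n",   /* unofficial entry read by hfile_s3 */
                 profile, key_id, secret, token, expiry);
    if (n < 0 || (size_t) n >= size)
        return -1;
    return n;
}
```

Whatever produces the file should write it somewhere temporary and then rename it into place, so that hfile_s3 never sees a half-written set of credentials.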

Fixes a bug where parse_ini would append to rather than replace existing values.
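
To show why append versus replace matters once the same file may be read more than once, here is a small sketch. The `str_buf` type and both functions are hypothetical stand-ins for whatever string buffer the real parser uses; this is not the parse_ini code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string used only for this illustration. */
typedef struct {
    char *s;      /* NUL-terminated contents, or NULL when empty */
    size_t len;   /* length of contents, excluding the NUL */
} str_buf;

/* Append text to the buffer. Returns 0 on success, -1 if out of memory. */
int buf_append(str_buf *buf, const char *text)
{
    size_t n;
    char *p;

    assert(buf != NULL && text != NULL);
    n = strlen(text);
    p = realloc(buf->s, buf->len + n + 1);
    if (p == NULL)
        return -1;
    memcpy(p + buf->len, text, n + 1);   /* copies the trailing NUL too */
    buf->s = p;
    buf->len += n;
    return 0;
}

/* Replace the contents: reset the length first, then append.
 * Calling buf_append() alone on a second read would give "OLDNEW". */
int buf_set(str_buf *buf, const char *text)
{
    assert(buf != NULL && text != NULL);
    buf->len = 0;
    if (buf->s != NULL)
        buf->s[0] = '\0';
    return buf_append(buf, text);
}
```

With append-only behaviour, a refreshed secret key would be glued onto the expired one and every later request would fail to authenticate.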

Moves x-amz-security-token to the set of headers updated via callback, as it can now change when the credentials are updated.

Includes an implementation of the timegm() function, which is not portable (e.g. mingw doesn't have it) but needed to convert the expiry time to a time_t. This is put in a separate header so that it can be more easily reused elsewhere if we want. Includes tests to check that details like leap years and normalisation work properly.
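
For readers unfamiliar with what a portable timegm() involves, the following is a minimal sketch using the well-known days-from-civil calculation. It is not the implementation added by this PR; the function names are made up for the example, and unlike the real timegm() it does not write normalised values back into the struct tm.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Sketch of a portable timegm(): converts a broken-down UTC time to
 * seconds since 1970-01-01T00:00:00Z. Out-of-range fields (month 13,
 * day 0, second 60, ...) are normalised arithmetically. */
time_t portable_timegm(const struct tm *tm)
{
    int64_t year, mon, era, yoe, mp, doy, doe, days;

    assert(tm != NULL);
    year = (int64_t) tm->tm_year + 1900;
    mon = tm->tm_mon;                 /* 0..11 once normalised */

    /* Normalise the month into 0..11, carrying into the year. */
    year += mon / 12;
    mon %= 12;
    if (mon < 0) {
        mon += 12;
        year--;
    }

    /* Count years from 1 March so the leap day falls at the year's end. */
    if (mon < 2)
        year--;
    mp = mon >= 2 ? mon - 2 : mon + 10;            /* March = 0 ... Feb = 11 */
    era = (year >= 0 ? year : year - 399) / 400;   /* 400-year cycles */
    yoe = year - era * 400;                        /* year of era, 0..399 */
    doy = (153 * mp + 2) / 5 + tm->tm_mday - 1;    /* day of shifted year */
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;   /* day of era */
    days = era * 146097 + doe - 719468;            /* days since 1970-01-01 */

    return (time_t) (((days * 24 + tm->tm_hour) * 60 + tm->tm_min) * 60
                     + tm->tm_sec);
}

/* Convenience wrapper taking a calendar date (month 1..12). */
time_t utc_seconds(int year, int month, int day, int hour, int min, int sec)
{
    struct tm tm = { 0 };
    tm.tm_year = year - 1900;
    tm.tm_mon = month - 1;
    tm.tm_mday = day;
    tm.tm_hour = hour;
    tm.tm_min = min;
    tm.tm_sec = sec;
    return portable_timegm(&tm);
}
```

Because every field enters the result linearly after the month has been normalised, a date such as 29 February in a non-leap year comes out as 1 March, which is the kind of leap-year and normalisation detail the tests in this PR are meant to check.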

@daviesrob
Member Author

Notes on how to test this.

  1. Make an S3 bucket with a bit of test data in it (e.g. s3://example-bucket/). Make sure it's private.
  2. Head to the AWS Identity and Access Management (IAM) console and create a policy to allow access to the bucket:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:AbortMultipartUpload"
            ],
            "Resource": "arn:aws:s3:::example-bucket/*"
        }
    ]
}
  3. Create a role with AWS EC2 as the trusted entity and attach the policy to it.
  4. Head over to the EC2 console and prepare to launch an instance. In the "Advanced details" section, select the role made in step 3 for the "IAM Instance profile". Select "V2 only" for the "Metadata version".
  5. Launch the instance, log into it and install this branch of HTSlib.
  6. Use the example script in the revised manual page to get and refresh credentials giving access to the bucket.
  7. Point AWS_SHARED_CREDENTIALS_FILE at the credentials file the script makes, and then try using htsfile to copy data out of the bucket, e.g. htsfile -C s3://example-bucket/example_file local_file.

The credentials should refresh after three hours (or up to six if AWS is slow about replacing them).

A realistic test needs to open a file and keep on using it through the refresh. To do this, I wrote a simple program that opens a file and then, in a loop, calls hseek(hf, 0, SEEK_SET), reads the file, and waits a few minutes before reading again. After about six hours(!) the original credentials will expire and the plugin should try to read some new ones (as long as the script replaces the old file in good time).

@whitwham whitwham merged commit b014804 into samtools:develop Jul 7, 2022
@daviesrob daviesrob deleted the aws_iam branch July 12, 2022 08:37
@@ -720,6 +726,7 @@ test/test-parse-reg.o: test/test-parse-reg.c config.h $(htslib_hts_h) $(htslib_s
test/test_realn.o: test/test_realn.c config.h $(htslib_hts_h) $(htslib_sam_h) $(htslib_faidx_h)
test/test-regidx.o: test/test-regidx.c config.h $(htslib_kstring_h) $(htslib_regidx_h) $(htslib_hts_defs_h) $(textutils_internal_h)
test/test_str2int.o: test/test_str2int.c config.h $(textutils_internal_h)
test/test_time_funcs.o: test/test_time_funcs.c $(htslib_time_funcs_h)
Member

$(hts_time_funcs_h).

(See also autoconf on config.h, para 2.)

Member Author

Oops, yes. Will fix.

Member Author

See #1474
