I'm trying to use samtools to view an indexed CRAM file which is stored in our private S3 bucket. I have the following config and credentials files:
$ cat ~/.aws/config
[default]
s3=
    addressing_style=path
output=json
region=us-east-1
and
$ cat ~/.aws/credentials
[default]
aws_access_key_id=*****
aws_secret_access_key=*****
I can generate a presigned URL with the boto3 Python library, and the following works:
$ url='http://s3.[endpoint]/brynjar-test/sample.cram?AWSAccessKeyId=****************&Signature=************&Expires=1561026577'
$ samtools view $url | less -S
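For reference, here is a minimal sketch of how such presigned URLs might be generated with boto3. The helper names are hypothetical, the endpoint is whatever your S3 service uses, and the path addressing style mirrors the config above. It also presigns the index, on the assumption that the .crai is stored under the same key with ".crai" appended. Each presigned URL is signed for one object only, so the index needs a URL of its own.

```python
def make_client(endpoint_url):
    """Create an S3 client for a custom endpoint (requires boto3)."""
    import boto3  # imported lazily so the helper below can be used without it
    from botocore.config import Config

    # Path-style addressing, matching addressing_style=path in ~/.aws/config.
    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        config=Config(s3={"addressing_style": "path"}),
    )


def presign_cram_and_index(client, bucket, key, expires_in=3600):
    """Return presigned GET URLs for a CRAM and its assumed .crai index."""
    urls = {}
    # Assumption: the index lives next to the CRAM as "<key>.crai".
    for name, object_key in (("cram", key), ("crai", key + ".crai")):
        urls[name] = client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": object_key},
            ExpiresIn=expires_in,
        )
    return urls
```

Calling presign_cram_and_index(make_client(endpoint), "brynjar-test", "sample.cram") should then give a dictionary with one URL under "cram" and one under "crai".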
However, I cannot use the presigned URL to view a specific region. (Could this be done if samtools allowed specifying an index file, instead of always appending .crai to the input file name and looking for that file?)
I also tried the following:
$ samtools view http://s3.[endpoint]/brynjar-test/sample.cram
[E::hts_open_format] Failed to open file http://s3.[endpoint]/brynjar-test/sample.cram
samtools view: failed to open "http://s3.[endpoint]/brynjar-test/sample.cram" for reading: Permission denied
If the above worked, I assume I could specify a region, given that the .crai file is stored in the same bucket, although I can't verify this.
I tried setting the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, but that did not help. I have read through "A: Tool for random access to indexed BAM files in S3?", but that post uses a public S3 bucket.
Any idea how I can solve this?
You're getting a permission denied error - have you confirmed that your ID and key have the appropriate privileges for the bucket and/or whether there are any other restrictions placed on the object?
Can you see it if you use the AWS CLI (e.g., aws s3 ls ...)? Do you need to change the region for the call?
Also, make sure you keep the AWSAccessKeyId, Signature, etc. parameters after the '?' in your samtools view command; otherwise the (presigned) URL will not be accessible.
The presigned URL works fine apart from not recognizing any index file.
I just noticed that samtools has added an option for specifying an index file, but it has not been officially released yet. I think I'll try cloning it (Feat/support passing index files).
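A rough sketch of how a region query with a separate index might look once that feature is usable. This assumes a samtools build in which the option is exposed as -X on samtools view (index file given right after the input file), and that the CRAM and the .crai each have their own presigned URL. The function names and the region string are hypothetical.

```python
import subprocess


def samtools_view_command(cram_url, crai_url, region):
    """Build the argument list for samtools view with an explicit index."""
    # Assumed -X layout: input file, then its index file, then the regions.
    return ["samtools", "view", "-X", cram_url, crai_url, region]


def view_region(cram_url, crai_url, region):
    """Run samtools and return its output as text (needs samtools on PATH)."""
    cmd = samtools_view_command(cram_url, crai_url, region)
    # Passing a list avoids shell interpretation of '&' and '?' in the URLs.
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Something like view_region(cram_url, crai_url, "chr1:1-100000") would then return the records for that region as SAM text, if the build supports the option.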
Thanks for the reply.
I will have to check these privileges more carefully. I thought the object had an appropriate ACL for the credentials I am using, but perhaps not.
I can't use the AWS CLI as our research network is offline and there is probably not a mirror for it. I will have to ask our IT department.
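Since boto3 is already available here (it generated the presigned URL), the access check suggested above could probably be done without the AWS CLI. A minimal sketch, assuming an S3 client created with boto3 elsewhere; the function name is hypothetical, and it only tests whether a HEAD request on each object succeeds with the current credentials.

```python
def check_access(client, bucket, keys):
    """Map each key to 'ok' or to the error code returned for a HEAD request."""
    results = {}
    for key in keys:
        try:
            client.head_object(Bucket=bucket, Key=key)
            results[key] = "ok"
        except Exception as exc:
            # botocore's ClientError carries the service reply in .response;
            # anything else (e.g. a connection problem) is reported by class name.
            response = getattr(exc, "response", None) or {}
            error = response.get("Error", {})
            results[key] = error.get("Code", type(exc).__name__)
    return results
```

Running check_access(client, "brynjar-test", ["sample.cram", "sample.cram.crai"]) and seeing a 403 for either key would suggest a permissions problem on that object rather than a samtools issue.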