Please help! I am trying to obtain fastq files from the GDSC, all we have in the lab is CRAM files. Unfortunately, the reference genome seems to not exist when pulled from an online source. I have attempted to use the Samtools code which allows me to put in my own reference but I get an error message
code used:
samtools view -b -T /GRCh37.p13.genome.fa -o /EGAR00001252191_13305_1.bam /EGAR00001252191_13305_1.cram
error message:
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/1b22b98cdeb4a9304cb5d48026a85128": Protocol not supported
[E::cram_get_ref] Failed to populate reference for id 0
[E::cram_decode_slice] Unable to fetch reference #0 10507..19965
[E::cram_next_slice] Failure to decode slice
[main_samview] truncated file.
Samtools v1.9, OS: Ubuntu 18.04.2 LTS
Any ideas? Thank you in advance!
I assume this data is coming from European Genome Phenome Archive? You should email datasharing at sanger.ac.uk to get assistance with reference.
Hello,
how have you installed samtools? Is htslib installed as well?
Related: https://github.com/samtools/samtools/issues/857
fin swimmer