Dear all,
I am trying to use pysam.view in order to get the reads in a certain genomic position.
I use the following command :
pysam.view("-F0X100 -T "+ genome_fa, cram_file, region_to_query)
, where the variable genome_fa holds the path where to find the genome reference fasta along with the .fai index.
However, when I run the command, I have the following message and right after the reads in the region.
[W::find_file_url] Failed to open reference "https://www.ebi.ac.uk/ena/cram/md5/*": Protocol not supported
Why this line is print? Has someone came across this problem? How I avoid having this message?
If I run the same command but in the command line with samtools, I don't get this "error", so I don't understand what is the problem.
Thanks in advance,
That looks like an htslib error message.
At a guess, you have something very weird in your @SQ headers, like
M5:*
. It's meant to be an MD5sum, which is used as a key to download the reference sequence. If that's what is in there, you'll need to fix the header first. (I'd also be curious to know what software is outputting that.)Also the "protocol not supported" bit likely indicates this pysam was built using an old copy of htslib's vanilla Makefile which uses "knet" for http (and no https support), rather than the recommended "./configure; make" approach which uses curl.