Hi,
I did an alignment with STAR and got a SAM file, which I can open easily from the terminal:
samtools view -S STAR2Aligned.out.sam
Now I have to work with it in Python, so I import the pysam module and try to open the file with:
file = pysam.AlignmentFile("/home/lpp/Desktop/Star_Results/STAR2Aligned.out.sam", "r")
It prints the following error:
[W::sam_hdr_parse] duplicated sequence 'NODE_4_length_21_cov_1.000000'
[W::sam_hdr_parse] duplicated sequence 'NODE_18_length_23_cov_1.000000'
I tried to find the error in different forums but the most similar one to my problem has no answer: http://seqanswers.com/forums/showthread.php?t=58219
Does anybody know where this error is coming from?
Thanks in advance!
If you do a
then do you get more than one line?
It says:
Oh, there should be an -S flag given to samtools as well, mea culpa.
Thanks,
Now it says:
I should add that you likely have multiple contigs with the same name in your reference genome. That simply will not work, even though STAR won't complain: its output will be broken unless the duplicately named entries also have identical sequences, and even then the MAPQ values and such will be wrong.
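If it helps, here is a minimal Python sketch for checking this directly, using only the standard library. It collects the SN: names from the @SQ lines of a SAM header, or the record names from a FASTA reference, and reports any name that occurs more than once. The function names are made up for illustration, and taking the FASTA name as the text up to the first whitespace after ">" is an assumption about how the aligner derives contig names.

```python
from collections import Counter


def sam_header_names(lines):
    """Yield the SN: value of every @SQ line in a SAM header."""
    for line in lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    yield field[3:]


def fasta_names(lines):
    """Yield the name of every FASTA record.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                yield parts[0]


def duplicates(names):
    """Return {name: count} for every name seen more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


# Small in-memory demo with invented contig names.
demo_header = [
    "@HD\tVN:1.4\n",
    "@SQ\tSN:contig_A\tLN:100\n",
    "@SQ\tSN:contig_B\tLN:200\n",
    "@SQ\tSN:contig_A\tLN:100\n",
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
demo_fasta = [">contig_A some description\n", "ACGT\n", ">contig_A\n", "ACGT\n"]

header_dups = duplicates(sam_header_names(demo_header))
fasta_dups = duplicates(fasta_names(demo_fasta))
```

On real data you would pass an open file handle instead of the demo lists, for example `duplicates(fasta_names(open("reference.fa")))`, where `reference.fa` stands in for whatever FASTA was given to STAR when the genome index was built. Any name reported there needs to be made unique (and the index rebuilt) before realigning.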