Question About Htseq-Count
1
0
Entering edit mode
11.4 years ago

Hi all- I have been trying to figure this out for hours but not getting anywhere. Any suggestion is greatly appreciated!

I am trying to use htseq-count to get raw count from an alignment file, myAlignedFile.bam, output from gsnap.

Then I name-sorted the bam file, converted to sam, and run htseq-count:

samtools sort -n myAlignedFile.bam myAlignedFile_nsorted
samtools view myAlignedFile_nsorted.bam>myAlignedFile_nsorted.sam
python -m HTSeq.scripts.count -m intersection-strict myAlignedFile_nsorted.sam genes.gtf > htseq_count_result.txt

However, I got thousands of warnings like this:

Warning: Read HWI-ST845:130320:D20JBACXX:8:1101:6938:66256 claims to have an aligned mate which could not be found. (Is the SAM file
properly sorted?)

I was still able to get counts in the output file; however, I'm worried that I did something wrong and the result is not reliable. Any idea what I did wrong during these processes?

Many thanks in advance!

samtools rna-seq • 3.7k views
ADD COMMENT
1
Entering edit mode
11.4 years ago

You probably didn't do anything wrong (at least there's nothing obvious from what you posted). If you run "samtools view myAlignedFile.bam | grep HWI-ST845:130320:D20JBACXX:8:1101:6938:66256" you'll probably get an odd number of reads output, when you should get an even number. For a variety of reasons (usually multi-mapping or having only one reads of a pair map), some aligners don't always output both mates of a pair.

ADD COMMENT

Login before adding your answer.

Traffic: 1737 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6