Can anyone tell if this HTSeq calculation came out correct?
9.6 years ago

I am new to this field. After running Bowtie and HTSeq, here is part of the result from my output.sam file (only some lines are listed), and I would like to ask your advice:

BD94_4129    0
BD94_4130    0
BD94_4131    0
BD94_4132    0
BD94_4133    0
BD94_4134    0
BD94_4135    0
BD94_4136    0
BD94_4137    0
BD94_4138    0
BD94_4139    0
BD94_4140    0
BD94_4141    0
BD94_4142    0
BD94_4143    0
__no_feature    6117090
__ambiguous    0
__too_low_aQual    0
__not_aligned    1813286
__alignment_not_unique    0

There are many 0s above the __no_feature line. Is this normal, or is it not supposed to happen? Thank you again for your advice!

rna-seq • 1.4k views

Could anyone comment on this and help me out? Thank you!

9.6 years ago

Without more information about the experiment, it's a bit difficult to tell. A quick check of whether HTSeq has counted any reads in genes at all is:

awk 'substr($1, 1, 2) != "__" {ngenes += 1; if ($2 == 0) {nzero += 1}; nassigned += $2} END {print ngenes + 0, nzero + 0, nassigned + 0}' data.htseq

Here data.htseq is your output from HTSeq. This command prints three numbers: the number of genes, the number of genes with a count of zero, and the total number of reads assigned to genes. It's probably OK to have a few genes without any reads. Assuming this is some sort of RNA-Seq experiment, the total number of reads assigned should be quite a bit larger than the number of reads in "__no_feature", unless your reference transcriptome is very incomplete.
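If you prefer Python to awk, the same check can be sketched roughly as below. This is only an illustration: the function name summarize_htseq is made up for this post, and it assumes the count file has two whitespace-separated columns (feature name, count) with the special counters prefixed by "__", as in the output you posted. The sample lines are copied from the end of your excerpt.

```python
def summarize_htseq(lines):
    """Summarize htseq-count style output given as 'name<whitespace>count' lines."""
    n_genes = n_zero = n_assigned = 0
    special = {}  # counters such as __no_feature, __not_aligned
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, count = fields[0], int(fields[1])
        if name.startswith("__"):
            special[name] = count
        else:
            n_genes += 1
            if count == 0:
                n_zero += 1
            n_assigned += count
    return {
        "genes": n_genes,
        "zero_genes": n_zero,
        "assigned": n_assigned,
        "special": special,
    }


# Last lines of the excerpt from the question (tab-separated).
sample = """\
BD94_4142\t0
BD94_4143\t0
__no_feature\t6117090
__ambiguous\t0
__too_low_aQual\t0
__not_aligned\t1813286
__alignment_not_unique\t0
"""

summary = summarize_htseq(sample.splitlines())
print(summary["genes"], summary["zero_genes"], summary["assigned"])
print(summary["special"]["__no_feature"])
```

To run it on the full file, pass an open file handle instead of the sample lines, e.g. summarize_htseq(open("data.htseq")). The excerpt alone cannot tell you much, since it only shows a handful of genes; the totals over the whole file are what matter.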
