Question

Issues while running htseq-count

12 months ago

Foad ▴ 10

My data are from Candida glabrata, and when I run htseq-count, no reads are assigned to any gene_id.

Thank you for your time and help.

Foad

htseq-count GSNO_SRR1582646.sam Candida_glabrata_genome.gtf > GSNO_SRR1582646.count

10975 GFF lines processed.
8843 alignment record pairs processed.

head GSNO_SRR1582646.count 

gene-CAGL0A00165g   0
gene-CAGL0A00187g   0
gene-CAGL0A00209g   0
gene-CAGL0A00231g   0
gene-CAGL0A00253g   0
gene-CAGL0A00275g   0
gene-CAGL0A00297g   0
gene-CAGL0A00319g   0
gene-CAGL0A00341g   0
gene-CAGL0A00363g   0

Candida-glabrata RNA-seq htseq-count • 1.6k views

ADD COMMENT • link 10 months ago by Foad ▴ 10

Can you please show us the output to:

grep -m25 -v "^#" Candida_glabrata_genome.gtf

ADD REPLY • link 12 months ago by Ram 44k

Hi,

I want to send it to you, but I get this error:

**content

Language "nl" is not one of the supported languages ['en']!**

ADD REPLY • link 12 months ago by Foad ▴ 10

Please add the following line (as shown) below your post content. This should address the error above.

<a href="" title="Text added because biostars parser needs it"></a>

ADD REPLY • link 12 months ago by GenoMax 147k

Unfortunately, it is not possible.

ADD REPLY • link 12 months ago by Foad ▴ 10

What is not possible? This is a simple line you need to add when you respond. Copy and paste it into the editor window exactly as shown, including the brackets and all.

ADD REPLY • link 12 months ago by GenoMax 147k

What I mean is that it gives the same error as before.

ADD REPLY • link 12 months ago by Foad ▴ 10

That's odd. With 25 lines, there's enough content for the parser to stop complaining about language, unless there's some language-specific content in the 25 lines.

Please paste the 25 lines in a GitHub gist and paste the link to the gist in a comment (instructions here)

ADD REPLY • link 12 months ago by Ram 44k

Hi,

Here are the first 25 lines of the GTF file:

Thanks

ADD REPLY • link 10 months ago by Foad ▴ 10

gene_id is the default ID attribute (-i/--idattr) used by htseq-count, so the attribute name is unlikely to be the problem. That leads to the possibility that the reference names in your SAM file do not match those in the GFF file. The reference should be NC_004691.1 in both your reference sequence and your annotation.

Can you show us the output of grep @SQ GSNO_SRR1582646.sam?
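
If it helps, the name comparison can also be done in one step. This is only a minimal sketch in plain Python, not part of htseq-count: the function names are made up for illustration, and it assumes an uncompressed SAM file with its header and a tab-separated GTF/GFF file.

```python
def sam_reference_names(sam_lines):
    """Collect the SN: values from the @SQ header lines of a SAM file."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def gtf_reference_names(gtf_lines):
    """Collect the sequence names from column 1 of a GTF/GFF file."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def compare_references(sam_lines, gtf_lines):
    """Report which reference names are shared and which appear on one side only."""
    sam = sam_reference_names(sam_lines)
    gtf = gtf_reference_names(gtf_lines)
    return {
        "shared": sam & gtf,
        "sam_only": sam - gtf,
        "gtf_only": gtf - sam,
    }
```

To use it, open both files and pass the file objects, for example compare_references(open("GSNO_SRR1582646.sam"), open("Candida_glabrata_genome.gtf")). If "shared" comes back empty, no read can overlap any annotated feature and every gene will get a count of 0.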

ADD REPLY • link 10 months ago by GenoMax 147k

ADD REPLY • link 10 months ago by Foad ▴ 10

Side note: You don't need to use gists for such small content. Try pasting directly first, then use gist if the content is too large.

ADD REPLY • link 10 months ago by Ram 44k

I pasted it, but I encountered the previous error.

ADD REPLY • link 10 months ago by Foad ▴ 10

You might have encountered the same parser error as before. The line GenoMax gave you works around that error, so please use it.

ADD REPLY • link 10 months ago by Ram 44k

As you can see, NC_004691.1 was not in your reference. So perhaps the genes included there have 0 counts.

It appears that you are using the Nakaseomyces glabratus genome, which is found at: https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000002545.3/ You only seem to have 9 chromosomes in your reference, though the genome has 14. It may be best to download the entire reference/annotation package from the link above and redo the alignments.

At a minimum, you should get the corresponding GTF/GFF file using the Download button that can be found at the link I pasted above. Then everything should match.

ADD REPLY • link 10 months ago by GenoMax 147k

My original GTF file contains 14 references, but I ran grep together with the head command, so it showed only ten lines. In addition, I downloaded the GTF file from the link above, but I ran into the same problem.

ADD REPLY • link 10 months ago by Foad ▴ 10

I see.

VN:1.0 SO:unsorted

Your alignment file is currently unsorted. At this point, the only thing to try is to sort your SAM file by either name or position and indicate the correct sort order using the following option:

-r <order>, --order=<order> For paired-end data, the alignment have to be sorted either by read name or by alignment position. If your data is not sorted, use the samtools sort function of samtools to sort it. Use this option, with name or pos for <order> to indicate how the input data has been sorted. The default is name.
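
To double-check which value to pass to -r after sorting, the SO: tag in the @HD header line can be read back. The snippet below is a rough sketch in plain Python, not an HTSeq function: the helper name is hypothetical, and it assumes the sorting tool has updated the header, which is worth confirming for your own file.

```python
def htseq_order_from_header(sam_lines):
    """Map the SO: tag of the @HD line to an htseq-count -r value.

    Returns "pos" for coordinate-sorted input, "name" for
    queryname-sorted input, and None when the header says the file is
    unsorted, reports an unknown order, or has no @HD line at all.
    """
    mapping = {"coordinate": "pos", "queryname": "name"}
    for line in sam_lines:
        if not line.startswith("@"):
            break  # header lines always come before alignment records
        if line.startswith("@HD"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SO:"):
                    return mapping.get(field[3:])
    return None
```

A return value of None for the sorted file would suggest that the sort did not take effect or that the header was not rewritten, in which case the -r setting may not describe the file correctly.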

ADD REPLY • link 10 months ago by GenoMax 147k

Hi,

First, I converted the SAM file to a BAM file and then sorted the BAM file with samtools. I ran htseq-count on the sorted file and used the -r option to indicate the sort order. Unfortunately, I ran into the same problem: all of the counts are zero.

Thanks for your help and support

ADD REPLY • link 10 months ago by Foad ▴ 10