Hi,
My question may be very basic, but I would like to know how to get read counts per gene ID. I have RNA-seq data; I ran htseq-count and got read counts per Ensembl transcript ID.
Thanks
With its default arguments, htseq-count counts by Ensembl gene ID (it groups reads on the "gene_id" attribute). If you got read counts for Ensembl transcript IDs, I'm guessing your command has "-i transcript_id" in it. Remove that. If your GTF file doesn't use the attribute "gene_id" for gene IDs (GTFs downloaded from Ensembl should), you need to specify the attribute it does use with the -i argument, e.g. "-i geneID".
Have a look at the htseq-count manual and pay special attention to the -t (feature type) and -i (ID attribute) parameters. Then look in your GTF/GFF file, choose the feature type and attribute that correspond to your question, and use those when running htseq-count.
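If it helps, here is a minimal Python sketch for checking which feature types and attribute names an annotation file actually contains before picking values for -t and -i. It assumes a tab-separated GTF with quoted attribute values (unquoted values would be missed by this regex), and the example lines and IDs are made up for illustration.

```python
import re
from collections import Counter


def summarize_gtf(lines):
    """Tally feature types (column 3) and attribute keys (column 9) of GTF lines."""
    feature_types = Counter()
    attribute_keys = Counter()
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        feature_types[fields[2]] += 1
        # GTF attributes look like: gene_id "X"; transcript_id "Y";
        for key in re.findall(r'(\w+) "[^"]*"', fields[8]):
            attribute_keys[key] += 1
    return feature_types, attribute_keys


# Made-up example lines; with a real file you would pass an open file handle,
# e.g. `with open("annotation.gtf") as fh: summarize_gtf(fh)` (hypothetical path).
example = [
    "#!example header\n",
    '1\tsrc\tgene\t100\t900\t.\t+\t.\tgene_id "G1"; gene_name "SYM1";\n',
    '1\tsrc\texon\t100\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    '1\tsrc\texon\t500\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]

feature_types, attribute_keys = summarize_gtf(example)
print(dict(feature_types))
print(dict(attribute_keys))
```

The feature type you pass to -t should be one of the values in the first tally, and the attribute you pass to -i should be one of the keys in the second.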
Thanks. I got the point.
I have one more question. I now have counts per Ensembl gene ID, and I have looked up the gene symbol for each Ensembl gene ID. However, I found that some Ensembl gene IDs share the same gene symbol. If I want read counts per gene symbol, how should I handle these cases?
Any help is much appreciated.
Thanks
This might answer your question: Why am I getting different ensembl gene ids for a given gene symbol?
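If you do decide to report counts per gene symbol, one possible approach (an assumption on my part, not something the linked post prescribes) is to sum the counts of all Ensembl gene IDs that map to the same symbol. Whether summing is appropriate depends on why the IDs share a symbol, so it is worth listing the affected symbols first. A minimal Python sketch; the function name, IDs, symbols, and counts are hypothetical:

```python
from collections import defaultdict


def collapse_by_symbol(counts, id_to_symbol):
    """Sum per-gene-ID counts into per-symbol counts.

    Returns (totals, shared), where `shared` maps each symbol that has
    more than one gene ID to the list of those IDs.
    """
    totals = defaultdict(int)
    ids_per_symbol = defaultdict(list)
    for gene_id, count in counts.items():
        # htseq-count appends summary rows such as __no_feature; skip them.
        if gene_id.startswith("__"):
            continue
        # Fall back to the gene ID itself when no symbol is known.
        symbol = id_to_symbol.get(gene_id, gene_id)
        totals[symbol] += count
        ids_per_symbol[symbol].append(gene_id)
    shared = {s: ids for s, ids in ids_per_symbol.items() if len(ids) > 1}
    return dict(totals), shared


# Made-up data for illustration only.
counts = {"G1": 10, "G2": 5, "G3": 7, "G4": 2, "__no_feature": 100}
id_to_symbol = {"G1": "SYM1", "G2": "SYM1", "G3": "SYM2"}

totals, shared = collapse_by_symbol(counts, id_to_symbol)
print(totals)
print(shared)
```

Keeping the Ensembl gene ID as the counting unit and attaching the symbol only as an annotation column is the alternative that avoids merging anything.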