Hello! One month ago I completed a transcriptome study. For the normalization step I used featureCounts. My featureCounts command was:
featureCounts -a Beta_vulgaris_ncbi.gtf -g transcript_id -o results.txt
The GTF file was downloaded from the NCBI database. I wanted transcript_id, but the first column of my results table says gene_id. I did not realize that until yesterday. Now I am trying to annotate the results with biomaRt and I am stuck. I don't know which filter to use in biomaRt because I don't know whether I have gene IDs or transcript IDs. I tried both filters and neither of them worked. So how can I figure out what kind of IDs they are?
Hi Bane
rna-XM_XXXXXXXX are transcript IDs.
Thanks a lot. Now I regret not thinking to look at the annotation file. Thank you a lot.
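One way to check this yourself is to look the ID up in the annotation file and see which attribute it is stored under. Below is a minimal sketch, assuming a standard nine-column, tab-separated GTF with `key "value";` pairs in the last column. The function name `id_type_in_gtf` is made up for illustration.

```shell
# Report which GTF attribute keys (gene_id, transcript_id, ...) hold a given ID.
# Usage: id_type_in_gtf <ID> <annotation.gtf>
id_type_in_gtf() {
    awk -F'\t' -v id="$1" '
        !/^#/ {
            # Column 9 holds the attributes, separated by semicolons.
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++) {
                a = attrs[i]
                sub(/^ +/, "", a)            # drop leading spaces
                k = index(a, " ")            # key and value are separated by a space
                if (k == 0) continue
                key = substr(a, 1, k - 1)
                val = substr(a, k + 1)
                gsub(/"/, "", val)           # strip the surrounding quotes
                if (val == id) seen[key]++
            }
        }
        # One line per attribute key, followed by the number of matching lines.
        END { for (key in seen) print key, seen[key] }
    ' "$2"
}
```

For example, `id_type_in_gtf 'rna-XM_XXXXXXXX' Beta_vulgaris_ncbi.gtf` (with a real ID from your table in place of the placeholder) prints the attribute name next to the number of annotation lines carrying that ID. If it prints `transcript_id`, the IDs are transcript IDs.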
Because you used -g transcript_id, featureCounts used the transcript IDs from the annotation file in your results table. By default (if you had not provided the -g option) it would have used gene_id. You are also not summarizing the counts at the gene level: exons (-t exon is the default feature type) are grouped by transcript_id, so your counts are per transcript.
The first column of featureCounts output is always named Geneid; it doesn't change to whatever was specified on the command line. However, the first line of the file preserves the command call parameters, so in my experience one can trust what is specified there.
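To see what the results table recorded, you can print its first line together with the first few IDs. This is a small sketch, assuming the usual featureCounts layout: a `#` comment line with the command call, then the column header, then the data rows. The function name `show_fc_call` is hypothetical.

```shell
# Show the command recorded in a featureCounts table and the first five IDs.
# Usage: show_fc_call <results.txt>
show_fc_call() {
    # Line 1: the recorded command call (look for "-g" "transcript_id" here).
    head -n 1 "$1"
    # Lines 3 to 7: first column of the first five data rows, i.e. the counted IDs.
    awk -F'\t' 'NR > 2 && NR <= 7 { print $1 }' "$1"
}
```

Running `show_fc_call results.txt` should show the `-g` value that was actually used, regardless of the Geneid column name.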
Oh, that's a relief, thank you.
I do have another question now. Are the NCBI transcript and gene IDs compatible with Ensembl or Phytozome? When I try to convert my NCBI transcript IDs to Ensembl gene IDs, or to anything else that could give me a clue, biomaRt returns no results. I manually searched for some of the NCBI transcript IDs in the annotation file that I downloaded from Phytozome and found nothing.
How does biomaRt work? I don't think biomaRt simply makes a table of all kinds of IDs and compares the given data with its database.
XM* IDs are predicted transcript IDs. They are not going to be directly translatable to Ensembl or Phytozome IDs. If you are interested in Ensembl IDs, it may be best to use Ensembl's version of the beet genome and the corresponding annotation file. You can find that here.
But I deleted all the earlier data. I only have the featureCounts results. Should I redo all the normalization steps from the beginning?
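Within the NCBI annotation itself, the gene each transcript belongs to is already written in the same GTF, so a transcript-to-gene lookup table can be built without re-counting anything. The sketch below assumes exon lines carry both `transcript_id` and `gene_id` attributes. `tx2gene_from_gtf` is a hypothetical name. It only maps NCBI transcript IDs to NCBI gene IDs; it does not translate anything to Ensembl or Phytozome IDs, for which the counting would still need the Ensembl annotation as suggested above.

```shell
# Print unique "transcript_id<TAB>gene_id" pairs from the exon lines of a GTF.
# Usage: tx2gene_from_gtf <annotation.gtf>
tx2gene_from_gtf() {
    awk -F'\t' '
        !/^#/ && $3 == "exon" {
            tx = ""; gene = ""
            # Pull the quoted value that follows each attribute key.
            if (match($9, /transcript_id "[^"]+"/))
                tx = substr($9, RSTART + 15, RLENGTH - 16)
            if (match($9, /gene_id "[^"]+"/))
                gene = substr($9, RSTART + 9, RLENGTH - 10)
            if (tx != "" && gene != "") print tx "\t" gene
        }
    ' "$1" | sort -u
}
```

For example, `tx2gene_from_gtf Beta_vulgaris_ncbi.gtf > tx2gene.tsv` writes the table to a file (`tx2gene.tsv` is just an example name) that can be joined to the first column of the featureCounts results.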