Hi Guys
I am new to the RNA-seq world and just starting out with Linux. I need to create a custom gene annotation file with a list of genes I am interested in analyzing. How do I do that?
If you are studying a well-annotated species, you can download a GTF or GFF file from Ensembl, NCBI, or UCSC. Then you just filter the GTF/GFF file and keep the lines related to your genes. That's it. You can also check the TopHat website to see whether your species is on their list. If it is, you can choose one of the three sources of annotation. They provide a full set of information.
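To make the filtering step concrete, here is a minimal Python sketch. It assumes an Ensembl-style GTF in which the ninth column carries an attribute like gene_id "..."; and a plain text file with one gene ID per line. The function names and file names are made up for illustration, so adapt them to your own data.

```python
import re

# Matches the gene_id attribute in the 9th GTF column, e.g. gene_id "ENSG00000141510";
# (assumes the Ensembl/GTF2 attribute style)
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def filter_gtf_lines(lines, wanted_gene_ids):
    """Yield comment/header lines plus feature lines whose gene_id is wanted."""
    wanted = set(wanted_gene_ids)
    for line in lines:
        if line.startswith("#"):
            # Keep header lines so the output is still a valid GTF.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            # Skip malformed lines that lack the attribute column.
            continue
        match = GENE_ID_RE.search(fields[8])
        if match and match.group(1) in wanted:
            yield line


def filter_gtf_file(gtf_path, gene_list_path, out_path):
    """Write the subset of gtf_path that matches the IDs listed in gene_list_path."""
    with open(gene_list_path) as handle:
        wanted = {line.strip() for line in handle if line.strip()}
    with open(gtf_path) as gtf, open(out_path, "w") as out:
        out.writelines(filter_gtf_lines(gtf, wanted))
```

You would call it as filter_gtf_file("annotation.gtf", "genes.txt", "subset.gtf"), where all three file names are placeholders. If your list contains gene symbols rather than IDs, the same idea should work with the regular expression changed to look at gene_name instead, provided your GTF has that attribute.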
However, if you are studying a newly sequenced species, you will probably have to generate the annotation for those genes yourself. It must conform to the format specification; GTF2 and GFF3 are currently both popular.
You mentioned "a list of genes" you are interested in analyzing. The filtering step simply singles out those genes. If you are studying a non-model species, things are less straightforward. If the genome has been sequenced, check whether a genome annotation is provided. If the genome has not been sequenced, you first need to assemble transcripts with a tool such as Trinity. Once you have the transcripts, you can annotate them by BLASTing them against protein and RNA sequences from closely related species.
What do you have to start with and how big is your list of genes?
You need to add a little more information. Normally, if you are working with a well-studied species such as human or mouse, you can download its GFF file from Ensembl or UCSC. You can use either the full GFF file or a subset of it. If your organism doesn't have an annotated reference genome, you can use the Tuxedo suite tools for your RNA-seq analysis.
BTW, if you're just starting out and not doing something extremely simple, then you might be best off finding a local collaborator.