7.1 years ago
sicat.paolo20
I am having a problem identifying gene names. I have the GTF, GFF, and SAM files. I can identify each gene because I have its position and could just search for the sequence, but my list has over a thousand entries. Is there any way to do this without going through them one by one manually?
It's not clear: which file do you need to get the gene names from? Why do you mention the BAM? Why do you search for the "sequence"?
At some point I lost my gene names and they got tagged with something else. The sequences still matched, though.
It's kamoulox (i.e., nonsense).
Your GTF/GFF files should have the sequence names (if they are properly formatted). They are text files, and you should be able to view their contents with less filename (or more filename). Post a few lines here by running head -5 gtf/gff_file
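Since a properly formatted GTF carries the names in its attribute column, a lookup by position can be scripted rather than done by hand. The following is only a minimal sketch: the records in GTF_TEXT are made-up examples, the helper names parse_genes and genes_at are hypothetical, and it assumes the names are stored under a gene_name attribute (falling back to gene_id). GTF coordinates are 1-based and inclusive.

```python
import re

# Hypothetical example records in GTF layout (tab-separated, 9 columns).
GTF_TEXT = (
    'chr1\tsrc\tgene\t100\t500\t.\t+\t.\tgene_id "G1"; gene_name "ABC1";\n'
    'chr1\tsrc\tgene\t800\t1200\t.\t-\t.\tgene_id "G2"; gene_name "XYZ2";\n'
    'chr2\tsrc\tgene\t50\t300\t.\t+\t.\tgene_id "G3";\n'
)

def parse_genes(lines):
    """Return (chrom, start, end, name) for every 'gene' feature."""
    genes = []
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        # Column 9 holds key "value"; pairs.
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', cols[8]))
        # Assumption: prefer gene_name, fall back to gene_id.
        name = attrs.get("gene_name") or attrs.get("gene_id")
        genes.append((cols[0], int(cols[3]), int(cols[4]), name))
    return genes

def genes_at(genes, chrom, pos):
    """Names of genes whose 1-based inclusive interval contains pos."""
    return [name for c, s, e, name in genes if c == chrom and s <= pos <= e]

genes = parse_genes(GTF_TEXT.splitlines())
queries = [("chr1", 250), ("chr1", 900), ("chr2", 60), ("chr1", 600)]
results = {q: genes_at(genes, *q) for q in queries}
for (chrom, pos), names in results.items():
    print(chrom, pos, ",".join(names) or "no gene")
```

For a real annotation you would pass an open file handle to parse_genes instead of GTF_TEXT.splitlines(). The linear scan per query is fine for about a thousand positions; a dedicated interval tool would be the better choice for much larger lists.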
"At some point I lost my gene names and they got tagged with something else. The sequences still matched, though."
Not exactly sure what you are referring to. You should never have to modify gff/gtf files when you do any analysis.
Use grep in Linux/OS X if you have the gene symbols in a file.
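For reference, something like grep -w -F -f symbols.txt annotation.gtf would do this in one line (both file names are placeholders). Below is a rough Python sketch of the same idea, a little stricter in that it compares only the gene_name attribute instead of any word on the line. The sample lines, the symbol list, and the helper name filter_by_symbols are all made up for illustration.

```python
import re

# Pattern for the gene_name attribute in GTF column 9 (assumed to be present).
GENE_NAME = re.compile(r'gene_name "([^"]*)"')

def filter_by_symbols(gtf_lines, symbols):
    """Keep only the lines whose gene_name is one of the wanted symbols."""
    wanted = set(symbols)
    hits = []
    for line in gtf_lines:
        match = GENE_NAME.search(line)
        if match and match.group(1) in wanted:
            hits.append(line)
    return hits

# Hypothetical annotation lines and symbol list.
lines = [
    'chr1\tsrc\tgene\t100\t500\t.\t+\t.\tgene_id "G1"; gene_name "ABC1";',
    'chr1\tsrc\tgene\t600\t700\t.\t+\t.\tgene_id "G4"; gene_name "ABC10";',
    'chr1\tsrc\tgene\t800\t1200\t.\t-\t.\tgene_id "G2"; gene_name "XYZ2";',
]
symbols = ["ABC1", "XYZ2"]

hits = filter_by_symbols(lines, symbols)
hit_names = [GENE_NAME.search(line).group(1) for line in hits]
print(hit_names)
```

Comparing the whole attribute value keeps ABC1 from also pulling in ABC10, which is what a plain substring search would do.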