2.4 years ago
ammasakshay
I am having trouble with an alignment for an experiment involving Balb/c mice.
Specifically, I am stuck at the annotation step when running featureCounts.
This is the reference genome I am using for my samples: https://www.ncbi.nlm.nih.gov/assembly/GCA_001632525.1/
I am unable to find the annotation GTF file needed to proceed. How does one obtain it for this reference genome?
Thank you
No idea if this works. Try it out if you want and let us know: https://hackmd.io/@astrobiomike/conv-gb-to-gtf
There is a gbff file for Balb/c here: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/632/525/GCA_001632525.1_BALB_cJ_v1/GCA_001632525.1_BALB_cJ_v1_genomic.gbff.gz
Tried this already, doesn't work.
The most up-to-date source is likely Ensembl BioMart, choosing the "Mouse strains" dataset. You can download all the required attributes as CSV/TSV, but you will need to do a bit of perl/awk reshaping to get a gff file out of it. Alternatively, use the biomaRt/GenomicFeatures packages in R to output a gtf/gff.
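If you would rather do the reshaping in Python than perl/awk, something along these lines might work. It is only a rough sketch: it assumes you exported one row per exon as a tab-separated file with a header row, and the column headers in COLS are my guess at what BioMart calls them, so adjust them to match your actual export. The function name is made up for illustration. Coordinates are assumed to be 1-based inclusive already (as GTF expects) and strand is assumed to be given as 1 / -1.

```python
import csv
import io

# Assumed column headers of the BioMart TSV export (hypothetical; edit to match your file).
COLS = {
    "gene_id": "Gene stable ID",
    "transcript_id": "Transcript stable ID",
    "chrom": "Chromosome/scaffold name",
    "start": "Exon region start (bp)",
    "end": "Exon region end (bp)",
    "strand": "Strand",
}


def biomart_tsv_to_gtf(tsv_handle, gtf_handle, source="biomart"):
    """Write one GTF 'exon' line per input row; return the number of lines written."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    written = 0
    for row in reader:
        # Strand is assumed to be encoded as 1 / -1; anything that is not 1 or + becomes "-".
        strand = "+" if row[COLS["strand"]].strip() in ("1", "+") else "-"
        attributes = 'gene_id "{}"; transcript_id "{}";'.format(
            row[COLS["gene_id"]], row[COLS["transcript_id"]]
        )
        fields = [
            row[COLS["chrom"]],
            source,          # column 2: free-text source label
            "exon",          # column 3: feature type
            row[COLS["start"]],
            row[COLS["end"]],
            ".",             # score: not available
            strand,
            ".",             # frame: not meaningful for exon features
            attributes,
        ]
        gtf_handle.write("\t".join(fields) + "\n")
        written += 1
    return written


if __name__ == "__main__":
    # Tiny made-up example (placeholder IDs, not real annotation).
    header = "\t".join(COLS[k] for k in
                       ("gene_id", "transcript_id", "chrom", "start", "end", "strand"))
    demo = header + "\nGENE1\tTX1\t1\t100\t200\t1\nGENE1\tTX1\t1\t300\t400\t1\n"
    out = io.StringIO()
    biomart_tsv_to_gtf(io.StringIO(demo), out)
    print(out.getvalue(), end="")
```

With real data you would pass open file handles instead of the StringIO objects. As far as I know featureCounts counts over "exon" features grouped by the gene_id attribute by default, which is why the sketch writes only those; the chromosome names also have to match the ones in your BAM files.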
If that sounds like too much hassle, you can export data that is a few years old from the genome browsers: there is a track hub for the UCSC Genome Browser and also an assembly in the Ensembl browser. Both browsers can (usually) export gtf files.