Question

Convert STAR sjdb to gtf

9.8 years ago

lkmklsmn ▴ 980

The RNA-seq aligner STAR can use annotation data when generating a genome index. The annotation can be supplied in GTF, GFF3 or sjdb format. My question is: can I convert the sjdb file to GTF?

I would like to see which 'transcriptome' annotation was used to generate the genome.

Thanks

STAR RNAseq splice junctions gtf RNA-Seq • 3.6k views

updated 9.8 years ago by Devon Ryan 105k • written 9.8 years ago by lkmklsmn ▴ 980

score 1 · Answer 1 · 2015-02-18

A GTF file contains information that the sjdb file lacks, namely the gene_id and transcript_id attributes, so a direct conversion is not possible. The simplest approach might be to take the first 3 columns of the sjdb file (chromosome, intron start, intron end) and compare them to the introns in each candidate annotation. Note that the sjdb coordinates are 1-based, so they have to be shifted before being used as BED intervals, which are 0-based and half-open. I suspect that a simple bedtools intersect -f 0.99 ... | wc -l or something like that would work.
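
If bedtools is not at hand, the same comparison can be sketched in plain Python. This is only a minimal sketch under some assumptions: the sjdb file is whitespace-delimited with chromosome, intron start and intron end in the first three columns (1-based, inclusive), the GTF has exon records carrying a transcript_id attribute, and an exact coordinate match is used in place of the -f 0.99 overlap threshold. The function names (read_sjdb, gtf_introns, fraction_matched) are made up for illustration and are not part of STAR or bedtools.

```python
import re
from collections import defaultdict


def read_sjdb(lines):
    """Collect (chrom, start, end) introns from sjdb lines (1-based, inclusive)."""
    junctions = set()
    for line in lines:
        fields = line.split()
        if line.startswith("#") or len(fields) < 3:
            continue  # skip comments and malformed lines
        junctions.add((fields[0], int(fields[1]), int(fields[2])))
    return junctions


def gtf_introns(lines):
    """Derive introns from GTF exon records, grouped per transcript."""
    exons = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        exons[(fields[0], match.group(1))].append((int(fields[3]), int(fields[4])))

    introns = set()
    for (chrom, _transcript), spans in exons.items():
        spans.sort()
        # An intron spans the gap between two consecutive exons of a transcript.
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            if next_start - prev_end > 1:
                introns.add((chrom, prev_end + 1, next_start - 1))
    return introns


def fraction_matched(junctions, introns):
    """Fraction of sjdb junctions that match an annotated intron exactly."""
    if not junctions:
        return 0.0
    return len(junctions & introns) / len(junctions)
```

Running fraction_matched(read_sjdb(open(sjdb_path)), gtf_introns(open(gtf_path))) once per candidate GTF, where sjdb_path and gtf_path are placeholders for your own files, gives one score per annotation. The annotation that was used to build the genome should score at or near 1.0.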