choosing a suitable max intron size for STAR (plant alignment)
8.3 years ago
Biogeek ▴ 470

Hey guys,

Any tips? I know the STAR aligner is optimised for mammalian genomes. I have a plant reference genome with a GFF3 file that annotates exons, CDS and UTRs, but not introns. Additionally, the genome is available only as scaffolds, not chromosomes. With default settings STAR allows introns of up to around 500,000 nt, which is huge for a plant. Can anyone suggest a suitable maximum intron size, or point me to literature on the matter? It seems this usually goes unreported in alignment methods.

Thanks.

intron size star plants • 5.4k views

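If you map some reads with BBMap, you can produce a histogram of indel lengths with the "indelhist" flag and use that to inform your decision. BBMap treats the gap an intron leaves in a spliced read as a long deletion, so introns show up at the long end of the deletion histogram. The distribution varies by plant species.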

bbmap.sh in=reads.fq ref=genome.fa maxindel=500000 indelhist=ihist.txt reads=1m

That will just map the first million reads and stop.

If you already have a mapped sam/bam file, you can alternatively generate the indel length histogram with Reformat:

reformat.sh in=mapped.sam indelhist=ihist.txt
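
To turn the histogram into a number, one option is to take a high quantile of the long deletions rather than the single largest one, which may be a mis-mapping. The sketch below is only an illustration: it assumes ihist.txt is tab-separated with "#" header lines, the length in column 1 and the deletion count in column 2 (check your own file first), and the function name, the 20 nt cutoff separating true indels from introns, and the 0.999 quantile are arbitrary choices of mine, not BBMap defaults.

```python
def deletion_length_quantile(lines, quantile=0.999, min_length=20):
    """Smallest length L such that at least `quantile` of the deletions
    of length >= min_length are no longer than L.

    Assumed input: tab-separated rows 'length<TAB>deletions[<TAB>...]',
    with '#' marking header/comment lines (an assumption about ihist.txt).
    """
    counts = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        length, deletions = int(fields[0]), int(fields[1])
        # Skip short deletions, which are more likely real indels than introns.
        if length >= min_length and deletions > 0:
            counts.append((length, deletions))

    total = sum(n for _, n in counts)
    if total == 0:
        return None  # nothing that looks like an intron

    counts.sort()
    running = 0
    for length, n in counts:
        running += n
        if running >= quantile * total:
            return length
    return counts[-1][0]


# Made-up histogram rows, for illustration only.
example = [
    "#Length\tDeletions\tInsertions",
    "10\t500\t400",
    "100\t90\t0",
    "1000\t9\t0",
    "50000\t1\t0",
]
print(deletion_length_quantile(example, quantile=0.95))
```

On a real file you would pass the open file handle instead of the example list, e.g. deletion_length_quantile(open("ihist.txt")), and round the result up generously before using it as the maximum intron size.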

Thanks Brian, seems like a handy little tool :-)


I'm assuming around 10,000 would be appropriate?


Hi, in our RNA-seq analysis we optimized the parameters for plant genomes by setting the minimum and maximum intron lengths to 60 and 6000, following what was described for splicing in Arabidopsis (Márquez et al., 2010).
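
Rather than borrowing the Arabidopsis values, the limits can also be estimated from the genome's own annotation. A GFF3 that lists exons but not introns still implies them: an intron is the gap between consecutive exons of the same transcript. Here is a minimal sketch of that idea, assuming standard 9-column GFF3 with features of type "exon" that carry a Parent attribute; the function name is made up for illustration, and the result is only as good as the annotation.

```python
from collections import defaultdict


def intron_lengths_from_gff3(lines):
    """Return the sorted lengths of gaps between consecutive exons that
    share a Parent (assumes 9-column GFF3 and 'exon' feature rows)."""
    exons = defaultdict(list)  # (seqid, parent) -> [(start, end), ...]
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        attributes = dict(
            item.strip().split("=", 1)
            for item in fields[8].split(";")
            if "=" in item
        )
        # An exon may list several parents, separated by commas.
        for parent in attributes.get("Parent", "").split(","):
            if parent:
                exons[(seqid, parent)].append((start, end))

    lengths = []
    for intervals in exons.values():
        intervals.sort()
        for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:]):
            # GFF3 coordinates are 1-based and inclusive.
            gap = next_start - prev_end - 1
            if gap > 0:
                lengths.append(gap)
    return sorted(lengths)


# Tiny made-up annotation, for illustration only.
example = [
    "##gff-version 3",
    "scaf1\tsrc\texon\t1\t100\t.\t+\t.\tID=e1;Parent=t1",
    "scaf1\tsrc\texon\t201\t300\t.\t+\t.\tID=e2;Parent=t1",
    "scaf1\tsrc\texon\t1301\t1400\t.\t+\t.\tID=e3;Parent=t1",
]
lengths = intron_lengths_from_gff3(example)
print(min(lengths), max(lengths))
```

Looking at the upper tail of these lengths (and allowing some headroom for unannotated introns) gives a species-specific starting point for the maximum intron size.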
