I have some questions before running BRAKER in ETP mode, where I will provide both RNA-Seq and protein data. I look forward to your suggestions.
SRR1, SRR2, SRR3, and SRR4 represent old male flower, old female flower, young male flower, and young female flower, respectively. I already have SRA Tools installed and will provide the SRA IDs of these RNA-Seq datasets. I have downloaded "viridiplantae.fa" from OrthoDB and would like to combine it with proteins from selected species closely related to my plant.
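
For the protein-combining step, a minimal sketch of what I have in mind is below. It concatenates several protein FASTA files into one and skips any record whose ID (the first word of the header) has already been written. The function name and the file names in the usage note are hypothetical, and whether BRAKER needs any further header clean-up is something I have not verified.

```python
def merge_protein_fastas(inputs, output):
    """Concatenate protein FASTA files, skipping records with an already-seen ID.

    The ID is taken as the first whitespace-delimited word after '>'.
    Returns the number of records written.
    """
    seen = set()
    written = 0
    with open(output, "w") as out:
        for path in inputs:
            keep = False  # whether the current record is being copied
            with open(path) as handle:
                for line in handle:
                    if not line.endswith("\n"):
                        line += "\n"  # protect against a missing final newline
                    if line.startswith(">"):
                        fields = line[1:].split()
                        seq_id = fields[0] if fields else ""
                        keep = bool(seq_id) and seq_id not in seen
                        if keep:
                            seen.add(seq_id)
                            written += 1
                            out.write(line)
                    elif keep and line.strip():
                        out.write(line)
    return written
```

Usage would look like `merge_protein_fastas(["viridiplantae.fa", "related_species.fa"], "proteins_combined.fa")`, where "related_species.fa" is a placeholder for the extra proteins.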
Query 1: The BRAKER protocol states that coding sequence prediction quality improves if BRAKER trains UTR parameters for AUGUSTUS, which requires stranded RNA-Seq alignments. Is UTR training really important? If so, how can I verify whether the RNA-Seq libraries were prepared with a stranded protocol?
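
One way I am considering for the strandedness check is RSeQC's infer_experiment.py, which, as I understand it, reports the fraction of reads explained by each orientation relative to annotated genes (for paired-end data, "1++,1--,2+-,2-+" versus "1+-,1-+,2++,2--"). The sketch below only interprets those two fractions. The function name and the 0.8 cutoff are my own assumptions, not values taken from any tool.

```python
def classify_strandedness(frac_sense, frac_antisense, threshold=0.8, paired=True):
    """Interpret the two orientation fractions from a strandedness check.

    frac_sense:     fraction where read 1 lies on the gene's strand
                    (paired-end pattern "1++,1--,2+-,2-+").
    frac_antisense: fraction where read 1 lies on the opposite strand
                    (paired-end pattern "1+-,1-+,2++,2--").
    threshold:      assumed cutoff for calling a library stranded.

    Returns (label, value for HISAT2 --rna-strandness or None).
    """
    for value in (frac_sense, frac_antisense):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    if frac_sense >= threshold:
        return "forward-stranded", "FR" if paired else "F"
    if frac_antisense >= threshold:
        return "reverse-stranded", "RF" if paired else "R"
    # Roughly equal fractions suggest an unstranded library.
    return "unstranded or ambiguous", None
```

For example, `classify_strandedness(0.02, 0.95)` would label a dUTP-style paired-end library as reverse-stranded, while two fractions near 0.5 would come back as unstranded or ambiguous.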
Query 2: In GeneMark-ETP mode, BRAKER uses StringTie2 for assembly, which requires aligned reads with XS tags. Since I have HISAT2 installed as an optional BRAKER dependency, I should run it with the --dta option to include XS tags.
What is recommended here?
a. Run HISAT2 with the --dta option, generate BAM files, and provide these to BRAKER?
b. Run BRAKER with unaligned RNA-Seq data (will BRAKER use --dta by default)?
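
To make option (a) concrete, here is a rough sketch of the commands I would run, written as Python helpers that only build the argument lists (nothing is executed). It also includes a small check that counts spliced alignments (CIGAR containing N) lacking an XS:A tag in SAM text, whatever aligner settings produced them. The helper names, index prefix, and file names are placeholders; the flags used are hisat2 --dta, --rna-strandness, -p, -x, -1, -2, -S and samtools sort -@ and -o.

```python
def hisat2_dta_command(index_prefix, reads_1, reads_2, sam_out,
                       threads=4, strandness=None):
    """Build a paired-end HISAT2 command line that uses --dta."""
    cmd = ["hisat2", "--dta", "-p", str(threads)]
    if strandness:  # e.g. "RF" or "FR" for stranded libraries
        cmd += ["--rna-strandness", strandness]
    cmd += ["-x", index_prefix, "-1", reads_1, "-2", reads_2, "-S", sam_out]
    return cmd


def samtools_sort_command(sam_in, bam_out, threads=4):
    """Build a samtools command that coordinate-sorts SAM into BAM."""
    return ["samtools", "sort", "-@", str(threads), "-o", bam_out, sam_in]


def spliced_reads_missing_xs(sam_lines):
    """Return (spliced, missing): spliced alignments and those without XS:A."""
    spliced = missing = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or "N" not in fields[5]:
            continue  # malformed or unspliced record
        spliced += 1
        if not any(tag.startswith("XS:A:") for tag in fields[11:]):
            missing += 1
    return spliced, missing
```

The lists can be passed to `subprocess.run(cmd, check=True)`, and the check can be fed the output of `samtools view` on a sorted BAM; a non-zero second value would indicate spliced reads without strand information.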
========================================================================