aligning scaffolds to the reference genome
7.9 years ago
arta ▴ 670

Hi all,

I assembled chloroplast reads with ABySS and got a chloro-scaffolding.fa file, which consists of many scaffolds. I also have a reference chloroplast genome. I would like to align the scaffolds to the reference genome and annotate the genes. Can you recommend tools or a workflow for this? I am currently working with exonerate, but it seems I cannot annotate genes with it. The exonerate output is also unclear to me, and I do not know how to use it for downstream analysis.

Assembly scaffoldings alignment annotation • 2.5k views
7.9 years ago
apa@stowers ▴ 610

It sounds like you are assembling mRNA-seq reads into transcripts and trying to align these to produce gene models?

If so, I typically would run exonerate like: "exonerate -m est2genome --revcomp --bestn 1 --showcigar --showtargetgff -t chloroplast.fa -q scaffolds.fa > scaffolds.out 2> scaffolds.err", then extract the GFF lines from scaffolds.out. However, this will not generate CDS features in the GFF.
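The "extract the GFF lines" step can be sketched in Python. This is a minimal sketch, not part of exonerate itself: it assumes the --showtargetgff output wraps each GFF block between "# --- START OF GFF DUMP ---" and "# --- END OF GFF DUMP ---" comment lines, so check your own scaffolds.out before relying on it. The function name extract_gff_lines and the SAMPLE lines are made up for illustration and are not real results.

```python
def extract_gff_lines(lines):
    """Return GFF records found between the GFF dump marker lines."""
    gff = []
    in_dump = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("# --- START OF GFF DUMP ---"):
            in_dump = True
        elif line.startswith("# --- END OF GFF DUMP ---"):
            in_dump = False
        elif in_dump and line and not line.startswith("#"):
            # Keep only feature rows; skip comment and header lines.
            gff.append(line)
    return gff


# Made-up lines in the general shape of exonerate output (assumption).
SAMPLE = [
    "Command line: [exonerate ...]",
    "cigar: scaffold1 0 300 + chloroplast 1000 1300 + 1500 M 300",
    "# --- START OF GFF DUMP ---",
    "#",
    "##gff-version 2",
    "chloroplast\texonerate:est2genome\tgene\t1001\t1300\t1500\t+\t.\tgene_id 1",
    "chloroplast\texonerate:est2genome\texon\t1001\t1300\t.\t+\t.\tinsertions 0",
    "# --- END OF GFF DUMP ---",
    "-- completed exonerate analysis",
]

if __name__ == "__main__":
    # For a real run, replace SAMPLE with: open("scaffolds.out")
    for record in extract_gff_lines(SAMPLE):
        print(record)
```

Writing the returned lines to a file such as scaffolds.gff gives you something most genome browsers and downstream tools can read.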

If CDS annotations are important, first run ORF prediction on your sequences and produce a 4-column, space-delimited file ("CDS.txt"), one row per scaffold, containing these 4 values: scaffold ID, strand of ORF (+ or -), scaffold ORF start (1-based), and ORF length. The exonerate man page gives the fields as "<id> <strand> <cds_start> <cds_length>", so the last column is a length, not an end coordinate. With this file, you can run exonerate like this: "exonerate -m cdna2genome --annotation CDS.txt --revcomp --bestn 1 --showcigar --showtargetgff -t chloroplast.fa -q scaffolds.fa > scaffolds.out 2> scaffolds.err". The GFF will now include CDS lines.
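Building the annotation file from ORF calls can be sketched as below. This is only a sketch under stated assumptions: the ORF coordinates in ORF_CALLS are hypothetical (they would come from whatever ORF predictor you use), and the fourth column is written as a length, following the "<id> <strand> <cds_start> <cds_length>" layout in the exonerate man page. How minus-strand coordinates should be expressed is not settled here, so verify that against your exonerate version.

```python
def annotation_line(seq_id, strand, start, end):
    """Format one row: id, strand, 1-based CDS start, CDS length."""
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be + or -, got {strand!r}")
    if start < 1 or end < start:
        raise ValueError(f"bad coordinates for {seq_id}: {start}-{end}")
    length = end - start + 1  # inclusive 1-based start/end to length
    return f"{seq_id} {strand} {start} {length}"


# Hypothetical ORF calls: (scaffold ID, strand, ORF start, ORF end).
ORF_CALLS = [
    ("scaffold1", "+", 101, 1300),
    ("scaffold2", "-", 25, 924),
]

ROWS = [annotation_line(*orf) for orf in ORF_CALLS]

if __name__ == "__main__":
    # Redirect to CDS.txt, e.g. python make_cds.py > CDS.txt
    print("\n".join(ROWS))
```

Keeping the conversion in one small function makes it easy to swap in a different ORF predictor without touching the file format.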

Depending on the results, you may want to change default values for --refine, --minintron, --maxintron. Exonerate is a parameter jungle so you may find other useful ones, but these are what I typically use.
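If you end up trying several settings, it can help to build the command line in one place. A minimal sketch: the base flags are copied from the est2genome command above, the helper name exonerate_cmd is hypothetical, and the values passed at the bottom ("region" for --refine, 5000 for --maxintron) are only placeholders, not recommendations.

```python
def exonerate_cmd(target, query, refine=None, minintron=None, maxintron=None):
    """Build an exonerate argument list, adding only the overrides given."""
    cmd = [
        "exonerate", "-m", "est2genome", "--revcomp", "--bestn", "1",
        "--showcigar", "--showtargetgff", "-t", target, "-q", query,
    ]
    overrides = (
        ("--refine", refine),
        ("--minintron", minintron),
        ("--maxintron", maxintron),
    )
    for flag, value in overrides:
        if value is not None:  # leave exonerate's default in place otherwise
            cmd += [flag, str(value)]
    return cmd


# Placeholder values for illustration only.
CMD = exonerate_cmd("chloroplast.fa", "scaffolds.fa",
                    refine="region", maxintron=5000)

if __name__ == "__main__":
    print(" ".join(CMD))
    # To actually run it (requires exonerate on PATH):
    # import subprocess
    # with open("scaffolds.out", "w") as out, open("scaffolds.err", "w") as err:
    #     subprocess.run(CMD, stdout=out, stderr=err, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with file names, and unset parameters simply fall back to exonerate's defaults.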

Thank you!! That will help a lot.
