Annotating a genome based on sequence
13 months ago
buhbs ▴ 30

Hi all, I was wondering if anyone knows of an R package to annotate genomes based on the sequences of features. For example, I have built a database of features with their corresponding sequences, and I would like to query genome FASTA files for those features. I use SnapGene now, but I was hoping to automate my annotations using R. I appreciate any ideas/input. Cheers

genome annotation • 1.1k views

I guess you meant:

Map X sequences corresponding to features (gene, transcript, repeat, etc.) to the genome, record the positions as e.g. BED/GTF, then query that? That would be run-of-the-mill genome annotation.

But if you already extracted these feature sequences from the same genome (i.e. you already have the positions), then I am not sure what you intend to do.
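
As a rough illustration of the mapping step described above, here is a minimal Python sketch (not R, and not a substitute for a real aligner) that looks for exact matches of each feature sequence on both strands and reports them as BED-style records. The function names and the toy sequences are made up for illustration. Exact matching will miss any feature that carries a SNP or indel in the genome being searched, which is why an aligner is needed in practice.

```python
from typing import Dict, Iterator, List, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

BedRecord = Tuple[str, int, int, str, int, str]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(haystack: str, needle: str) -> Iterator[int]:
    """Yield every 0-based start of needle in haystack, overlaps included."""
    start = haystack.find(needle)
    while start != -1:
        yield start
        start = haystack.find(needle, start + 1)


def annotate_exact(genome: Dict[str, str], features: Dict[str, str]) -> List[BedRecord]:
    """Return sorted BED-like tuples: (chrom, start, end, name, score, strand).

    Coordinates are 0-based and half-open, as in BED. The score is a
    placeholder 0 because only exact matches are reported.
    """
    hits: List[BedRecord] = []
    for chrom, chrom_seq in genome.items():
        chrom_seq = chrom_seq.upper()
        for name, feat_seq in features.items():
            feat_seq = feat_seq.upper()
            if not feat_seq:
                continue  # nothing to search for
            queries = [("+", feat_seq)]
            rc = reverse_complement(feat_seq)
            if rc != feat_seq:  # do not report palindromic features twice
                queries.append(("-", rc))
            for strand, query in queries:
                for start in find_all(chrom_seq, query):
                    hits.append((chrom, start, start + len(query), name, 0, strand))
    return sorted(hits)


if __name__ == "__main__":
    # Toy data, invented for this example.
    toy_genome = {"chr1": "CCGATTACACCTGTAATCGG"}
    toy_features = {"feat1": "GATTACA"}
    for record in annotate_exact(toy_genome, toy_features):
        print("\t".join(str(field) for field in record))
```

In a real workflow the sequences would be read from FASTA files rather than typed in as dictionaries.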


I have extracted the features from a parent genome, but I am looking for insertions/deletions/SNPs in the genomes of daughter strains. So I want to use a feature database to annotate new genomes.
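
To make the goal concrete, here is a small Python sketch of how a parent feature sequence could be compared with the corresponding daughter sequence to list SNPs, insertions, and deletions. It uses a plain Needleman-Wunsch global alignment with illustrative, untuned scores; the function names are hypothetical and the approach is quadratic in sequence length, so it only suits short features. For whole genomes a dedicated aligner and variant caller would be the usual choice.

```python
from typing import List, Tuple

MATCH, MISMATCH, GAP = 1, -1, -2  # illustrative scores, not tuned


def _pair_score(x: str, y: str) -> int:
    return MATCH if x == y else MISMATCH


def global_align(ref: str, alt: str) -> Tuple[str, str]:
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(ref), len(alt)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP
    for j in range(1, m + 1):
        score[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + _pair_score(ref[i - 1], alt[j - 1]),
                score[i - 1][j] + GAP,  # gap in alt (deletion)
                score[i][j - 1] + GAP,  # gap in ref (insertion)
            )
    # Trace back from the bottom-right corner to recover one optimal path.
    out_ref: List[str] = []
    out_alt: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == (
            score[i - 1][j - 1] + _pair_score(ref[i - 1], alt[j - 1])
        ):
            out_ref.append(ref[i - 1])
            out_alt.append(alt[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            out_ref.append(ref[i - 1])
            out_alt.append("-")
            i -= 1
        else:
            out_ref.append("-")
            out_alt.append(alt[j - 1])
            j -= 1
    return "".join(reversed(out_ref)), "".join(reversed(out_alt))


def call_variants(ref: str, alt: str) -> List[Tuple[int, str, str, str]]:
    """List (ref_position, kind, ref_base, alt_base), one entry per base.

    Positions are 0-based in the ungapped reference; an insertion is
    reported at the reference position it precedes.
    """
    aligned_ref, aligned_alt = global_align(ref, alt)
    variants = []
    pos = 0
    for r, a in zip(aligned_ref, aligned_alt):
        if r == "-":
            variants.append((pos, "insertion", "", a))
        elif a == "-":
            variants.append((pos, "deletion", r, ""))
        elif r != a:
            variants.append((pos, "SNP", r, a))
        if r != "-":
            pos += 1
    return variants
```

Multi-base indels come out as one entry per base here; merging adjacent entries is left out to keep the sketch short.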


So you align these feature sequences to your new genome, not the other way around.


I understand that I need to align them, but I was asking whether anyone knows of any R packages to align and annotate features in a genome.


As far as I know, there are no genome aligners written in R, so in any case you will need a standalone program or a Galaxy server to do it.

Aligner installations are probably easiest using conda.


You could always use something like Liftoff to get the corresponding annotations for your daughter genomes, then just map using the genome-specific annotations. See the Liftoff GitHub page.
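
For anyone wanting to script this, a minimal Python wrapper around the Liftoff command line might look like the sketch below. It assumes Liftoff is installed and on the PATH, and it uses only the `-g` (reference annotation) and `-o` (output file) options followed by the target and reference FASTA files, in that order. The function names and file names are placeholders, not anything from this thread.

```python
import subprocess
from typing import List


def liftoff_command(
    reference_fasta: str,
    reference_gff: str,
    target_fasta: str,
    output_gff: str,
) -> List[str]:
    """Build the argument list for a basic Liftoff run.

    Liftoff takes the target genome first and the reference genome second.
    """
    return [
        "liftoff",
        "-g", reference_gff,  # annotation of the parent (reference) genome
        "-o", output_gff,     # where the lifted-over annotation is written
        target_fasta,         # daughter genome to annotate
        reference_fasta,      # parent genome the annotation belongs to
    ]


def run_liftoff(
    reference_fasta: str,
    reference_gff: str,
    target_fasta: str,
    output_gff: str,
) -> None:
    """Run Liftoff and raise CalledProcessError if it exits with an error."""
    cmd = liftoff_command(reference_fasta, reference_gff, target_fasta, output_gff)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder file names; this only prints the command, it does not run it.
    print(" ".join(liftoff_command("parent.fa", "parent.gff3", "daughter.fa", "daughter.gff3")))
```

Looping `run_liftoff` over a list of daughter FASTA files would give one annotation file per strain.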
