delete several ranges from a DNA sequence
3.7 years ago
oinkost • 0

Hi,

I have a probably very simple question, but I have not found a good solution yet.

We are comparing a number of bacterial genomes using several approaches. One of these is calculating the average nucleotide identity (ANI), which is used to determine whether two bacteria belong to the same species.

We noticed that some of the genomes are very similar but encode a large number of strain-specific mobile elements scattered all over the genomes.

We would like to delete these mobile elements from the DNA sequences and then perform the ANI analysis on the genomes stripped of the mobile elements.

We have the sequences of the mobile elements as FASTA files, and we can easily create lists of their locations in the genomes, i.e. start and end base. There are 50-100+ of these in each genome, so finding and deleting them manually would be very time-consuming. My question is: is there an easy way (I am sure there is) to remove a number of DNA segments from a genome?

I first thought about deleting a range from ZZZ to XXX in a loop, but then realised that the start and stop base numbers would no longer match after the first mobile element had been deleted. Maybe this is so simple that software for it already exists?
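One way around the shifting-coordinate problem is to never delete in place, and instead build the new sequence from the stretches between the mobile elements, all measured against the original coordinates. The following is only a minimal sketch: it assumes the ranges are 1-based and inclusive (start and end base), and the function name `delete_ranges` and the toy sequence are made up for illustration.

```python
def delete_ranges(seq, ranges):
    """Return seq with every (start, end) range removed.

    Assumption: coordinates are 1-based and inclusive, and all of them
    refer to positions in the ORIGINAL sequence.
    """
    kept = []
    pos = 0  # 0-based index of the next base not yet processed
    # Sorting by start lets overlapping or adjacent ranges be handled in one pass.
    for start, end in sorted(ranges):
        s = start - 1  # 1-based start -> 0-based index
        if s > pos:
            kept.append(seq[pos:s])  # keep the stretch before this element
        # A 1-based inclusive end equals a 0-based exclusive end.
        pos = max(pos, end)
    kept.append(seq[pos:])  # keep whatever follows the last element
    return "".join(kept)


genome = "AAAACCCCGGGGTTTT"           # toy sequence, not real data
mobile_elements = [(5, 8), (13, 14)]  # hypothetical start/end list
print(delete_ranges(genome, mobile_elements))  # prints AAAAGGGGTT
```

Because every slice is taken from the untouched original string, the order of the list does not matter and no offsets need to be recalculated.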

Any hints would be appreciated, thanks!

genome slicing delete segments • 555 views
3.7 years ago
JC 13k

One trick I have used before is to convert the regions to N's; that way you conserve the same chromosome length.
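A rough sketch of the masking idea, assuming 1-based inclusive start/end coordinates; the function name `mask_ranges` and the toy sequence are made up for illustration. Since the length never changes, the ranges can be applied in any order without recalculating positions. How a given ANI tool treats N's is worth checking before relying on this.

```python
def mask_ranges(seq, ranges):
    """Return seq with every (start, end) range replaced by N's.

    Assumption: coordinates are 1-based and inclusive.
    """
    bases = list(seq)
    for start, end in ranges:
        s = max(start - 1, 0)    # 1-based start -> 0-based index
        e = min(end, len(bases)) # clip ranges that run past the end
        bases[s:e] = "N" * (e - s)
    return "".join(bases)


genome = "AAAACCCCGGGGTTTT"           # toy sequence, not real data
mobile_elements = [(5, 8), (13, 14)]  # hypothetical start/end list
print(mask_ranges(genome, mobile_elements))  # prints AAAANNNNGGGGNNTT
```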
