Move And Renumber The Origin Of Circular Genome
11.0 years ago
ivpz ▴ 20

Hi all,

After doing a de novo assembly of a circular genome and comparing the linear contig to a closely related circular genome, I realised that the origin on the contig is actually located about 30 kbp from bp 1. What would be the best way to move that first 30 kbp to the correct location and renumber the contig?

Thank you.

11.0 years ago

It all depends on what type of data you need to alter.

Usually one just shifts the coordinates and applies the circular boundary condition, so that positions running past the end of the sequence wrap around to the start.
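To make the coordinate shift concrete, here is a minimal Python sketch. It assumes 1-based coordinates and a known genome length; the function name `shift_coordinate` and the example numbers are made up for illustration and do not come from any particular tool.

```python
def shift_coordinate(pos, new_origin, genome_length):
    """Map a 1-based position on the old numbering to the new numbering.

    new_origin is the old 1-based position that should become position 1.
    The modulo applies the circular boundary condition: positions before
    the new origin wrap around to the end of the genome.
    """
    if not 1 <= pos <= genome_length:
        raise ValueError("pos must lie within the genome")
    if not 1 <= new_origin <= genome_length:
        raise ValueError("new_origin must lie within the genome")
    return (pos - new_origin) % genome_length + 1


# Hypothetical example: a 100 bp circle whose true origin sits at old position 31.
print(shift_coordinate(31, 31, 100))  # the new origin itself
print(shift_coordinate(30, 31, 100))  # the base just before it wraps to the end
```

The same function can be applied to the start and end of each annotated feature; a feature that spans the new origin will end up with a start greater than its end and needs to be handled as a wrapped interval.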

To clarify, I need to "cut and paste" the 30 kbp at the 5' end onto the 3' end. I was hoping that there may be a more efficient way to do this.

If all you have is a sequence, then just move the sequence around. You can also do that with various tools if you want to work with numerical coordinates, like so:

# create the index
$ samtools faidx chrI.fa

# select a region 
$ samtools faidx chrI.fa chrI:1-10
>chrI:1-10
CCACACCACA

# select another region
$ samtools faidx chrI.fa chrI:100-110
>chrI:100-110
GGCCAACCTGT

Now you could just place these into files and concatenate them like so:

$ samtools faidx chrI.fa chrI:1-10 | grep -v ">" > a
$ samtools faidx chrI.fa chrI:100-110 > b
$ cat a >> b

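By the end of the operation, file b will contain the header for 100-110 and its sequence, followed by the sequence from 1-10. Note that the header still names only the first region, so you may want to edit it afterwards.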
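For the original question (moving the first ~30 kbp to the end of the contig), the same cut-and-paste can be sketched in plain Python without external tools. This is only an illustration under some assumptions: the FASTA text holds a single record, `new_origin` is the 1-based position that should become base 1, and the names `rotate_sequence` and `rotate_fasta` are hypothetical helpers, not part of samtools or EMBOSS.

```python
def rotate_sequence(seq, new_origin):
    """Rotate a circular sequence so that 1-based new_origin becomes base 1."""
    if not 1 <= new_origin <= len(seq):
        raise ValueError("new_origin must lie within the sequence")
    cut = new_origin - 1
    # Everything from the new origin onward comes first,
    # then the leading piece is appended to the 3' end.
    return seq[cut:] + seq[:cut]


def rotate_fasta(text, new_origin, width=60):
    """Rotate a single-record FASTA string and re-wrap it at `width` columns."""
    lines = text.strip().splitlines()
    header = lines[0]
    seq = "".join(line.strip() for line in lines[1:])
    rotated = rotate_sequence(seq, new_origin)
    wrapped = [rotated[i:i + width] for i in range(0, len(rotated), width)]
    return "\n".join([header] + wrapped) + "\n"


# Toy example: move the first 4 bases of a 12 bp record to the end.
print(rotate_fasta(">contig\nCCACAC\nCACAGG\n", new_origin=5, width=6))
```

For a real contig you would read the file contents into a string, call `rotate_fasta` with the position of the true origin (for instance 30001 if exactly the first 30,000 bases need to move), and write the result back out.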

I've done something similar using extractseq from EMBOSS. I was thinking more of a simple one-liner :)

Anyway, thanks for the suggestion. I guess samtools may work faster.
