Randomising the content of a site in a nucleotide alignment without altering composition

1


6.7 years ago

treesandthings ▴ 20

Hi everyone,

I have a DNA sequence alignment in FASTA format that I'd like to read into R. I then want to pick one of a given set of variant sites (columns, if we think of the alignment as a matrix whose rows are sequences) and randomise that site x times, producing x alignments that are identical except for the 'shuffling' of the site of interest.

What is important here is that I'd like to keep the base composition of the given site. For example, if a given site has a 'C' frequency of 0.7 and a 'T' frequency of 0.3, I would like to retain this. All I want to do is shuffle which sequences have which nucleotide.

Does anyone know of a software package that can do this? Or, alternatively, of a quick way in R to isolate the column of interest and simply rearrange its contents at random?

Thank you

R sequence • 1.4k views

ADD COMMENT • link 6.7 years ago by treesandthings ▴ 20

0


You might want to shuffle the nucleotides within the column using the Fisher-Yates algorithm. It permutes the entries in place, so the counts of each base in the column stay the same.
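A minimal sketch of that idea, written in Python rather than R so it is self-contained (in R, `sample()` applied to the column vector gives the same kind of permutation). The function names `read_fasta`, `fisher_yates`, `shuffle_site` and `randomised_alignments`, the toy alignment, and the 0-based site index are all assumptions made for illustration, not part of any package.

```python
import random
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fisher_yates(items, rng):
    """Return a shuffled copy of items using the Fisher-Yates shuffle."""
    out = list(items)
    for i in range(len(out) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        out[i], out[j] = out[j], out[i]
    return out


def shuffle_site(records, site, rng):
    """Permute column `site` (0-based, hypothetical convention) across sequences."""
    column = [seq[site] for _, seq in records]
    shuffled = fisher_yates(column, rng)
    return [
        (header, seq[:site] + base + seq[site + 1:])
        for (header, seq), base in zip(records, shuffled)
    ]


def randomised_alignments(records, site, x, seed=None):
    """Build x alignments that differ from the input only at `site`."""
    rng = random.Random(seed)
    return [shuffle_site(records, site, rng) for _ in range(x)]


def to_fasta(records):
    """Serialise (header, sequence) pairs back to FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


# Toy alignment (made up): site 1 holds three C's and one T.
example = ">s1\nACGT\n>s2\nACGT\n>s3\nATGT\n>s4\nACGT\n"
alignment = read_fasta(example)
replicates = randomised_alignments(alignment, site=1, x=3, seed=1)

# The base composition of the shuffled site is unchanged in every replicate.
original = Counter(seq[1] for _, seq in alignment)
for rep in replicates:
    assert Counter(seq[1] for _, seq in rep) == original
```

Each replicate can be written out with `to_fasta(rep)`. Because only the order of the column is changed, a site with C at 0.7 and T at 0.3 keeps exactly those frequencies; some replicates may happen to reproduce the original arrangement, which is expected for a random permutation.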

ADD REPLY • link 6.7 years ago by kloetzl ★ 1.1k

0


Did you try fasta-shuffle-letters? (http://meme-suite.org/doc/fasta-shuffle-letters.html)

ADD REPLY • link 6.7 years ago by cpad0112 21k
