I am using the following script to read FASTA files:
>library(Biostrings)
>dna <- readDNAStringSet("<<PATH TO FASTA FILE>>")
I would now like to extract the SNPs from this alignment file, but I don't know how to do it.
Does anyone know?
With the adegenet package in R: fasta2DNAbin("text.fasta", snpOnly = TRUE)
Thanks, I will try it.
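To see which alignment columns are polymorphic, a minimal sketch along the following lines may help. It assumes the ape package (which adegenet builds on) and uses a hypothetical three-sequence toy alignment written to a temporary file; ape's read.dna() is swapped in here as the reader purely to keep the example small, and the resulting object is a DNAbin matrix of the same class that fasta2DNAbin() returns.

```r
library(ape)  # read.dna(), seg.sites()

# Hypothetical toy alignment, for illustration only
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">Seq1", "ACGTACGT",
             ">Seq2", "ACGTACCT",
             ">Seq3", "ATGTACGT"), fasta_file)

# Read the alignment as a DNAbin matrix (all sequences have equal length)
aln <- read.dna(fasta_file, format = "fasta")

# Column indices of the segregating (polymorphic) sites
snp_pos <- seg.sites(aln)

# Keep only those columns: a SNP-only alignment
snps <- aln[, snp_pos, drop = FALSE]

print(snp_pos)
print(dim(snps))
```

Keeping snp_pos around is useful because a SNP-only alignment on its own no longer shows where each column sat in the original sequence.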
Hello Naung.M,
I used the function you suggested from the adegenet package.
I have 109 sequences, each approximately 1,169 nucleotides long, and got the following output:
Converting FASTA alignment into a DNAbin object...
Finding the size of a single genome...
genome size is: 1,169 nucleotides
( 60 lines per genome )
Importing sequences...
Forming final object...
Extracting SNPs...
...done.
109 DNA sequences in binary format stored in a matrix.
All sequences of same length: 1058
Labels: Seq1 Seq2 Seq3 ...
Base composition:
    a     c     g     t
0.284 0.202 0.253 0.261
(Total: 115.32 kb)
It only gives the summary above, but I would like to extract the SNPs themselves.
Then, I used
Showing this error.
Could anyone please suggest how to resolve this and extract the SNPs from a multiple-sequence-aligned FASTA file?
You can write the DNAbin object to a CSV file with the following call: write.csv(DNAbin, "filename.csv"), where DNAbin stands for your own object. I guess the alleles will be coded as numbers at each SNP position.
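If letters rather than numeric codes are wanted in the CSV, one option is to convert the DNAbin matrix to a character matrix first. The sketch below is only an illustration under assumptions: it uses the ape package, a hypothetical toy alignment in a temporary file, and made-up column names of the form pos_<column>; none of these come from the original data.

```r
library(ape)  # read.dna(), seg.sites(), as.character() method for DNAbin

# Hypothetical toy alignment, for illustration only
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">Seq1", "ACGTACGT",
             ">Seq2", "ACGTACCT",
             ">Seq3", "ATGTACGT"), fasta_file)

aln <- read.dna(fasta_file, format = "fasta")

# Positions of polymorphic columns and the SNP-only alignment
snp_pos <- seg.sites(aln)
snps <- aln[, snp_pos, drop = FALSE]

# Convert the raw-coded DNAbin matrix to lowercase letters (a, c, g, t, ...)
snp_table <- as.character(snps)

# Label each column with its position in the original alignment
colnames(snp_table) <- paste0("pos_", snp_pos)

# One row per sequence, one column per SNP
csv_file <- tempfile(fileext = ".csv")
write.csv(snp_table, csv_file)

print(snp_table)
```

The row names of the table are the sequence labels from the FASTA headers, so the CSV can be matched back to the samples.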