12.1 years ago
jackuser1979
I have Illumina paired-end RNA-seq reads mapped to a reference genome using Bowtie, and a BAM file produced with samtools. I am interested in calculating Tajima's D using DnaSP. As DnaSP accepts FASTA files, I converted the BAM file into FASTA format:
samtools view filename.bam | awk '{print ">"$1"\n"$10}' > filename.fas
I imported this converted FASTA file into DnaSP for the neutrality tests. Is this conversion and calculation the correct approach, or should I do a multiple alignment with Clustal to get the FASTA file?
I don't think you can compute Tajima's D (TD) with just the reads. You need to generate a modified sequence containing your variants and then compute the TD by comparing it with your original reference.
Why not? If the reads are aligned to the reference using Bowtie and samtools, we get an aligned file in BAM format, just like a Clustal alignment.
Yes, it seems to me that DnaSP needs aligned, complete sequences to perform the TD test; check the manual. BTW, the problem will be the same when using tools like PAML.
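To make the point about aligned, complete sequences concrete, here is a minimal Python sketch of Tajima's D using the standard formula (Tajima 1989). It is not DnaSP's implementation. The function name `tajimas_d` and the toy sequences are made up for illustration, and it assumes gap-free sequences of identical length with one sequence per sampled individual. Raw reads dumped from a BAM file do not meet that assumption, since each read covers a different short stretch of the reference.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length, gap-free sequences (illustrative)."""
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance constants are zero, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating sites (alignment columns with more than one base).
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        return None  # no variation, D is undefined

    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    pi = total_diffs / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Toy alignment: four complete sequences of the same locus.
aligned = ["ATGCA", "ATGCA", "ATGTA", "ACGTA"]
print(tajimas_d(aligned))
```

Feeding it sequences of different lengths raises an error, which is essentially the situation you are in with the read-level FASTA.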
I'm just wondering how to generate the modified sequence with the variants from NGS data, I mean from a VCF/AM/reference mapping file. Any ideas?
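One way to picture the "modified sequence" idea is the rough Python sketch below. It is only an illustration: the function `apply_snps`, the toy reference and the variant tuples are all hypothetical, and it handles single-base substitutions only (no indels, no heterozygous calls, no VCF parsing). In practice a dedicated tool such as bcftools consensus is probably the safer route. Each sample's variants are substituted into a copy of the reference, so every sample yields a full-length sequence in the same coordinates.

```python
def apply_snps(reference, snps):
    """Return a copy of `reference` with SNPs applied.

    `snps` is a list of (position, ref_base, alt_base) tuples with
    1-based positions, as in a VCF. Hypothetical helper, SNPs only.
    """
    seq = list(reference)
    for pos, ref_base, alt_base in snps:
        idx = pos - 1  # convert 1-based VCF position to 0-based index
        if idx < 0 or idx >= len(seq):
            raise ValueError(f"position {pos} is outside the reference")
        if seq[idx] != ref_base:
            raise ValueError(f"reference base mismatch at position {pos}")
        seq[idx] = alt_base
    return "".join(seq)


# Toy reference and per-sample variant calls (made up for illustration).
reference = "ATGCATGCAT"
sample_variants = {
    "sample_a": [(2, "T", "C")],
    "sample_b": [(2, "T", "C"), (7, "G", "A")],
}

# One full-length sequence per sample, written as FASTA records.
for name, snps in sample_variants.items():
    print(f">{name}")
    print(apply_snps(reference, snps))
```

The resulting sequences all have the length of the reference, so they can be loaded as an alignment for the neutrality tests.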