Hi, is it possible to extract the CDS (coding sequences) from an aligned BAM file or from a VCF file? If not, what is the best way to extract the CDS from a WGS dataset?
I'm interested in scanning for positive selection by comparing different subgroups.
Suggestions please.
If you have the reference FASTA, the corresponding annotation file with CDS coordinates, and a VCF, you can use getfasta from the bedtools suite to get the CDS sequence. You can also use the bcftools consensus command to build a sequence that incorporates the variants in the VCF. samtools or bamutils can help you extract regions of interest from a BAM file.
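To make the getfasta idea concrete, here is a minimal Python sketch of what interval extraction amounts to. It is only an illustration, not how bedtools is implemented; the toy sequence, the `extract_interval` helper and the coordinates are all made up. It assumes BED-style coordinates (0-based start, end exclusive) and reverse-complements features on the minus strand.

```python
# Toy illustration of pulling CDS intervals out of a reference sequence.
# The sequence, names and coordinates are hypothetical.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_interval(sequence, start, end, strand="+"):
    """Return sequence[start:end] (0-based start, end exclusive).

    Minus-strand features are reverse-complemented so the result
    reads 5'->3' along the coding strand.
    """
    seq = sequence[start:end]
    return reverse_complement(seq) if strand == "-" else seq


reference = {"chr1": "GGATGAAACCCTAGGTTTCATTT"}

plus_cds = extract_interval(reference["chr1"], 2, 14, "+")
minus_cds = extract_interval(reference["chr1"], 15, 21, "-")

print(plus_cds)   # ATGAAACCCTAG
print(minus_cds)  # ATGAAA
```

With real data, getfasta reads the reference FASTA and the BED file itself, so none of this needs to be written by hand; the sketch only shows which bases the coordinates select.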
Thanks for the suggestions. I don't have an annotation file with CDS for all genomes (only for the reference genome). This is what I have done so far:
1) I downloaded BAM files for the different subgroups, called variants using GATK4, and generated VCFs.
2) So far I have only one BED file, with the CDS coordinates for my reference genome. Using bedtools, I extracted the CDS for the reference genome.
3) I don't have annotation files for the other genomes. How should I proceed with the analysis?
"I need to extract the CDS from the genomes of 22 subgroups; I have only BAM files for all of these genomes."
My plan:
1) Extract the CDS from the 22 different subgroup populations.
2) Perform an MSA of these CDS.
3) Use the MSA as input to PAML to estimate the dN/dS ratio and build a positive selection scan model.
Suggestions please.
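One way to picture step 1 without per-genome annotation: if every BAM was aligned to the same reference, each sample's variants can be applied to the reference and the reference CDS coordinates reused. Below is a minimal Python sketch of that idea. It assumes SNPs only, treats every call as a homozygous alternate allele, and uses made-up sample names, positions and helper functions (`apply_snps`, `extract_cds`).

```python
# Sketch: build a per-sample CDS by applying SNPs to the reference and
# cutting out the reference CDS coordinates. Hypothetical toy data.


def apply_snps(reference, snps):
    """Apply SNPs given as (1-based position, REF base, ALT base)."""
    seq = list(reference)
    for pos, ref, alt in snps:
        if seq[pos - 1] != ref:
            raise ValueError(f"REF allele mismatch at position {pos}")
        seq[pos - 1] = alt
    return "".join(seq)


def extract_cds(sequence, exons):
    """Concatenate exon intervals (0-based start, end exclusive)."""
    return "".join(sequence[start:end] for start, end in exons)


reference = "GGATGAAACCCTAGGT"
cds_exons = [(2, 14)]  # CDS coordinates taken from the reference annotation

# Per-sample SNP calls, as they might be read from each VCF.
samples = {
    "subgroup_A": [(7, "A", "G")],
    "subgroup_B": [],
}

cds_by_sample = {
    name: extract_cds(apply_snps(reference, snps), cds_exons)
    for name, snps in samples.items()
}

# Print the result as FASTA records.
for name, cds in cds_by_sample.items():
    print(f">{name}\n{cds}")
```

Because only substitutions are applied, every sample's CDS has the same length and shares the reference coordinate system. Indels would shift the coordinates, which this sketch does not handle.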
Does the BAM file have to be converted into a FASTA file first in order to use getfasta? Or will it create FASTA files directly from the BAM files?
Hello sunnykevin97,
are you interested in:
In any case you need the coordinates of your CDS before you can start.
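As a rough illustration of getting those coordinates into a usable form, the Python sketch below reads CDS intervals from BED-formatted text and groups them per gene. It assumes a six-column BED (chrom, start, end, name, score, strand) in which the name column holds a gene identifier; the gene names and coordinates are invented.

```python
# Sketch: group CDS intervals from a BED6 file by gene name.
# The BED content below is hypothetical example data.
import io
from collections import defaultdict

bed_text = """\
chr1\t100\t160\tgeneA\t0\t+
chr1\t200\t230\tgeneA\t0\t+
chr2\t50\t110\tgeneB\t0\t-
"""


def read_cds_bed(handle):
    """Return {gene: [(chrom, start, end, strand), ...]} sorted by start."""
    genes = defaultdict(list)
    for line in handle:
        # Skip blank lines and BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end, name, _score, strand = fields[:6]
        genes[name].append((chrom, int(start), int(end), strand))
    for exons in genes.values():
        exons.sort(key=lambda exon: exon[1])
    return dict(genes)


genes = read_cds_bed(io.StringIO(bed_text))

# Total CDS length per gene; a complete CDS should be a multiple of 3.
lengths = {
    name: sum(end - start for _, start, end, _ in exons)
    for name, exons in genes.items()
}

for name, length in lengths.items():
    print(name, length, "ok" if length % 3 == 0 else "not a multiple of 3")
```

Checking that each gene's total CDS length is divisible by three is a cheap sanity check before any codon-based analysis.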
fin swimmer
Thanks for the suggestions.
I'm looking for variants in the CDS among ~22 different subgroups.
1) Extract the CDS from the different subgroup populations.
2) Perform an MSA of these CDS.
3) Use the MSA as input to PAML to estimate the dN/dS ratio and build a positive selection scan model.
Is this approach correct, or is there a simpler way to do it?
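For intuition about what a dN/dS analysis looks at, here is a small Python sketch that classifies codon differences between two aligned, in-frame CDS as synonymous or nonsynonymous using the standard genetic code. This is plain counting, not the maximum-likelihood codon models that PAML's codeml fits, and the sequences and the `count_codon_differences` helper are made up for the example.

```python
# Sketch: count synonymous vs nonsynonymous codon differences between
# two aligned CDS of equal length. Not a dN/dS estimate.
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, equal length")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        # Skip codons with gaps or ambiguous bases.
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


ref_cds = "ATGAAACCCTAG"
sample_cds = "ATGAGACCATAG"  # AAA->AGA (Lys->Arg), CCC->CCA (Pro->Pro)

syn, nonsyn = count_codon_differences(ref_cds, sample_cds)
print(syn, nonsyn)  # 1 1
```

A proper estimate also has to normalise by the number of synonymous and nonsynonymous sites and correct for multiple substitutions, which is what the codon models in PAML are for.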