Hi,
I work on plant species. I would like to get flanking sequences of SNPs from the reference genome and the different bam files.
Any help would be appreciated.
Thanks
How can I get the SNP flanking sequences from the genome file?
Thanks
Do you have a VCF file listing all the SNPs in the reference? If so, you can use the SNiPlay application: http://sniplay.southgreen.fr/cgi-bin/analysis_v3.cgi Load your VCF file and select the plant genome used as reference. And then export flanking sequences of SNPs...
You could use the FastaVariant class in pyfaidx, which builds a consensus sequence from a FASTA file and a VCF file.
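Something along these lines might work as a starting point. It is only a sketch: the function names `flanking_sequence` and `open_consensus` are made up for illustration, and the file paths and sample name are whatever you pass in. As far as I know, FastaVariant expects a bgzip-compressed, tabix-indexed VCF. The helper only relies on indexing by chromosome name and slicing, so it should work the same way on a plain `pyfaidx.Fasta` (reference flanks) and on a `FastaVariant` (per-sample consensus flanks).

```python
def flanking_sequence(genome, chrom, pos, flank=50):
    """Return (left_flank, base, right_flank) around a SNP.

    genome : mapping from chromosome name to a sliceable sequence,
             e.g. a pyfaidx.Fasta, a pyfaidx.FastaVariant, or a dict of str.
    pos    : 1-based SNP position, as written in a VCF file.
    flank  : number of bases to take on each side (clipped at contig ends).
    """
    record = genome[chrom]
    chrom_len = len(record)
    if not 1 <= pos <= chrom_len:
        raise ValueError(f"position {pos} is outside {chrom} (length {chrom_len})")

    i = pos - 1                                # 0-based index of the SNP
    start = max(0, i - flank)                  # clip at the contig start
    end = min(chrom_len, i + 1 + flank)        # clip at the contig end

    # Fetch the whole window once, then split it in plain Python.
    window = str(record[start:end])
    offset = i - start
    return window[:offset], window[offset], window[offset + 1:]


def open_consensus(fasta_path, vcf_path, sample):
    """Open a per-sample consensus view (hypothetical wrapper; needs pyfaidx)."""
    from pyfaidx import FastaVariant           # third-party, imported lazily
    return FastaVariant(fasta_path, vcf_path, sample=sample, het=True, hom=True)
```

For the reference flanks you would pass `pyfaidx.Fasta("genome.fa")` as `genome`; for one sample's consensus you would pass the object returned by `open_consensus(...)`, looping over the SNP positions from your list in both cases.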
To get read pairs surrounding a SNP site from a BAM, you might try VariantBam. You can input any number of sites, and add a padding to the SNP if you like: https://github.com/jwalabroad/VariantBam
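If you would rather stay in Python, a similar thing can be sketched with pysam instead of VariantBam (a different tool, so this is not how VariantBam itself works). The helper names `padded_interval` and `reads_near_snp` are hypothetical, and the BAM file is assumed to be coordinate-sorted and indexed.

```python
def padded_interval(pos, pad, chrom_len=None):
    """Turn a 1-based SNP position into a 0-based, half-open (start, end)
    interval that extends `pad` bases to each side of the SNP."""
    start = max(0, pos - 1 - pad)
    end = pos + pad
    if chrom_len is not None:
        end = min(end, chrom_len)              # clip at the contig end
    return start, end


def reads_near_snp(bam_path, chrom, pos, pad=50):
    """Yield (read name, 0-based alignment start, read sequence) for reads
    overlapping the padded window around a SNP. Needs pysam and a BAM index."""
    import pysam                               # third-party, imported lazily
    start, end = padded_interval(pos, pad)
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        for read in bam.fetch(chrom, start, end):
            if read.is_unmapped or read.query_sequence is None:
                continue
            yield read.query_name, read.reference_start, read.query_sequence
```

Calling `reads_near_snp("sample.bam", "chr1", 12345, pad=100)` for each SNP and each BAM file would give you the raw reads around the site; turning those into a single flanking sequence per sample is a separate consensus step.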
So you have BAM files, a reference genome, and a list of SNPs (their positions), right? Or do you still need to generate the SNP calls from the BAM files? Do you want the flanking sequences as they appear in the reference genome, or as sequenced in your BAM files?
Thanks for your reply. I do have the SNPs and their positions. I would like to get the flanking sequences in the reference as well as in the BAM files.
Thanks
I'm not sure it makes sense to get the flanking sequence from the BAM files. This would be either the flanking sequence from all the reads at all the variant positions, or a consensus variant call which you presumably have in your VCF/FASTA combination.