I found this post but I'm still not sure how to do this: vcf from sorted bam
Here's my genome file and index:
../../db/genomes/human/GRCh38.p13/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
../../db/genomes/human/GRCh38.p13/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.fai
Here's my bam file:
201858321_S1.trimmomatic.GCA_000001405.15_GRCh38_no_alt_analysis_set.sorted.bam
Here's my SNPs file:
hg38.snp147.bed
How can I generate a VCF file that uses annotations from the hg38.snp147.bed and my bam file?
Are you familiar with the process of variant calling?
A bit. I've only really done it for prokaryotic organisms with
snippy. Right now I'm trying the following:source activate vcftools_env && pv 201858321_S1.trimmomatic.GCA_000001405.15_GRCh38_no_alt_analysis_set.sorted.bam | samtools view -h | bcftools mpileup -f ../../db/genomes/human/GRCh38.p13/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --threads ${N_JOBS} - | gzip -c > 201858321_S1.trimmomatic.GCA_000001405.15_GRCh38_no_alt_analysis_set.sorted.vcf.gzI thought that
pvwould give me a progress bar (it's not) or else I wouldn't have usedstdinforbcftools.I understand the
mpileupwill generate avcffile but I'm wondering if there is a way to include annotations forSNPsso I can getrsidentifier flags? If not my plan was to convert to bed file and then compare bed files.I don't really understand the difference between the different VCF files on slide 2 here: https://www.ebi.ac.uk/sites/ebi.ac.uk/files/content.ebi.ac.uk/materials/2014/140217_AgriOmics/dan_bolser_snp_calling.pdf