Hello,
I have reads from a genome that I modified by adding insertions and deletions (≃100 bp each). I'm using VG to map those reads to the reference genome (the same genome without my modifications). At the calling step, deletions are detected easily, but most insertions are missed because the reads covering them get soft-clipped.
For example, here is an insertion I added to the reference genome:
You can see in the image below that all reads have massively soft-clipped ends at the position where my insertion is supposed to be:
Is this normal? What can I do to make VG detect these SVs? Thanks in advance.
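To put a number on the clipping described above, a small helper along these lines could be used. It is only a sketch: the function name `count_softclipped` and the 20-base threshold are hypothetical, it expects SAM text on stdin, and it assumes the CIGAR strings contain no hard clips outside the soft clips.

```shell
# Count reads whose CIGAR starts or ends with a soft clip (S) of at least
# MIN_CLIP bases. Reads SAM text on stdin and skips header lines (@...).
# Prints "<clipped reads><TAB><total reads>".
# Example (assumes samtools is installed):
#   samtools view mapped_reads_sorted.bam | count_softclipped 20
count_softclipped() {
    min_clip="${1:-20}"   # hypothetical threshold, adjust as needed
    awk -v min="$min_clip" '
        /^@/ { next }
        {
            total++
            cigar = $6            # column 6 of a SAM record is the CIGAR
            lead = 0; trail = 0
            # Length of a leading soft clip, e.g. "30S120M" -> 30
            if (match(cigar, /^[0-9]+S/))
                lead = substr(cigar, RSTART, RLENGTH - 1) + 0
            # Length of a trailing soft clip, e.g. "50M100S" -> 100
            if (match(cigar, /[0-9]+S$/))
                trail = substr(cigar, RSTART, RLENGTH - 1) + 0
            if (lead >= min || trail >= min) clipped++
        }
        END { printf "%d\t%d\n", clipped + 0, total + 0 }
    '
}
```

Restricting the input to the region around one of the inserted sequences should show whether nearly every read there is clipped, as the image suggests.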
NB: I'm using VG version v1.31.0 "Caffaraccia", and here are the commands I use:
Graph construction
vg construct -r reference.fa -v samtools_call.vcf.gz -t 8 > graph.vg
Indexing
vg index -x index.xg graph.vg -t 8 ; vg index -g index.gcsa graph.vg -t 8
Mapping
vg map -x index.xg -g index.gcsa -f reads_1.fastq -f reads_2.fastq -t 8 > mapped.gam
Augment
vg augment graph.vg mapped.gam -A aug_mapped.gam -t 8 > aug_graph.vg
Reindexing
vg index aug_graph.vg -x aug_index.xg -t 8
Packing
vg pack -x aug_index.xg -g aug_mapped.gam -Q 5 -o aln_aug.pack -t 8
Calling
vg call aug_index.xg -k aln_aug.pack -t 8 > calls.vcf
And here are the commands I used to create the variant file used to construct the graph:
bwa mem -t8 reference.fa reads_1.fastq reads_2.fastq | samtools sort -o mapped_reads_sorted.bam
bcftools mpileup mapped_reads_sorted.bam -f reference.fa --output samtools_mpileup.vcf
cat samtools_mpileup.vcf | bcftools call -mv -Ov -o samtools_call.vcf
Hi,
I tried that already, but only a very small number of insertions are then detected (2 out of 24), and they have a low depth (≃3), while the SNPs have an average depth of 50.
I feel it would be possible to detect more of them if I doubled the number of reads, but that would be unrealistic.
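For reference, the insertion count and depths quoted above can be checked with something like the sketch below. The function name `list_insertions` and the 50-base cutoff are hypothetical, and the depth is assumed to be stored as a `DP=` entry in the INFO column of `calls.vcf`.

```shell
# Print chromosome, position, inserted length and depth for each insertion
# in a VCF read from stdin. An insertion is taken to be a record whose ALT
# allele is at least MIN_LEN bases longer than REF. Depth is read from a
# DP= entry in the INFO column (assumption); "NA" is printed if it is absent.
# Example: list_insertions 50 < calls.vcf
list_insertions() {
    min_len="${1:-50}"   # hypothetical size cutoff
    awk -F '\t' -v min="$min_len" '
        /^#/ { next }
        {
            n = split($5, alts, ",")          # ALT may list several alleles
            for (i = 1; i <= n; i++) {
                gain = length(alts[i]) - length($4)
                if (gain >= min) {
                    dp = "NA"
                    m = split($8, info, ";")
                    for (j = 1; j <= m; j++)
                        if (info[j] ~ /^DP=/) dp = substr(info[j], 4)
                    printf "%s\t%s\t%d\t%s\n", $1, $2, gain, dp
                }
            }
        }
    '
}
```

Piping the result through `wc -l` gives the number of called insertions to compare against the 24 that were introduced.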