vg call not detecting SVs due to soft-clipping
3.6 years ago
jcmouren • 0

Hello,

I have reads from a genome that I modified by adding insertions and deletions (≈100 bp). I'm using VG to map those reads to the reference genome (the same genome without my modifications). When calling variants, deletions are detected easily, but most insertions are missed because the reads are soft-clipped.

For example, here is an insertion I added to the reference genome: [image]

In the image below, all reads have their ends heavily soft-clipped at the position where my insertion should be: [image]

Is this normal? How can I make VG detect those SVs? Thanks in advance.

NB: I'm using VG version v1.31.0 "Caffaraccia", and here are the commands I use:

Graph construction

vg construct -r reference.fa -v samtools_call.vcf.gz -t 8 > graph.vg

Indexing

vg index -x index.xg graph.vg -t 8 ; vg index -g index.gcsa graph.vg -t 8

Mapping

vg map -x index.xg -g index.gcsa -f reads_1.fastq -f reads_2.fastq -t 8 > mapped.gam

Augment

vg augment graph.vg mapped.gam -A aug_mapped.gam -t 8 > aug_graph.vg

Reindexing

vg index aug_graph.vg -x aug_index.xg -t 8

Packing

vg pack -x aug_index.xg -g aug_mapped.gam -Q 5 -o aln_aug.pack -t 8

Calling

vg call aug_index.xg -k aln_aug.pack -t 8 > calls.vcf

And here are the commands I used to make the variant file used to construct the graph:

bwa mem -t8 reference.fa reads_1.fastq reads_2.fastq | samtools sort -o mapped_reads_sorted.bam

bcftools mpileup mapped_reads_sorted.bam -f reference.fa --output samtools_mpileup.vcf

cat samtools_mpileup.vcf | bcftools call -mv -Ov -o samtools_call.vcf
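
To put a number on the soft-clipping described above, one option is to count how many alignments carry a long soft clip in their CIGAR string. The following is only a minimal sketch: the function name `count_softclipped` and the 20-base default threshold are hypothetical choices for illustration, and it assumes SAM-format text on standard input.

```shell
# Count SAM records with a soft clip of at least MIN_CLIP bases.
# Reads SAM text on stdin; prints "<clipped reads>\t<total reads>".
# count_softclipped is a hypothetical helper name, not part of vg or samtools.
count_softclipped() {
  min_clip="${1:-20}"   # illustrative default threshold (an assumption)
  awk -F'\t' -v min="$min_clip" '
    !/^@/ {                      # skip SAM header lines
      total++
      cigar = $6                 # column 6 of a SAM record is the CIGAR string
      hit = 0
      # walk over every soft-clip operation (e.g. 100S) in the CIGAR
      while (match(cigar, /[0-9]+S/)) {
        if (substr(cigar, RSTART, RLENGTH - 1) + 0 >= min) hit = 1
        cigar = substr(cigar, RSTART + RLENGTH)
      }
      clipped += hit
    }
    END { printf "%d\t%d\n", clipped, total }'
}
```

It could be fed from the BAM produced by the bwa step, for instance with `samtools view mapped_reads_sorted.bam | count_softclipped 20`, optionally restricted to a region around one of the inserted sequences.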

3.6 years ago
glenn.hickey ▴ 520

You might try the following vg augment option:

-S, --keep-softclips include softclips from input alignments (they are cut by default)


Hi,

I tried that already, but only a small number of insertions are then detected (2 out of 24), and they have a low depth (≈3), while the SNPs have an average depth of 50.

I feel it might be possible to detect more of them if I doubled the number of reads, but that would be unrealistic.
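
For tracking how many of the inserted sequences are recovered between runs, a small filter over the call set can help. This is a rough sketch only: `list_insertions` is a hypothetical helper name, the 50-base default is an arbitrary assumption, and it simply compares REF and ALT lengths in a plain-text VCF on standard input.

```shell
# List insertion alleles from a VCF on stdin whose ALT is at least MIN_LEN
# bases longer than REF. Prints "<chrom>\t<pos>\t<inserted length>".
# list_insertions is a hypothetical helper name used for illustration.
list_insertions() {
  min_len="${1:-50}"    # illustrative default threshold (an assumption)
  awk -F'\t' -v min="$min_len" '
    !/^#/ {                                  # skip VCF header lines
      n = split($5, alts, ",")               # ALT may hold several alleles
      for (i = 1; i <= n; i++) {
        if (alts[i] ~ /^</) continue         # skip symbolic alleles such as <INS>
        gain = length(alts[i]) - length($4)  # bases added relative to REF
        if (gain >= min) printf "%s\t%s\t%d\n", $1, $2, gain
      }
    }'
}
```

Running `list_insertions 50 < calls.vcf | wc -l` with and without the `-S` option on vg augment would then give a count to compare against the 24 expected insertions.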
