What is the best way to detect large indels? I mapped reads from whole-genome shotgun sequencing against the reference sequences.
The best way to detect large deletions is to map with BBMap and call variants with the BBMap package's CallVariants tool. BBMap can also detect moderate-sized insertions (up to 60% of insert size, or so). For example:
bbmap.sh in=reads.fq out=mapped.sam ref=ref.fa maxindel=400k
callvariants.sh in=mapped.sam out=vars.vcf ref=ref.fa ploidy=2
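If you only care about the large events, you can filter the resulting VCF by the length difference between REF and ALT. This is just a rough sketch, not part of BBTools: the function name `filter_large_indels` and the 50 bp cutoff are my own placeholders, and it assumes the VCF writes REF and ALT as literal sequences (symbolic alleles such as `<DEL>` would be measured incorrectly).

```shell
#!/bin/sh
# Sketch: keep VCF header lines plus records where any ALT allele differs
# from REF by at least a given number of bases.
# Assumption: REF and ALT are literal sequences, not symbolic alleles.
set -eu

# Usage: filter_large_indels MIN_LEN < in.vcf > out.vcf
filter_large_indels() {
  awk -F '\t' -v min="$1" '
    /^#/ { print; next }                      # pass header lines through
    {
      n = split($5, alts, ",")                # ALT may list several alleles
      for (i = 1; i <= n; i++) {
        d = length(alts[i]) - length($4)      # >0 insertion, <0 deletion
        if (d < 0) d = -d
        if (d >= min) { print; next }         # one large allele is enough
      }
    }'
}

# Example call (hypothetical file names and cutoff):
#   filter_large_indels 50 < vars.vcf > large_indels.vcf
```

The cutoff is arbitrary; pick whatever counts as "large" for your data.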
For longer insertions that cannot be captured within a read pair, it's probably better to use an indel-specific caller based on coverage and read pairing. But for insertions shorter than ~60% of the insert size, it's better to merge the reads, map them, and call variants with a normal variant caller. I've used BBMerge -> BBMap -> CallVariants successfully with >200bp insertions from 2x100bp reads.
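The BBMerge -> BBMap -> CallVariants chain could look roughly like the sketch below. All file names are placeholders, the `run` and `indel_pipeline` helpers are made up for illustration, and the `maxindel` and `ploidy` values are simply carried over from the commands above, so adjust them to your data. By default the script only prints the commands; set `DRY_RUN=0` to actually run them (BBTools must be on your PATH).

```shell
#!/bin/sh
# Sketch: merge overlapping read pairs, map the merged reads, call variants.
# File names are placeholders; commands are only printed unless DRY_RUN=0.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Usage: indel_pipeline READ1 READ2 REFERENCE
indel_pipeline() {
  r1="$1"; r2="$2"; ref="$3"
  # 1. Merge overlapping pairs into single longer reads; keep the rest aside.
  run bbmerge.sh in1="$r1" in2="$r2" out=merged.fq \
      outu1=unmerged_1.fq outu2=unmerged_2.fq
  # 2. Map the merged reads, allowing long gaps in the alignment.
  run bbmap.sh in=merged.fq out=merged_mapped.sam ref="$ref" maxindel=400k
  # 3. Call variants from the mapped merged reads.
  run callvariants.sh in=merged_mapped.sam out=merged_vars.vcf ref="$ref" ploidy=2
}

indel_pipeline reads_1.fq reads_2.fq ref.fa
```

Only the merged reads go through mapping here, since those are the ones that can span an insertion end to end; the unmerged pairs are written out separately in case you want to map them as well.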
Note that I wrote BBMap and I am biased.
Some good tools are: